Package DB Schema v3

Toshio Kuratomi a.badger at gmail.com
Tue Dec 5 23:04:13 UTC 2006


On Tue, 2006-12-05 at 23:10 +0100, Karel Zak wrote:
> On Tue, Dec 05, 2006 at 11:32:29AM -0800, Toshio Kuratomi wrote:
> > >  status     text not null default 'D'
> > >             check ( status IN ('D','A','M','E','R') )
> > > 
> > What's the justification?  Here's my reasoning:
> > [Single letter code]
> > + Harder to misspell
> > + Takes less storage space
> 
>   + Takes less space for data transfer between client and DB engine
>  
> > [Full words]
> > + Less cryptic
> 
>  cryptic? Who will be typical client for your DB?  Human or script? ;-)
> 
Humans :-)

If we only keep codes in the database, then application level code has
to translate that into something human readable.  Then you end up with
duplicate code when more than one application is utilizing the database.
It's much better to keep the information somewhere in the database.

> > We could solve both concerns by having a foreign key constraint into a
> > table with the valid status phrases for each table that needs statuses
> 
>  Yes. It's a good way *in case* you need modify/add status types,
>  otherwise it's over engineering. (FK is nothing cheap for DB engine.)
> 
Right.  I'm not projecting needing to modify status types too
frequently, so I figured that FK would be overkill.

Jeffrey Ollie brought up some l10n and i18n ideas wrt this, though.
I'll pull that discussion back into this thread.

> > but that makes things more complex.  This would allow us to select the
> > list of valid statuses from a database table which is a plus.  But it
> > would also require a join for the common case of giving a human readable
> > name for the status.  So I'd like to hear your justification.
> > 
> > > > 
> > > > create table Branch (
> > > >   collectionId integer not null primary key,
> > > >   branchName varchar(32) not null,
> > >     ^^^^^^^^^
> > > >   distTag varchar(32) not null,
> > >     ^^^^^^
> > >   is it right define duplicate tags and branch names?
> > > 
> > distTag and branchName are different pieces of data.
> > From the comments:
> > -- :branchName: Name of the branch in the VCS ("FC-3", "devel")
> > -- :distTag: DistTag used in the buildsystem (".fc3", ".fc6")
> > 
> > It may be unfortunate that we don't use the same names for both, but
> > it's the way things are.
> 
>  Well, the question is: is it expected that there will be same DistTag
>  for different branches? If not.. define distTag as UNIQUE.
> 
Ah.  I understand you now.  Yes, both distTag and branchName can be
unique.  I'll change that.

> > > >   parentId integer null,
> > > >   foreign key (parentId) references Collection(id),
> > > >   foreign key (collectionId) references Collection(id)
> > > > );
> > > 
> > >  Hmm.. here I see 1:1 model (PK=FK). Strange. (It usually means that
> > >  you should merge the tables to one table only.) Maybe:
> > > 
> > >  create table Branch (
> > >     id serial primary key,
> > >     branchName varchar(32) not null unique,
> > >     distTag varchar(32) not null unique,
> > >     parentId integer references Collection(id),
> > >     collectionId integer not null references Collection(id)
> > >  );
> > > 
> > We're actually modeling a 1:[0 1] relationship.  For programmers, this
> > would be inheritance.
> 
>  Hint: never think about programmers and DB usage when you work on DB
>  design. Think about data and relatioships between data only :-)
> 
Generally good.  But in this case we're working with an Object
Relational Mapper so we have to pay some attention to the programmer as
well.  Additionally, we *are* dealing with relationships between data
here, it's just that it's 1:[0 1] which is covered much less frequently
than 1:1 (mostly avoid), 1:many, and many:many (transform into 1:many)

>  I don't have time to study data and real relatioships for this DB, but
>  1:1 is usually very strange. (and 1:0 = PK:NULL in normal table).
> 
Right, 1:1 is usually not necessary.  1:0 makes no sense :-)  This is
neither of those.  1:[0 1] does not make PK = NULL.  It means that there
will be some records in Collection for which there is also a record in
Branch and some for which no branch record exists at all.

PostgreSQL and SQL-1999 both have ways of modeling this.  But
PostgreSQL's has some caveats that we might run into and neither
PostgreSQL nor MySQL support the SQL-1999 construct.

Here's some documentation on PostgreSQL's inheritance feature:
http://www.postgresql.org/docs/8.1/interactive/ddl-inherit.html

and some documentation on the interface with the ORM:
http://www.sqlalchemy.org/docs/adv_datamapping.myt#advdatamapping_inheritance

-Toshio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : http://lists.fedoraproject.org/pipermail/infrastructure/attachments/20061205/a47c657b/attachment.bin 


More information about the infrastructure mailing list