Fixing A Stupid Naming Convention

I’m going to talk about one of the most prevalent and stupid naming conventions in data modeling. This is the convention of incorporating the table name into the column name. Here’s an example:

create
table customer
( customerid datatype
, customername datatype
, customerbirthdate datatype
);

Why is this convention used? Because of a stupid SQL coding convention called table name aliasing. Here’s an example of its use:

select a.customerid
, a.customername
, a.customerbirthdate
from customer a;

The genius who is incorporating the table names into the column name is doing so because he is using table name aliasing. It would be just as effective and more intelligent to do the following:

create
table customer
( id datatype
, name datatype
, birthdate datatype
);

select customer.id
, customer.name
, customer.birthdate
from customer;

No table name aliasing is used and the same information is conveyed. Not only that, but your data models are less verbose. In otherwords, don’t use table name aliasing at all. It is a useless feature that conceals the uniqueness of your columns and leads otherwise intelligent data modelers to introduce redundant elements into column names. The only time you should incorporate a table name into a column is when it represents a foreign key.

Here’s a data model using the convention I recommend:

plandatamodel2.jpg

Notice the naming of foreign keys. And here’s a SQL query using the model and the a more acceptable table aliasing convention:


select plantype.code
, plansubtype.code
, frequency.code
, coverage.code
, premiumtype.code
, coveragepremium.amount
from plantype pt
, plansubtype pst
, plan p
, frequency f
, planfrequency pf
, coverage c
, frequencycoverage fc
, premiumtype pt
, coveragepremium cp
where plantype.id = p.plantype_id
and plansubtype.id = p.plansubtype_id
and plan.id = pf.plan_id
and frequency.id = pf.frequency_id
and coverage.id = fc.coverage_id
and planfrequency.id = fc.planfreqency_id
and premiumtype.id = cp.premiumtype_id
and frequencycoverage.id = cp.frequencycoverage_id;

As you can see, not using table aliases does not lead to any loss of information and additionally it is plainly apparent when you are using primary keys and foreign keys. It is time to abandon the stupid naming convention of incorporating table names into non foreign key columns.

Advertisements
Posted in Uncategorized. Tags: , , . 5 Comments »

5 Responses to “Fixing A Stupid Naming Convention”

  1. Arash Mirdamadi Says:

    I couldn’t agree with your convention more. I am not a database expert but from what I have seen the table+entity name convention is redundant in many ways:

    1) If someone is looking at the relation from the table point of view, then obviously the name of the table is known.
    2) As you mentioned, in my most queries, such extra information appended to the entity name could cause convolution and makes it very difficult for the reader to follow the statement especially if it consists of multiple-nested complex queries.

  2. Jason Says:

    I agree that it is silly to prefix all columns with the table name; however I must disagree on two points:

    1. Foreign key columns

    Columns that contain data specific to a table (i.e. customer_first_name) do not need a table prefix. Local attributes do not need globally unique names.

    Columns that contain shared data – like customer_id – should have names that are globally unique.

    The customer_id’s scope is not local to the customer table. Having the “customer_” prefix is not some method of making identification easier when writing queries – it just makes sense.

    Additionally, relying on the fk prefix to identify the dimension table can lead to some nasty confusion if you change table names.

    If you decide to rename the CUSTOMER table to GUEST, and the Id column has no prefix, you can no longer tell what table “customer_id” refers to unless you already know.

    Prefixing table names to key columns is not redundant – it greatly improves the maintainability of the data structure.

    #2. Not using table aliases (or referring to tables by their full name)

    Aliases should always be used in multi-table joins, but they should be meaningful, and they should NOT be the same as the table name.

    Many database systems will include tables referenced in the select list that are not in the from clause, but with a product join. If a programmer makes a spelling error, the query may still return but provide incorrect results.

    Additionally, many database systems require tables to be aliased in order to use ANSI SQL join syntax. Since all ANSI-syntax supporting databases permit the use of aliases, your code is much more portable if you use the aliases.

    Performing joins using a table list and where clause is much like using tables for web page layout — it is deprecated. ANSI joins are the CSS of query writing.

    Self-joins and derived tables require aliases.

    Finally, if you change the name of a table, you should not have to change every line of code that references any piece of information from that table.

    Suggesting that people should not use table aliases in SQL is analogous to saying that people should not use functions in C, or classes in C++.

  3. grant czerepak Says:

    Thank you for your well thought out reply, Jason.

    Your argument for global uniqueness in non-local attributes has merit.

    I also especially liked your description of ANSI joins as the CSS of query writing.

    I took an extreme stance on table aliasing because I wanted to emphasize the worth of not using table names in local attributes.

    I also see how portability would put table aliasing on the same level as C functions or C++ classes.

    Again, well thought out and thank you.

  4. Tim Says:

    Your select statement:

    select plantype.code
    , plansubtype.code
    , frequency.code
    , coverage.code
    , premiumtype.code
    , coveragepremium.amount


    is not very useable outside of the database. You get 5 columns named ‘code’ and one named ‘amount’. For implementation into any data aware system, you would need to specifiy ‘AS plantypecode’, ‘AS plansubtypecode’, etc. And you might as well have column names that are unique (not to mention descriptive) instead of typing an AS for every non-unique name. This is the main reason I use longDescriptiveColumnNames: to aid in data binding. And what good is data if you don’t know the schema?

    ..just a thought.

  5. grant czerepak Says:

    Thanks for your thought, Tim.

    Actually, this is part of the reason I am for an alternate architecture to SQL databases. I suggest taking a look at the associative model of data which I discuss here.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: