The basic approach to naming is to choose meaningful, memorable, and descriptive readable names, avoiding abbreviations and codes, and using underscores either consistently or not at all. In a large application, table, column, and data names will often be multi-word, as in the case of REVERSED_EXPENSE_ACCOUNT or LAST_GL_CLOSE_DATE. The goal of thoughtful naming methods is ease of use: The names must be easy to remember and must follow rules that are easily explained and applied. In this article, a somewhat more rigorous approach to naming is presented, with the ultimate goal of developing a formal process of object name normalization.
In a relational database system, the hierarchy of objects ranges from the database, to the table owners, to the tables, to the columns, to the data values. In very large systems, there may even be multiple databases, and these may be distributed within locations. For the sake of brevity, the higher levels will be ignored for now, but what is said will apply to them as well.
Each level in this hierarchy is defined within the level above it, and each level should be given names appropriate to its own level and should not incorporate names from outside its own level. For example, a table cannot have two columns called NAME, and the account named George cannot own two tables named AUTHOR.
There is no requirement that each of George’s tables have a name that is unique throughout the entire database. Other owners may have AUTHOR tables as well. Even if George is granted access to those tables, there is no confusion because he can identify each table uniquely by prefixing its owner’s name to the table name, as in DIETRICH.AUTHOR. It would not be logically consistent to incorporate George’s owner name into the name of each of his tables, as in GEOAUTHOR, GEOBOOKSHELF, and so on. This confuses and complicates the table name by placing part of its parent’s name in its own, in effect a violation of level-name integrity.
Brevity should never be favored over clarity. Including pieces of table names in column names is a bad technique, because it violates the logical idea of levels and the level-name integrity that this requires. It is also confusing, requiring users to look up column names virtually every time they want to write a query. Object names must be unique within their parent, but no incorporation of names from outside an object’s own level should be permitted.
The support for abstract datatypes in Oracle strengthens your ability to create consistent names for attributes. If you create a datatype called ADDRESS_TY, it will have the same attributes each time it is used. Each of the attributes will have a consistent name, datatype, and length, making their implementation more consistent across the enterprise. However, using abstract datatypes in this manner requires that you do both of the following:
- Properly define the datatypes at the start so you can avoid having to modify the datatype later.
- Support the syntax requirements of abstract datatypes.
The one area of difficulty with using brief column names is the occasional appearance of a foreign key in a table in which another column has the same name that the foreign key column has in its home table. One possible long-term solution is to allow the use of the full foreign key name, including the table name of its home table, as a column name in the local table (such as BOOKSHELF.TITLE as a column name).
The practical need to solve the same-name column problem requires one of the following actions:
- Invent a name that incorporates the source table of the foreign key in its name without using the dot (using an underscore, for instance).
- Invent a name that incorporates an abbreviation of the source table of the foreign key in its name.
- Invent a name different from its name in its source table.
- Change the name of the conflicting column.
None of these is particularly attractive, but if you come across the same-name dilemma, you’ll need to take one of these actions.
One area of great inconsistency and confusion is the question of whether objects should have singular or plural names. Should it be the AUTHOR table or the AUTHORS table? Should it be the NAME column or the NAMES column?
There are two helpful ways to think about this issue. First, consider some columns common to nearly every database: NAME, ADDRESS, CITY, STATE, and ZIP. Other than the first column, does it ever occur to anyone to make these names plural? It is nearly self-evident when considering these names that they each describe the contents of a single row—a record. Even though relational databases are “set oriented,” clearly the fundamental unit of a set is a row, and it is the content of that row that is well-described by singular column names. It is simply more intuitive and straightforward to restrict column names to singular.
If all objects are named consistently, neither you nor a user has to try to remember the rules for what is plural and what isn’t. The benefit of this should be obvious. Suppose we decide that all objects will henceforth be plural. We now have “s” or “es” on the end of virtually every object, perhaps even on the end of each word in a long multi-word object name. Of what possible benefit is it to key all these extra letters all the time? Is it easier to use? Is it easier to understand? Is it easier to remember? Obviously, it is none of these.
Therefore, the best solution is this: All object names are always singular. The sole exception to this rule is any widely accepted term already commonly used in the business, such as “sales.”
As mentioned earlier, clarity should never be sacrificed for brevity, but given two equally meaningful, memorable, and descriptive names, always choose the shorter. During application development, propose alternative column and table names such as these to a group of users and developers and get their input on choosing the clearest name. How do you build lists of alternatives? Use a thesaurus and a dictionary. On a project team dedicated to developing superior, productive applications, every team member should be given a thesaurus and a dictionary as basic equipment and then should be reminded over and over again of the importance of careful object naming.
Object Name Thesaurus
Ultimately, relational databases should include an object name thesaurus, just as they include a data dictionary. This thesaurus should enforce the company’s naming standards and ensure consistency of name choice and abbreviation (where used).
Such standards may require the use of underscores in object naming to make the parsing of the name into component parts a straightforward task. This also helps enforce the consistent use of underscores, rather than the scattered, inconsistent usage within an application that underscores frequently receive now.
If you work directly with a government agency or large firm, that organization may already have object-naming standards. The object-naming standards of large organizations have over the years radiated into the rest of the commercial marketplace, and they may form the basis for the naming standards used at your company. For example, those standards may provide the direction to choose between “corporation” and “firm.” If they do not, you should develop your naming standards to be consistent, both with those base standards and with the guidelines put forth here.