Database Table and Field Naming Suggestions
December 7, 2009
I chose this topic because it sounded fun but I know it's the kind of thing that could start a holy war among developers and DBAs. After all, who am I to tell anyone how they should name tables and fields? Actually, I'm not the worst choice for advice on naming schemas. It's something I've thought a lot about over the last 14 years.
Some of my naming standards were handed to me on a silver platter by kindly developers with more experience than I had and some were earned the hard way. I can say from personal experience that it's better to receive these lessons free than earn them. Consider some examples. We'll start with field naming.
Field Naming: The Dos and Don'ts
Microsoft Access is a great product that allows you to do practically anything when it comes to naming. It's my opinion that Access was originally written with Excel power users in mind. I think that somewhere in the mid 90's it was hijacked by programmers and taken to levels that Microsoft never intended. It's precisely this feature, field naming, that makes me feel this way. Hard-core developers would never dream of naming things the way Access allows for.
Below is a screen shot of a table I created in Access, which I appropriately named BAD_table. It encapsulates a number of bad practices that I see all the time when I assist clients with Access database applications that were built in-house by people who could probably be described as Excel Power Users.
They know enough to be dangerous, but they don't know why. If you're an Excel Power User, you might be feeling the hairs on the back of your neck stand on end as you read this, but stick with it for another couple paragraphs and you'll understand what I mean. The problems that come from bad naming conventions don't manifest themselves until one begins writing code but by then it's too late to change them.
So, what's wrong with these field names?
There's technically nothing wrong with naming a field [ID] but once you start creating queries, you'll find that this is too ambiguous and you'll wish you'd named them differently. Consider this SQL statement.
SELECT * FROM tblContact c INNER JOIN tblEmployee e ON e.ID = c.EmployeeID INNER JOIN tblStatus s ON s.ID = c.StatusID
All three of these tables have a primary key field named [ID] but when these ID fields show up as foreign keys in other tables, their names must be changed. Thus you get links that read ID=EmployeeID and ID=StatusID instead of EmployeeID=EmployeeID and StatusID=StatusID. Some will write to tell me that this doesn't end up being as confusing as I'm making it out to be but I've been there, done that, and it's my suggestion that you avoid this approach.
There are potentially two things wrong with this field name. First, one might split this field into [ContactFirstName] and [ContactLastName], though an equally good argument might be made for keeping them together. It's just that sometimes you might want to display a name as First-Last and sometimes as Last-First. If you use only one field for the full name, you forfeit that flexibility.
The second problem, which we see a lot of, is the embedded space. Access allows you to do this, even though you will curse the day you added spaces once you start coding. It may seem like a little thing, but I've written so many thousands of lines of code that I start looking for shortcuts when typing. If you have spaces, you'll simply have to do more typing.
For example, consider this code for accessing fields in a DAO recordset:
sSQL = "SELECT FirstName, [Birth Date] FROM tblContacts" Set rst = dbs.OpenRecordset(sSQL, dbOpenSnapshot) Debug.Print rst!FirstName Debug.Print rst![Birth Date]
The extra typing I'm referring to is the set of square brackets. You must add them to both the SQL statement and to the recordset field assignment. It gets more funky when you start creating forms with the wizard, which will name the associated text box Birth Date like the field. At least it appears to. Behind the scene, it uses an underscore for the space, so the control is referenced as Birth_Date. This can be seen in the event sub procedure assignments. For example, the after update event of the [Birth Date] field would be ...
Private Sub Birth_Date_AfterUpdate()
I guess the point is that it's not necessary to add spaces and contributes nothing except more typing. Access allows you to supply a caption for a field, which automatically provides for a readable alias when the field is added to a query. If you want spaces, update the Caption property but don't put them in the actual field names.
[# of Cars] and [Got Milk?]
Everything mentioned above about spaces applies equally well to special characters, although there are some special characters that Access won't allow you to use, namely . ! [ ]. These characters mean something in SQL Statements. While the pound sign and question mark are not operators and may be used, they present other problems, as does starting a field with a number.
Below is a screen shot of a table I tried to create in SQL Server. Notice that I succeeded in creating a field that begins with a number, namely [1stField]. The brackets were added automatically by SQL Server as a quiet protestation to the name. In fact, SQL Server 7 didn't even allow field names that began with a number. In my example below, [Got Milk?] is allowed with the question mark provided the field is bracketed but it choked on the pound sign.
So while Access will allow me to name a field as [# of Cars], SQL Server will not. That means that in addition to making my code more complicated, this Access table cannot be upsized to SQL Server, should that whim ever strike me.
[Last time he/she went to a Rockie's baseball game]
My spell checker now tells me that Rockie's isn't correct, which is the first thing wrong with this field name. J The next is the spaces and special characters. In fact, I'm not sure why a single quote is allowed. That's going to wreak havoc in SQL Statements, I'm sure. I don't like the foreword slash either, though I can't say exactly what harm it may or may not do.
The real problem is the length of this field name. Somehow, there has to be a balance between too much abbreviation and too many words. Do you really want to have to type a field name that big? On the other hand, simply removing the spaces and vowels is no better. Who could decipher a field named LstTmHShWntTRckysBsbllGm. In my GOOD_table below, you see I renamed this field to BaseballNote. That's a good compromise: self-explanatory enough and not too verbose.
[Name] and [Date]
For these last two, it's probably not so obvious as to why I object. Try this as an experiment. Open any Access database and press Ctl+G to bring up the immediate window. (Bottom of the VBA environment window.) Now type a question mark and one of these field names, and then hit the ENTER key. Below are the results of these two field names, and other potentially poor choices.
?Name Microsoft Access ?Date 7/30/2009 ?Now 7/30/2009 9:42:20 AM ?CurrentUser Admin
These words represent built-in functions within Microsoft Access. The Date() function returns today's date. Now while you are allowed to create a field named [Date] it means you might end up writing SQL that looks like this ...
SELECT * FROM tblOrders WHERE Date=Date()
Does this SQL break? Not that I know of but it lends itself to ambiguity. The question is, why would you want a field named [Date]? What date is it? OrderDate or ShipDate? If you have a field named [Name], what name is it? Person name or product name or what?
Below is a screen shot of the corresponding GOOD_table. The ID field is not ambiguous, spaces are removed and no special characters were used. The field names are descriptive, yet short enough to be reasonable when typed. There's one other subtle but important difference between this screen shot and the previous table image. Can you see it?
Here's where I have to give credit to a great programmer-friend of mine, Jim Pilcher. It was nearly a decade ago when I did some work for him that he introduced me to his rule-of-thumb for sizing text fields, and I'm going to share it with you here.
Notice above in the BAD_table that the [Contact Name] field is 50 characters. What this tells you is that I accepted the Access default size and probably didn't think about what size it should really be. Now perhaps 50 characters represents 20 for First Name and 30 for Last Name, or maybe 25 for each ... or maybe I just didn't think about it at all. There's no way to know for sure.
Jim's convention when choosing field size is to use a binary value, 2 to the power of something. With this plan, field sizes will always be in this list: 1, 2, 4, 8, 16, 32, 64, 128, 255. (Access doesn't allow fields of size 256 because it needs the last bit for its own purposes.) I've added 48 and 96 to that list even though they are not powers of 2, because I like those sizes for certain fields. Have you figured out the advantage yet?
There are two benefits to be had with this sizing convention. First, as Jim pointed out, any time he sees a field in a table that does not have one of these sizes, he knows he did not personally add it. Maybe a consultant did, maybe the client, but certainly not him. It's like the parakeet in the coal mine.
Second, it lends itself to consistency. I've set the [FirstName] field size to 16 characters. I've found that to be adequate, knowing that 8 characters is going to be too small and 32 is overkill. Next time I create a [FirstName] field, I will go through the same process and I'll end up at 16 and my next [FirstName] field will be consistent with my last one. Same for [City], which is always 32 and [Email] which is always 64 characters. Consistency! This naming tends to make my field sizing more consistent.
For what it's worth, choosing powers of 2 is arbitrary. One could just as easily use a Fibonacci number series ( 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233) and accomplish the same goals.
Much of what has already been written about field names applies to tables. One thing I omitted in the discussion above is the use of capital letters. I personally prefer to use camel case, with the first letter of words jutting up like camel humps. Some (read Oracle developers here) choose to use all CAPITALS separated by underscores.
This, even more so than conventions already discussed, is a personal matter. However, WE ALL KNOW WHAT IT'S CALLED WHEN YOU WRITE AN EMAIL WITH ALL CAPITAL LETTERS: SHOUTING! Please don't use all capital letters when naming fields and tables ... I may have to work on your database some day and it offends me.
The next thing to consider is whether or not you will use table prefixes. Will your table be named tblEmployee or Employee? I actually don't have a strong feeling about this, but I guess I tend to prefix the table. Maybe because I don't like any ambiguity, and I sometimes use the attribute key word as an alias in my SQL statements, like this ...
SELECT FName + ' ' + LName AS Employee FROM tblEmployee
Finally, we must mention the question of plurals. Is the table tblEmployee or tblEmployees? One record of the table is one employee so the whole table is all the employees. My first inclination is to make table names plural. However, English sometimes sucks and the singular word "Category" does not become "Categorys" but "Categories". "Status" becomes "Statuses". I hate this inconsistency so much that I tend to go singular with table names.
Thanks for Letting Me Rant
Yes, this has been a fun article for me because I love to rant. In the end, though, it's up to you to decide how you will name your fields and tables. Honestly, in the long run, it probably isn't going to matter that much. However, my suggestion is to do yourself a favor and remove spaces & special characters to save on typing. Come up with a system for sizing fields consistently, don't make names too long and your tables will be fine.