When designing databases in the 1960s and 1970s, I was taught to be especially careful of how much space each data element consumed. Since hard disks were tiny by today's standards (the IBM 360 125 came with 7.25Mb to 100Mb drives),13 I was hard-pressed to minimize the amount of data stored in each "record." I economized by "coding" whenever and wherever I could. For example, a single column (byte) might contain several different types of data, depending on the value to be stored. When SQL Server and other relational databases were introduced, disk space was still expensive, but not nearly as much as in the mainframe days. However, more experienced database architects still choose column widths based on past experience and with the knowledge that more data means poorer performance.
In situations where you need to store data in an "international" character set, whose characters are not supported by the ANSI set, you'll have to define your columns (and string literals) as Unicode. If you take this option, SQL Server stores 16 bits for each character instead of 8. It means the same four-character entry requires 4 bytes in ANSI and 16 bytes in Unicode columns. Just remember to prefix your string literals with "N", as in N'Fred', when building Unicode expressionsVisual Studio tools and wizards do this for you if they generate the query. There is another aspect to Unicode that might surprise you. When you define a column as nvarchar, you specify a maximum length, as shown in Figure 3.8.
Figure 3.8 Creating a table with a Unicode column.
This DDL code allocates 50 bytes of space in the data row to the Author's name. However, this also means that the name must be no longer than 25 (16-bit) characters.
Char vs. VarChar
I've heard the debates over use of fixed-length datatypes (like char and nchar) over variable-length types (like varchar and nvarchar). In my practices, I rarely use the fixed-width types because they're problematic in a number of respects. These types are fine for columns whose data is always the same number of characters, but if you slip and provide a value that's shorter (or longer) than the defined size, SQL Server either pads the remaining space or truncates the data (often without notice). You'll also find that it's tough to create expressions against fixed-length columns unless you match the length of both operands. For example, if your fixed-length column can contain four characters, you'll have to write an expression that has exactly four characters, or an equality expression will always return False.
IF MyFixedCol = 'Fred'
This returns TRUE if MyFixedCol contains "Fred".
IF MyFixedCol = 'Fred '
But this returns FALSE.
For this reason (and others), I prefer to use variable-length types. They don't consume much extra space (if any) and when the data length varies, this approach can actually save space. When defining variable-length character columns, you specify the maximum amount of space to reserve for the column. This does not preallocate this spaceit simply sets an upper limit. With SQL Server 2005, you can now define a varchar(max) or nvarchar(max) column that (like the TEXT datatype) can store up to 231 bytes and Unicode 230 bytes.
Decimal vs. Floating Point
When you record a money value in the database, it's best to understand the nature of the values you intend to storeespecially the precision. For those of you that took computer science in school, you know that it's not possible to store some values in binary. For example, [1/3] is stored as .3333 (with a never-ending list of "3"s.) While the value might be close, if you add [1/3] + [1/3] + [1/3], you'll get .9999you've lost some precision. Sure, with rounding, the result is returned as 1, but in some cases, you aren't permitted to round.
IMHO - In the early days of computing, clever programmers were able to strip off the extra precision (values less than a penny) and salt it away in another account. By the end of the week, they had accumulated a tidy sumespecially when millions of dollars were changing hands.
When you store a money value, be sure everyone knows the currency on which this value is based. This can help you from making a mistake when bidding on a project in the U.K., where the dollar is worth (at today's rate) about £0.529269. You might consider using a CLR-based user-defined type to keep the currency type stored with the valueespecially if a single column can hold values from more than one currency.
The decimal datatype is listed (as shown in Table 1.1) under "Exact Numerics". That is, it's designed to hold an exact value. When you declare a decimal or numeric (they are equivalent), you also can declare the precision and scale (it defaults to 18). The precision is the maximum total number of decimal digits that can be storedincluding the values on either side of the decimal point. To store a value of 1234.1234, you would need a precision of 8.
The scale indicates the maximum number of decimal digits that can be stored to the right of the decimal pointthis must be a value from 0 to the defined precision. The default scale is 0, so unless you define a scale, your value will be stored as a whole number (without a decimal portion). You won't be able to define a scale unless you define a precision as well. For example (as shown in Figure 3.9), to define a column with a precision of 10 and four decimal places, you would code:
Figure 3.9 Declaring a decimal column with specific precision and scale.
Working with Imprecise Numbers
When working with scientific data where you need more precision but not 100% accuracy (which sounds a bit strange), you can choose the approximate number data types. Sure, some numbers can be expressed exactly, but others can't due to binary round-off. In the case of the float datatype, you can define the precision and storage size by providing a value that determines the number of bits used to store the mantissa14 of the floating point number (in scientific notation). If you supply a value between 1 and 24, the float's precision is set to 7, and it takes 4 bytes to store the value. If you provide a value between 25 and 53, the float's precision is set to 15, and it takes 8 bytes to store the value. The default is 53. Note that SQL Server 2005 resets the mantissa setting to either 1 or 53, based on the value you supply.
14 Mantissa: the fractional part of a floating-point number.
Table 3.1 SQL Server Datatypes and Their Precision
Datatype
Bytes
Exact Numerics
These values are stored so the value stored is expressed exactlythey are not subject to binary round-off.
Integers
bigint
8
Integer (whole number) data from 2^63 (9223372036854775808) through 2^631 (9223372036854775807).
int
4
Integer r(whole number) data from 2^31 (2,147,483,648) through 2^31 1 (2,147,483,647).
smallint
2
Integer data from 2^15 (32,768) through 2^15 1 (32,767).
tinyint
1
Integer data from 0 through 255.
Bit
bit
1
Integer data with either a 1 (True), 0 (False), or NULL value.
Decimal
decimal
517
Fixed precision and scale numeric data from 10^38 +1 through 10^38 1.
numeric
Functionally equivalent to decimal.
Money
money
4
Monetary data values from 2^63 (922,337,203,685,477.5808) through 2^63 1 (+922,337,203,685,477.5807), with accuracy to a ten-thousandth of a monetary unit.
smallmoney
8
Monetary data values from 214,748.3648 through +214,748.3647, with accuracy to a ten-thousandth of a monetary unit.
Approximate Numerics
These values are stored in binary and are used when a precise but not 100% accurate value must be stored.
float
48
Floating precision number data from 1.79E + 308 through 1.79E + 308.
doubleprecision
8
Equivalent to float(53) (8 bytes).
real
4
Floating precision number data from 3.40E + 38 through 3.40E + 38.
Dates
datetime
8
Date and time data from January 1, 1753, through December 31, 9999, with an accuracy of three-hundredths of a second, or 3.33 milliseconds.
smalldatetime
4
Date and time data from January 1, 1900, through June 6, 2079, with an accuracy of 1 minute.
ANSI Character Strings
These values are stored as strings of characters in non-Unicode (ANSI) encoding (8-bits/character).
char
N
Fixed-length non-Unicode character data with a maximum length of 8,000 characters.
varchar
N
Variable-length non-Unicode data with a maximum of 8,000 characters.
varchar(max)
N
Variable-length non-Unicode data with a maximum length of 2^31 1 (2,147,483,647) characters.
text
N
Variable-length non-Unicode data with a maximum length of 2^31 1 (2,147,483,647) characters.
Unicode Character Strings
These values are stored in Unicode (16-bits/character).
nchar
N
Fixed-length Unicode data with a maximum length of 4,000 characters; 16 bits stored for each character.
nvarchar
N
Variable-length Unicode data with a maximum length of 4,000 characters.
sysname
128
System-supplied user-defined data type that is functionally equivalent to nvarchar (128)and is used to reference database object names.
nvarchar(max)
Variable-length Unicode data with a maximum length of 2^30 1 (1,073,741,823) characters.
ntext
N
Variable-length Unicode data with a maximum length of 2^30 1 (1,073,741,823) characters.
Binary Strings
These values are stored in binary with any attempt to encode them.
binary
N
Fixed-length binary data with a maximum length of 8,000 bytes.
varbinary
N
Variable-length binary data with a maximum length of 8,000 bytes.
varbinary(max)
Variable-length binary data with a maximum length of 2^31 1 (2,147,483,647) bytes.
image
N
Variable-length binary data with a maximum length of 2^31 1 (2,147,483,647) bytes.
Other Types
cursor
A reference to a server-side CURSOR.
sql_variant
N
A data type that stores values of various SQL Server-supported data types, except text, ntext, timestamp, and sql_variant.
table
A special data type used to store a rowset for later processing.
timestamp
8
A database-wide unique number that gets updated every time a row gets updated.
uniqueidentifier
16
A globally unique identifier (GUID).
xml
N
Names an XML schema collection. Can store up to 2GB of data.
Using the xml Datatype
For the first time, SQL Server 2005 introduces the new xml datatype. This means you're going to be able to store XML data in your table's column(s). Because xml is a "real" built-in type, you'll be able to use it when creating a table as a variable type, a parameter type, or a function return type. You'll also be able to use it in CAST or CONVERT. That said, I need to discuss where it makes sense to use xml typed data columns or xml typed arguments. One interesting use would permit you to pass lists of values to be used in an IN expression. Yes, you would need to write a function to convert this to a table-type variable.
Using the sql_variant Datatype
One of SQL Server 2000's innovations was "lifted" from Visual Basicthe "variant." The sql_variant datatype is unusual, in that it's designed to "morph" itself to most (non-BLOB) types. This means when you define a column as sql_variant, it can contain an integer (of any size), a string, a float, money, or even an xml structure. The sql_variant column value does not take on a type until you assign a value to it. I suggest you check out BOL for the rules and regulations involving this unique type.
Summary
This chapter gives you a kick-start on designing relational databases that can perform better, be easier to maintain, and be more successful. That's usually an unspoken goal of any database application project. Unless it's successful, you'll either be back working on it when you should be home relaxing or be out looking for another jobwithout a good recommendation from your last employer. I talked about both formal rules and informal suggestions to normalize your database. Understand that few of these rules are cast in stone, but until you fully understand them, the databases you (and your team) design and implement, populate and test, and polish and deploy won't keep bread on the table.
Buy this book Hitchhiker's Guide to Visual Studio and SQL Server : Best Practice Architectures and Examples By William R. Vaughn with Peter Blackburn Published by Addison-Wesley Professional Series: Microsoft Windows Server System Series ISBN: 0321243625; Published: 11/8/2006; Copyright 2007; Pages: T; Edition: 7 www.awprofessional.com/title/0321243625
Add databasejournal.com to your favorites Add databasejournal.com to your browser search box IE 7 | Firefox 2.0 | Firefox 1.5.xReceive news via our XML/RSS feed