Saving Space to Increase Performance

Introduction

It’s easy to become complacent about saving disk space when
hard disk sizes keep growing and disk prices keep on dropping, but saving a few
bytes here and there can help improve SQL Server performance considerably.

If you have ever looked at an Execution Plan for a SQL Server query (and if you
haven’t, you should!) you will have seen that SQL Server produces an estimated
“cost” for executing queries.
This cost is not in money terms, obviously, but in terms of computer resources
required to run the query. The primary component of this costing is disk I/O, so
it stands to reason that if we can reduce disk I/O then we reduce the cost of
executing a query, and therefore increase performance. In this article we will
look at a few ways of doing this.
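A quick way to see the disk I/O behind a query is to ask SQL Server to report it. The following is a minimal sketch, assuming you are working in Query Analyzer (where GO separates batches) and that a table named t1, such as the one created in the next section, already exists.

```sql
-- Report the pages read by each statement that follows.
SET STATISTICS IO ON
GO

-- A simple table scan; the Messages output lists the logical and
-- physical reads that were needed to satisfy it.
SELECT COUNT(1) FROM t1
GO

SET STATISTICS IO OFF
GO
```

Running the same query before and after a space-saving change gives a rough measure of how many page reads the change has saved.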

Basics

Here is a fairly extreme example: I created two tables in
SQL Server 7 and loaded each with 10,000 4-byte strings.

create table t1 
   (v char(255) NOT NULL)

create table t2 
   (v varchar(255) NULL)

One table was created using the varchar(255) column type, and
the other using the char(255) type. The char(255) type uses a fixed length
to store data, so if the string you store is shorter than 255 characters, it is
padded out and the remaining space is wasted. A varchar column stores only the
characters you give it, plus a couple of bytes of overhead to record the length.

Running DBCC SHOWCONTIG against the tables showed that the table with fixed
length columns took up 334 pages (a page is 8 Kilobytes in SQL 7 or SQL 2000) of
storage space to store 10,000 rows. The version using varchar took up only 23
pages to store the same data. The reduced disk space means that any retrieval
operation, and particularly a simple table-scanning operation such as SELECT
COUNT(1) FROM…, runs much quicker against the smaller table.
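The article does not show how the tables were loaded or measured, so the following is only one plausible way to repeat the test. The loop, the value 'abcd' and the use of sp_spaceused are assumptions made for illustration; the tables t1 and t2 are the ones created above.

```sql
SET NOCOUNT ON

-- Load the same 10,000 four-character strings into both tables.
DECLARE @i int
SET @i = 0
WHILE @i < 10000
BEGIN
    INSERT INTO t1 (v) VALUES ('abcd')
    INSERT INTO t2 (v) VALUES ('abcd')
    SET @i = @i + 1
END

-- Compare the space reserved and used by each table (reported in KB).
EXEC sp_spaceused 't1', @updateusage = 'true'
EXEC sp_spaceused 't2', @updateusage = 'true'

-- DBCC SHOWCONTIG reports the number of pages scanned. Passing the
-- table ID works on both SQL Server 7 and SQL Server 2000.
DECLARE @id int
SET @id = OBJECT_ID('t1')
DBCC SHOWCONTIG (@id)
SET @id = OBJECT_ID('t2')
DBCC SHOWCONTIG (@id)
```

The exact page counts you see may differ a little from the figures quoted, depending on version and settings, but the gap between the two tables should be just as obvious.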

Although this is an extreme example, even small savings can
make a big difference. In the next example I re-created the two tables as
follows and loaded 10,000 rows into each. (To keep the examples simple I am
using one column in each table, but the same applies to tables with any number
of columns; it’s the total row length that matters.)

create table t1 
   (v char(4000))

create table t2 
   (v char(4040))

Table t1, when loaded with 10,000 rows, took up 5000 pages of
disk space. The row length for table t2 is precisely one percent longer than t1,
so you might expect table t2 to take up only one percent more space, but it
actually takes up double the number of pages that t1 does. The reason for this
is that SQL Server 7 can store up to 8060 bytes of data on one page, so there is
plenty of room for two rows from table t1 on each page. However, when we extend
the row length to 4040, then only one row will fit on a page, hence we end up
using twice as many pages. SQL Server insists on fitting a whole row on a single
page, and will never try to save space by splitting a row across two pages.
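The arithmetic behind those page counts can be sketched as follows. This is a simplification that uses the 8060-byte figure quoted above and ignores the few bytes of per-row overhead; the variable and column names are made up for the illustration.

```sql
DECLARE @page_bytes int, @row_count int
SET @page_bytes = 8060    -- usable bytes per page, as quoted above
SET @row_count  = 10000

SELECT
    row_bytes,
    -- Integer division: only whole rows fit, a row is never split.
    @page_bytes / row_bytes AS rows_per_page,
    -- Round up, because a partly filled last page is still a page.
    (@row_count + (@page_bytes / row_bytes) - 1)
        / (@page_bytes / row_bytes) AS pages_needed
FROM (SELECT 4000 AS row_bytes
      UNION ALL
      SELECT 4040) AS r
```

For a row length of 4000 this gives 2 rows per page and 5000 pages; for 4040 it gives 1 row per page and 10000 pages, matching the figures above.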

Again, that was an extreme example, but as a general rule:

  • The shorter the row length, the more rows you will fit on a
    page, and the smaller your table will be.

The effect is even more noticeable in SQL Server 6.5, where
pages are only 2 Kilobytes and the maximum row length is slightly under 2000
bytes.

Some space saving hints:

  • Use varchar instead of char unless your data is almost
    always of a fixed length, or is very short anyway.
  • Using Unicode double-byte datatypes such as nchar and nvarchar takes up
    double (Duh!) the space, so avoid them unless you really need them.
  • Use smallint and tinyint instead of int to save two or three bytes a time
    if you do not need the big numbers, and use integers instead of Float or
    Numeric wherever suitable.
  • Using smalldatetime instead of datetime saves four bytes, if accuracy to
    the nearest minute is good enough.
  • Avoid using GUID columns (16 bytes each) unless you really need them.
These are just a few examples, and you should familiarise
yourself with the whole range of datatypes in SQL Server, and choose from them
very carefully. You might choose to use the smallmoney data type instead of the
money type to save 4 bytes a time, but the values this data type can handle are
comparatively small (up to roughly 214,748), especially if you are dealing with
currencies like Japanese Yen or Italian Lira. If you choose a data type that you
will eventually outgrow, then this will cause more problems than it’s worth.
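To illustrate how easily smallmoney can be outgrown, here is a short sketch. The values are chosen purely for illustration, and the behaviour of the last statement assumes default session settings.

```sql
-- The largest value smallmoney can hold.
SELECT CAST(214748.3647 AS smallmoney) AS largest_smallmoney

-- A quarter of a million fits comfortably in money...
SELECT CAST(250000 AS money) AS fits_in_money

-- ...but is out of range for smallmoney, so under default settings
-- this statement should fail with an arithmetic overflow error.
SELECT CAST(250000 AS smallmoney) AS too_big
```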

Index considerations

Remember that indexes also take up space, so if you keep your
indexes small (create only indexes that you are going to use, use short columns
in indexes, and refrain from using long compound indexes if possible) you can
improve performance in this way too.

Read up on the fillfactor and pad_index options for indexes. The fill factor
controls how much blank space SQL Server leaves in its index pages to allow for
later additions, so if you are indexing a table that never, or very rarely,
changes, then you can set a high fill factor to save space and increase
performance.
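As a sketch of what this looks like in practice, the statement below builds an index with completely full leaf pages. The index name ix_t2_v is hypothetical, t2 is assumed to be the varchar(255) table from the first example, and the WITH syntax shown is the SQL Server 7 and 2000 form.

```sql
-- FILLFACTOR = 100 packs the leaf pages completely full, which suits
-- data that is read often and rarely modified.
CREATE INDEX ix_t2_v ON t2 (v)
    WITH FILLFACTOR = 100

-- For a table that takes regular inserts, leaving some free space
-- reduces page splits. PAD_INDEX applies the same fill factor to the
-- intermediate index pages as well. Shown commented out as an
-- alternative to the statement above:
-- CREATE INDEX ix_t2_v ON t2 (v)
--     WITH PAD_INDEX, FILLFACTOR = 80
```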

For tables that change more often, it’s important to do regular table and index
maintenance to keep your data compact and efficiently accessible.
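A minimal maintenance sketch might look like the following. It assumes a table named t2 that has at least one index, and is only one of several ways to approach the job.

```sql
-- Rebuild every index on t2 ('' means all indexes). A fill factor
-- of 0 reuses the value each index was originally created with.
DBCC DBREINDEX ('t2', '', 0)

-- Refresh the optimizer's statistics for the table.
UPDATE STATISTICS t2
```

Rebuilding indexes takes locks on the table while it runs, so this sort of job is usually scheduled for a quiet period.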

Other benefits

Keeping your data as compact as possible not only reduces
the size of your data on disk, it provides other benefits too:

  • You can fit more data into your cache RAM, increasing your
    cache hit ratio and reducing disk I/O even further.
  • Smaller and faster backups.
  • Less traffic when moving data over the network.
  • Faster joins (short columns are easier to compare than long
    ones)

Further Reading

All the following subjects are well documented on Books
Online, and a Quick Guide to most of the topics can be found at my own home page.

  • SQL Server Datatypes
  • Estimating space usage
  • Choosing efficient indexes
  • Reading query execution plans
  • Table and Index maintenance
Neil Boyle
Neil Boyle left school at the age of sixteen thinking that computers were things that only existed in Star Trek. After failed careers as a Diesel Mechanic, Industrial Cleaner, Barman and Bulldozer Driver he went back to college to complete his education. Since graduating from North Staffs Poly he has worked up through the ranks from Trainee COBOL Programmer to SQL Server Consultant, a role in which he has specialised for the past seven years.