Choosing the Right Disk Subsystem for your SQL Server

May 14, 2000

When comes the time of implementation, the choice of the server is crucial. Generally, the IT Team focus on processor speed, amount of RAM, amount of disk, security, but few IT Teams choose the right disk subsystem for their SQL Server. This paper is designed to explain briefly how to decide what kind of disks you need and to show you how to transform a snail server into a F1 server.

Basic technology

There are four possible disk accesses, when you talk about SQL Server data access:

  • Random read: for example, when you query a table with a restrictive Where clause (e.g. SELECT * FROM MyTable WHERE MyID = 4).
  • Sequential read: for example, when you query a table without a Where clause, or with a non-restrictive Where Clause (e.g. SELECT * FROM MyTable).
  • Random write: for example, when you insert, update, delete a single record.
  • Sequential write: for example, when you insert a large amount of data through BCP or DTS.

The log is 99% of times only sequentially written.

Depending on the type of access you encounter the most frequently, your disk subsystem will be different.

There are three basic possibilities, as far as disks and databases are concerned:

  • One disk
  • Many independent disks
  • Stripped disks with or without parity.

If you have one disk, you have the worst disk subsystem ever: your data and you log share the same physical space, and no real performance enhancement is possible.

With many independent disks, you can split your log and your data and different disks, and you can even split your data on multiple disks, enhancing its access.

With stripped disks, your data (and your log if it's on the same array) is spread on all the stripped disks enhancing though the sequential access or the multi-user random access.

Depending on your application, you decide on the technology you are going to use, but let's first have a look at possible disk RAID technologies.

RAID

RAID stands for Redundant Array of Independent Disks. There are two possible implementations: software or hardware. Software RAID is not an option I am going to consider, not as a starting point. The performance of software RAID is poor compared to hardware RAID, and does not offer all the needed flexibility and reliability. Always consider using hardware RAID, even if it is more expensive, your money will be well used.

We are going to consider the number of I/O request for each technology, to compare their performance.

RAID 0

RAID 0, or disk stripping without parity, is only a performance enhancer. Data is spread on multiple disks (at least two), as stripes. This system is great for read and write disk accesses. Unfortunately, its security is poor. If you lose one disk, you lose the whole partition. This system is great for data of OLAP systems and Decision Support Systems.

RAID 0 is not fault tolerant, but has no overhead.

# of I/O requests = # of read requests + # of write requests

RAID 1

RAID 1, or disk mirroring, is great for security and for sequential read and writes. Data is copied on two disks at the same time, so if one crashes, the other one can be used to recover the data of the lost one, without user intervention. This system is great for transaction logs.

For each write request to the disk array, 2 write requests to the disks have to be issued (one per disk), so:

# of I/O requests = # of read requests + 2 x # of write requests

 

RAID 5

RAID 5, known as data stripping with parity, is the best-known security and performance enhancer disk system. The performance of the RAID 5 are good, though not as good as the RAID 0 ones, but it offers security. Widespread, RAID 5 suits small OLTP systems (few users, few transactions per second).

Each time the disk controller receive a write request, it must:

  1. Read the data block
  2. Read the parity block
  3. Write the new data block
  4. Write the new parity block

So:

# of I/O requests = # of read requests + 4 x # of write requests

As you can see, RAID 5 consumes 2.5 times more I/O requests than RAID 0

RAID 10

RAID 10 is the conjunction of RAID 0 and RAID 1: mirroring of a stripped partition. This system offers top performance and top security. Though its price, it is the best disk subsystem for data of intensive OLTP systems, like e-commerce sites or ERP.

The mirror causes the only overhead for the RAID 10. It combines the performance of the RAID 0 and the security of RAID 1.

# of I/O requests = # of read requests + 2 x # of write requests

Disks and Array controller

Now remains the issue of the controller. The controller has a maximum number of I/O requests that it can handle. A modern controller can handle around 1200 I/O request per second. If you have a ratio of 2 reads for 1 write (2W = R), the following table gives you the maximum practical I/O requests your disk array can handle.

RAID

Max write requests to the controller (W)

Max read request to the controller (R)

RAID 0 (IO = R+W)

400

800

RAID 1 (IO = R+2W)

300

600

RAID 5 (IO = R+4W)

200

400

RAID 10 (IO = R+2W)

300

600

As you can see from the previous table, a RAID 0 array can handle twice the load of a RAID, and a RAID 10, one and a half time the load of a RAID 5. Think about it! RAID 5 is not as good as you could expect

Number of disks

Now the expected performance of your disk subsystem depends on the number of disks you have in your arrays

The last generation of 10 Krpm disks support up to 100 I/O per second, meaning, it is useless to plug more than 12 disks per array (if it's possible). Let's do another calculation considering you need to support 200 transactions per second on your system with a cache hit ratio of 95 %. Apply a load on your test database and measure the disk requests with the Performance Monitor and these counters:

  • Transactions per second
  • Physical disk transfers per second
  • Physical reads per second
  • Physical writes per second
  • Cache Hit Ratio

Imagine you measure that you have with a cache hit ratio of 98 % (which is better of what you need), 2.12 physical reads per transaction and 1.3 writes per transaction. That's 3.42 I/Os per transaction, so:

3.42 * 200 = 684 I/Os per second

2.12 * 200 = 424 reads per second

1.3 * 200 = 260 writes per second

RAID 0: # of drives = (R+W)/(Max IO per drive) = (424+260)/100 = 6,48 = 7 disks

RAID 5: # of drives = (R+4W)/ (Max IO per drive) = (424+4*260)/100 = 14,64 = 15 disks

RAID 10: # of drives = (R+2W)/ (Max IO per drive) = (424+2*260)/100 = 9,44 = 10 disks

So, now you know how things are calculated, here's a quick table to compare price, performance and security of the different RAID systems, compared to a one disk system, for the previous assumptions:

Disk system

Performance

Security

Price

Disk capacity (with 9.1 Gb disks)

One Disk

1

None

1

9.1 Gb

RAID 0

7

None

7

63.7 Gb

RAID 10

7

Very Good

10

91 Gb

RAID 5

7

Good

15

127.4 Gb

What you can see here is that even with equal performances between RAID 10 and RAID 5, RAID 10 is less expensive (less disks) and more secure. But if you buy the same number of disks between RAID 10 and RAID 5, you will have a 50% of performance enhancement with RAID 10, compared to RAID 5

Wrap-up

The following tables resume what we've just seen. Every application should be studied before taking any decision, and these tables are not God's truth, but just hints.

Type of application

RAID 0

RAID 1

RAID 5

RAID 10

OLTP

No

Log

XX

 

Intensive OLTP (e-commerce, ERP)

No

Log

X

XXX

DSS

XXX

Log

 

XX

OLAP

XXX

Log

 

XX

 

Read Performance

Write Performance

Security

Type of application

RAID 0

Excellent

Excellent

None

OLAP, DSS

RAID 1

Good

Very Good

Very Good

Log, small OLTP

RAID 5

Very good

Poor

Good

OLTP

RAID 10

Excellent

Excellent

Very Good

Intensive OLTP








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers