When comes the time of implementation, the choice of the server is crucial. Generally, the IT Team focus on processor speed, amount of RAM, amount of disk, security, but few IT Teams choose the right disk subsystem for their SQL Server. This paper is designed to explain briefly how to decide what kind of disks you need and to show you how to transform a snail server into a F1 server.
Basic technology
There are four possible disk accesses, when you talk about SQL Server data access:
- Random read: for example, when you query a table with a restrictive Where clause (e.g. SELECT * FROM MyTable WHERE MyID = 4).
- Sequential read: for example, when you query a table without a Where clause, or with a non-restrictive Where Clause (e.g. SELECT * FROM MyTable).
- Random write: for example, when you insert, update, delete a single record.
- Sequential write: for example, when you insert a large amount of data through BCP or DTS.
The log is 99% of times only sequentially written.
Depending on the type of access you encounter the most frequently, your disk subsystem will be different.
There are three basic possibilities, as far as disks and databases are concerned:
- One disk
- Many independent disks
- Stripped disks with or without parity.
If you have one disk, you have the worst disk subsystem ever: your data and you log share the same physical space, and no real performance enhancement is possible.
With many independent disks, you can split your log and your data and different disks, and you can even split your data on multiple disks, enhancing its access.
With stripped disks, your data (and your log if it’s on the same array) is spread on all the stripped disks enhancing though the sequential access or the multi-user random access.
Depending on your application, you decide on the technology you are going to use, but let’s first have a look at possible disk RAID technologies.
RAID
RAID stands for Redundant Array of Independent Disks. There are two possible implementations: software or hardware. Software RAID is not an option I am going to consider, not as a starting point. The performance of software RAID is poor compared to hardware RAID, and does not offer all the needed flexibility and reliability. Always consider using hardware RAID, even if it is more expensive, your money will be well used.
We are going to consider the number of I/O request for each technology, to compare their performance.
RAID 0
RAID 0, or disk stripping without parity, is only a performance enhancer. Data is spread on multiple disks (at least two), as stripes. This system is great for read and write disk accesses. Unfortunately, its security is poor. If you lose one disk, you lose the whole partition. This system is great for data of OLAP systems and Decision Support Systems.
RAID 0 is not fault tolerant, but has no overhead.
# of I/O requests = # of read requests + # of write requests
RAID 1
RAID 1, or disk mirroring, is great for security and for sequential read and writes. Data is copied on two disks at the same time, so if one crashes, the other one can be used to recover the data of the lost one, without user intervention. This system is great for transaction logs.
For each write request to the disk array, 2 write requests to the disks have to be issued (one per disk), so:
# of I/O requests = # of read requests + 2 x # of write requests
RAID 5
RAID 5, known as data stripping with parity, is the best-known security and performance enhancer disk system. The performance of the RAID 5 are good, though not as good as the RAID 0 ones, but it offers security. Widespread, RAID 5 suits small OLTP systems (few users, few transactions per second).
Each time the disk controller receive a write request, it must:
- Read the data block
- Read the parity block
- Write the new data block
- Write the new parity block
So:
# of I/O requests = # of read requests + 4 x # of write requests
As you can see, RAID 5 consumes 2.5 times more I/O requests than RAID 0
RAID 10
RAID 10 is the conjunction of RAID 0 and RAID 1: mirroring of a stripped partition. This system offers top performance and top security. Though its price, it is the best disk subsystem for data of intensive OLTP systems, like e-commerce sites or ERP.
The mirror causes the only overhead for the RAID 10. It combines the performance of the RAID 0 and the security of RAID 1.
# of I/O requests = # of read requests + 2 x # of write requests
Disks and Array controller
Now remains the issue of the controller. The controller has a maximum number of I/O requests that it can handle. A modern controller can handle around 1200 I/O request per second. If you have a ratio of 2 reads for 1 write (2W = R), the following table gives you the maximum practical I/O requests your disk array can handle.
RAID | Max write requests to the controller (W) | Max read request to the controller (R) |
RAID 0 (IO = R+W) | 400 | 800 |
RAID 1 (IO = R+2W) | 300 | 600 |
RAID 5 (IO = R+4W) | 200 | 400 |
RAID 10 (IO = R+2W) | 300 | 600 |
As you can see from the previous table, a RAID 0 array can handle twice the load of a RAID, and a RAID 10, one and a half time the load of a RAID 5. Think about it! RAID 5 is not as good as you could expect
Number of disks
Now the expected performance of your disk subsystem depends on the number of disks you have in your arrays
The last generation of 10 Krpm disks support up to 100 I/O per second, meaning, it is useless to plug more than 12 disks per array (if it’s possible). Let’s do another calculation considering you need to support 200 transactions per second on your system with a cache hit ratio of 95 %. Apply a load on your test database and measure the disk requests with the Performance Monitor and these counters:
- Transactions per second
- Physical disk transfers per second
- Physical reads per second
- Physical writes per second
- Cache Hit Ratio
Imagine you measure that you have with a cache hit ratio of 98 % (which is better of what you need), 2.12 physical reads per transaction and 1.3 writes per transaction. That’s 3.42 I/Os per transaction, so:
3.42 * 200 = 684 I/Os per second
2.12 * 200 = 424 reads per second
1.3 * 200 = 260 writes per second
RAID 0: # of drives = (R+W)/(Max IO per drive) = (424+260)/100 = 6,48 = 7 disks
RAID 5: # of drives = (R+4W)/ (Max IO per drive) = (424+4*260)/100 = 14,64 = 15 disks
RAID 10: # of drives = (R+2W)/ (Max IO per drive) = (424+2*260)/100 = 9,44 = 10 disks
So, now you know how things are calculated, here’s a quick table to compare price, performance and security of the different RAID systems, compared to a one disk system, for the previous assumptions:
Disk system | Performance | Security | Price | Disk capacity (with 9.1 Gb disks) |
One Disk | 1 | None | 1 | 9.1 Gb |
RAID 0 | 7 | None | 7 | 63.7 Gb |
RAID 10 | 7 | Very Good | 10 | 91 Gb |
RAID 5 | 7 | Good | 15 | 127.4 Gb |
What you can see here is that even with equal performances between RAID 10 and RAID 5, RAID 10 is less expensive (less disks) and more secure. But if you buy the same number of disks between RAID 10 and RAID 5, you will have a 50% of performance enhancement with RAID 10, compared to RAID 5
Wrap-up
The following tables resume what we’ve just seen. Every application should be studied before taking any decision, and these tables are not God’s truth, but just hints.
Type of application | RAID 0 | RAID 1 | RAID 5 | RAID 10 |
OLTP | No | Log | XX | |
Intensive OLTP (e-commerce, ERP) | No | Log | X | XXX |
DSS | XXX | Log | XX | |
OLAP | XXX | Log | XX |
Read Performance | Write Performance | Security | Type of application | |
RAID 0 | Excellent | Excellent | None | OLAP, DSS |
RAID 1 | Good | Very Good | Very Good | Log, small OLTP |
RAID 5 | Very good | Poor | Good | OLTP |
RAID 10 | Excellent | Excellent | Very Good | Intensive OLTP |