Database Storage Recommendation


Database administrators and system administrators often ask the same question, ‘what storage do you recommend for mission critical databases, and why?’ Bo Chen examines the different device types available, making recommendations for different database platforms.

Background

Often times, DBAs or system
administrators will ask the same question, what storage do you recommend for
mission critical databases, and why?

Some companies may require migrating
database storage from SAN to NAS/NFS, or vice versa, for various reasons,
including corporate storage platform migration, storage rearchitecturing, cost
saving etc. Personally, I prefer SAN to NAS/NFS for database devices. Here’s my
ratings based on general metrics for performance and reliability purposes
(ratings descending):

1.    RAMSAN
(using SSD disks)

2.    SAN
(fiber channel SAN, not iSCSI SAN, and using SAS, SATA, FATA, to FCAL disks,
but excluding SSD disks)

3.    DAS
(direct attached storage using external disk array)

4.    DAS(direct
attached storage using local disk drives)

Device types available for databases

Regardless of what type of
storage you’re using, or what vendor/model the storage is based on, below are my
ratings for the storage format for databases (based on performance and
reliability, ratings descending):

1. Raw device

Raw device is not a precise
term, as any volume carved out using either third party volume manager (such as
Veritas) or local VM (such as IBM AIX LVM, or SUN disksuite, or RedHat LVM) are
logical devices, rather than raw partitions. Therefore, I personally prefer to
use the term "character device", as when "ls -l" the
device, the first letter is "c", or with a symbolic link under
"/dev/rdsk", rather than "b" for block device or with a
symbolic link under "/dev/dsk", which I’ll get to in the next part.

Sybase DBAs are more of a fan
of using raw devices than any other DBAs on other platforms, such as Oracle
DBA, Informix DBA or SQL DBA, again, for performance and reliability purposes,
obviously, some other DBA’s would argue, the manageability is poor(which I
disagree). For Oracle, the introduction of ASM really gave raw device a big
boost. Oracle DBAs normally deal with data files under a file system. ASM is
the one that not only provides the performance and reliability that we normally
see only in raw devices, but also hides the complexity (if any) of managing raw
devices, while still letting DBAs directly deal with files under ASM, bringing data
file management to the next level, which is "automated".

Note: raw disk I/O has been
deprecated by Linux, so use block devices in Linux instead.

2. Block device

To many Unix/Linux
administrators, block devices are more familiar to them than raw devices, as
normally only DBAs (especially Sybase DBAs, and now Oracle DBAs as well) would
require raw devices for databases. Most users only use files under file
systems, which are mounted using block devices that are either logical volumes
managed by LVM or partitions managed by fdisk, with a device name starting with
letter "b".

3. Files under file systems (or files under drive letters, for SQL DBAs)

This is the easiest and most
popular storage format for most of users. Some DBAs may even notice that
sometimes the database performs better in terms of speed when using files
instead of raw devices, because the file I/O is using some OS buffer. The down
side is, you’re running a risk of waiting for a long running fsck with
Unix/Linux or scandisk/Windows, especially when using DAS with local disks
either without write cache or write cache is not configured correctly, after a
server crash or reboot, which is definitely not acceptable for a mission
critical database.

Software RAID versus Hardware RAID

Generally I’d avoid using
software RAID, unless you can’t afford hardware RAID (which I really doubt, as
nowadays you can even get a PC with built in RAID controller for a bargain
price). The reason is obvious; you have to waste some CPU cycle and memory to
calculate parity (for RAID5), or do the mirroring or the striping, even though
the VM’s daemon might claim this is all done in the background, typically in
kernel mode. The second drawback is stability, as a kernel mode level of system
development usually has a high bar for the developers, and unlike application
software development, any bug in the system software could easily crash the
server.

The last drawback is the portability.
A disk group configured with Veritas Volume Manager for Windows cannot be
imported to a Linux machine. This is also true with Oracle’s ASM; I’d always
create ASM DG’s using the external redundancy option, even though ASM provides
its own way of striping and mirroring across different disk groups.

The advantages of using
hardware RAID are apparent: performance, efficiency and stability. The
components for the RAID implementations are embedded in the controller of the
storage device, which has its own embedded processors.

First , the RAID firmware is
executed on a dedicated processor embedded in the controller, which doesn’t
need to share CPU with the application or other OS components that run in
kernel mode, hence all the parity calculation, I/O handling have no CPU
contention.

Second, the use of a dedicated
cache on the controller, for reading and writing, provides another level of
asynchronized
I/O, which greatly improves performance.

Third, most of the hardware
RAID solutions are equipped with their own power supply and battery backup.
Working Together with the cache, this dramatically reduces the chance of losing
data due to a server crash or power outage.

RAID Matrix for Different Database Platforms

Below I’ll provide a table that
lists the best RAID levels for different database files on different database
platforms. I’m only listing Oracle, MS SQL and Sybase, as those are some of the
most popular databases used in mission critical environments. You can draw
similar conclusions for other databases such as UDB/DB2, PostgreSQL, MySQL,
Informix etc.

RDBMS

(OLTP databases)

RAID 0

RAID 1

RAID 5

RAID 1+0

tmpfs(using RAM with Solaris only)

RAMSAN/SSD

MS SQL

tempdb,ldf

Master,msdb

mdf

mdf,ldf

tempdb, ldf

Oracle

Redo log files

Control files

Data files

Control files or data files

Redo log files

Sybase

Log segment, tempdb

Master DB and other system DB’s except model and tempdb

Data segment

Index segment

tempdb

Log segment, tempdb

Recommended Device type for different database platforms

RDBMS

(OLTP databases)

Raw Device

Block Device

File System (or Drive letters under Windows)

Windows mounted folder

MS SQL

Best (as long as the drives are big enough or there are enough
drive letters)

Best(when drives have a size limit or running out of drive
letters)

Oracle

ASM on Raw devices for all Unix platforms

ASM on block device for Linux

Sybase

All Unix platforms

Highly recommended for stable queue used in Sybase replication
server

Linux only

Best for ASE 15 and above, with Direct I/O turned on

About NAS/NFS

Finally, you might have noticed
I didn’t list NAS/NFS for any database devices even though they’re supported by
all the databases mentioned above. Well, I’m still a little conservative about
using NAS for mission critical databases, as per my experience, it still can’t
compete with SAN or DAS in terms of speed and stability. However, I’d always
recommend to back up databases to NAS devices. Moreover, I never had any issues
using it for Sybase, MS SQL backups, and it also works perfectly for archive
log backups for my Oracle RAC.

Additional Resources

Oracle: Automatic Storage Management

Sybase: Performance on Sybase Devices Using
Direct I/O, Dsync and Raw Partition
MSDN: Mounted Folders

HP Oracle Database Machine
Data Management/Storage

»


See All Articles by Columnist

Bo Chen

Bo Chen
Bo Chen
Bo has been in the IT industry for about 15 years with main focus on database technologies. Bo started as a system developer and worked as system analyst, system integration manager, database consultant, database manager and is currently holding a position as database engineering team lead in DRW Holdings, LLC. Bo believes database systems are the most sophisticated system software and are so dynamic, which makes a database professionals' job the most interesting and challenging among all in IT fields. Bo loves his twin daughters and fishing.

Latest Articles