Increasingly, enterprises are looking at cloud services such as Amazon Web Services for their deployments. Datacenter virtualization is much more sophisticated than its desktop counterpart is, providing a whole suite of tools that allow you to control every aspect of your deployments. Join Sean Hull as he explores the MySQL – Amazon EC2 solution.
Introduction
Increasingly, enterprises are looking at cloud services such as Amazon Web
Services for their deployments. At first glance, you might be thinking about
virtualization based on how it works on the desktop. However, datacenter
virtualization is much more sophisticated, providing a whole suite of tools
from a web console to a full complement of command line tools. These allow you
to control every aspect of your deployments from the size of instances, operating
systems running on them, underlying disk storage, security and associated IP
addresses.
Elastic Block Storage
For databases, whether they be Oracle, MySQL, SQL Server or any other, the
first question that will probably come to mind is surrounding storage. Am I
limited to the onboard storage that my server comes with? Luckily no, or
database deployments in the cloud would not be what they are today.
It turns out that AWS has brought virtualization to storage too, with its
Elastic Block Storage or EBS solution. This allows you to configure your server
in any of a myriad of different ways.
EBS can be used for the root volume, making server spinup faster and allowing
your machine to be persistent, even if you shutdown the instance for months on
end. It also makes the bundling of new AMIs trivial, with the ec2-create-image
command line too. Lastly, you can take snapshots of your root volume as a full
root partition backup if you so desire.
Additional EBS volumes can be mounted as well, so for instance you could attach
a 100G volume to /dev/sdg1, partition it, write an ext3 filesystem there, and
then mount it on /data1. You could then modify your MySQL datadir setting from
the default /var/lib/mysql to /data1.
Single EBS volume not fast enough for you? Then create multiple EBS volumes,
and RAID0 stripe across them. Linux’s mdadm is your friend for these types of
configurations.
DNS & Elastic IP Addresses
The next obvious question that probably comes to mind is surrounding the
virtual nature of the IP addresses and hostnames assigned to the EC2 virtual
servers that you spinup. Each time you use ec2run to start an instance, the
instance will be assigned a new external IP address and an internal one.
Want your server to keep the same IP address every time? No problem, because
Amazon has Elastic IP addresses just for this purpose. Associate one to your server
and it will maintain the same name of the form
ec2-174-99-181-44.compute-1.amazonaws.com. So, that name will always point to
your server. Great, but what about internet network traffic? For reasons of
pricing, and performance both, you want your webserver to point to your
database using the INTERNAL IP address and name, not the external one. What’s
the solution?
As it turns out, it’s simpler than you might imagine. When you use the name
ec2-174-99-181-44.compute-1.amazonaws.com from outside of EC2 instances, you’ll
resolve to 174.99.181.44 the external IP address you’d expect. However, a
little documented feature of these elastic IPs is that on the internal EC2
network, that hostname will resolve to the INTERNAL IP. So your application can
continue to address servers by that name. For instance, where the database
connection string is specified in your app’s config file, you would use that
name, and it would automatically use the internal network. Inside the database, grants seem to want the internal IP name for example:
mysql> grant all privileges on myapp.* to 'myuser'@'ip-10-243-14-225.internal' identified by 'mypass';
That would allow the webserver 10.243.14.225 to hit the MySQL database and login. You could alternatively keep the webserver and the database in the same security group. This allows them to *SEE* each other without any port blocking. So that security group would not have to specify or open port 3306 at all. In fact if you use security groups this way, be sure not to open 3306 at all. Your grant would then use the wildcard hostname:
mysql> grant all privileges on myapp.* to 'myuser'@'%' identified by 'mypass';
Thus protecting you from a webserver reboot, or more webservers being added into the load balancer pool.
High Availability
High availability of the sort that MMM exposes, depends in large part on more
freely available internal IP addresses and the ability to make changes on the
network level, which are not available in AWS. Enter the ns_agent module, which
allows you to support an MMM configuration if you have access to a nameserver
on that network.
Perhaps one day Amazon will have their own integrated DNS service available to
customers, but for now these types of patchwork solutions are the only way to
get MMM to work in the cloud.
Amazon RDS
If you are using MySQL for your web application, you might well consider
Amazon’s RDS solution. It is basically a somewhat managed MySQL instance.
Out-of-the-box so to speak it provides automated backups, with
point-in-time-recovery, and a nice API tool to do just that
(RestoreDBInstanceToPointInTime). It also supports snapshots in addition to the
automatic backups. RDS also simplifies the database administration role by
managing the patching of your database. Lastly, there is an optional Multi-AZ
Deployment solution. Multi-AZ refers to multiple availability zones, and
provides automatic replication of your data to separate datacenters or
availability zones as Amazon calls them.
What’s left for a DBA to do? Well luckily, you still have your fingers on a lot
of the dials, and can tune the typical my.cnf settings which control memory
usage, query cache, sort buffers, innodb buffer cache, and so on. Of course you
still have to do query tuning, and put indexes in the all the right places. Having
a few of the administrative tasks taken care of lets you focus on the real work
of keeping your application running smoothly.
Amazon’s RDS solution, like other virtual servers in EC2 has a few other handy
features that you’ll want to take note of. You can dynamically change an
instance class, i.e. move from small, to medium to large to extra large server
with a call to rds-modify-db-instance. Those instances support 1.7G of memory
on the small size to 68G of memory on the largest large size! If that’s not
enough for you, you can also use the same command line tool to dynamically
change storage, adding as your needs grow!
Security
MySQL security is as important in the cloud as it is in traditional hosting
environments. However, in the cloud you may not have an enterprise wide
firewall as an additional layer of protection, so be doubly careful about which
ports are open! Use Amazon Security groups to set policies on open ports, and
source IPs. In the case of MySQL, if your various servers are all in the same
security group, they can all see each other. So for instance, the webserver and
database are both in the default security group, which opens port 22 for ssh to
the world. So the webserver will automatically be able to hit the databases
port 3306 without any additional configuration because they are in the same
security group.
Conclusion
MySQL is increasingly becoming the database of choice for backing internet web
applications. As those applications continue to move to the cloud, so will
MySQL databases. Virtualization brings tremendous benefits in terms of
flexibility and ease of deployments, but it also requires some rethinking. As
always, a slow and tempered adoption with lots of testing will likely bring the
best results.
Additional Resources
MySQL Enterprise for Amazon EC2
Best Practices for MySQL on Amazon EC2
Running MySQL on Amazon EC2 with EBS (Elastic Block Store)