MySQL in Amazon EC2 - Beginners Guide
August 17, 2010
Increasingly, enterprises are looking at cloud services such as Amazon Web Services for their deployments. Datacenter virtualization is much more sophisticated than its desktop counterpart is, providing a whole suite of tools that allow you to control every aspect of your deployments. Join Sean Hull as he explores the MySQL - Amazon EC2 solution.
Increasingly, enterprises are looking at cloud services such as Amazon Web Services for their deployments. At first glance, you might be thinking about virtualization based on how it works on the desktop. However, datacenter virtualization is much more sophisticated, providing a whole suite of tools from a web console to a full complement of command line tools. These allow you to control every aspect of your deployments from the size of instances, operating systems running on them, underlying disk storage, security and associated IP addresses.
Elastic Block Storage
For databases, whether they be Oracle, MySQL, SQL Server or any other, the first question that will probably come to mind is surrounding storage. Am I limited to the onboard storage that my server comes with? Luckily no, or database deployments in the cloud would not be what they are today.
It turns out that AWS has brought virtualization to storage too, with its Elastic Block Storage or EBS solution. This allows you to configure your server in any of a myriad of different ways.
EBS can be used for the root volume, making server spinup faster and allowing your machine to be persistent, even if you shutdown the instance for months on end. It also makes the bundling of new AMIs trivial, with the ec2-create-image command line too. Lastly, you can take snapshots of your root volume as a full root partition backup if you so desire.
Additional EBS volumes can be mounted as well, so for instance you could attach a 100G volume to /dev/sdg1, partition it, write an ext3 filesystem there, and then mount it on /data1. You could then modify your MySQL datadir setting from the default /var/lib/mysql to /data1.
Single EBS volume not fast enough for you? Then create multiple EBS volumes, and RAID0 stripe across them. Linux's mdadm is your friend for these types of configurations.
DNS & Elastic IP Addresses
The next obvious question that probably comes to mind is surrounding the virtual nature of the IP addresses and hostnames assigned to the EC2 virtual servers that you spinup. Each time you use ec2run to start an instance, the instance will be assigned a new external IP address and an internal one.
Want your server to keep the same IP address every time? No problem, because Amazon has Elastic IP addresses just for this purpose. Associate one to your server and it will maintain the same name of the form ec2-174-99-181-44.compute-1.amazonaws.com. So, that name will always point to your server. Great, but what about internet network traffic? For reasons of pricing, and performance both, you want your webserver to point to your database using the INTERNAL IP address and name, not the external one. What's the solution?
As it turns out, it's simpler than you might imagine. When you use the name ec2-174-99-181-44.compute-1.amazonaws.com from outside of EC2 instances, you'll resolve to 18.104.22.168 the external IP address you'd expect. However, a little documented feature of these elastic IPs is that on the internal EC2 network, that hostname will resolve to the INTERNAL IP. So your application can continue to address servers by that name. For instance, where the database connection string is specified in your app's config file, you would use that name, and it would automatically use the internal network. Inside the database, grants seem to want the internal IP name for example:
mysql> grant all privileges on myapp.* to 'myuser'@'ip-10-243-14-225.internal' identified by 'mypass';
That would allow the webserver 10.243.14.225 to hit the MySQL database and login. You could alternatively keep the webserver and the database in the same security group. This allows them to *SEE* each other without any port blocking. So that security group would not have to specify or open port 3306 at all. In fact if you use security groups this way, be sure not to open 3306 at all. Your grant would then use the wildcard hostname:
mysql> grant all privileges on myapp.* to 'myuser'@'%' identified by 'mypass';
Thus protecting you from a webserver reboot, or more webservers being added into the load balancer pool.
High availability of the sort that MMM exposes, depends in large part on more freely available internal IP addresses and the ability to make changes on the network level, which are not available in AWS. Enter the ns_agent module, which allows you to support an MMM configuration if you have access to a nameserver on that network.
Perhaps one day Amazon will have their own integrated DNS service available to customers, but for now these types of patchwork solutions are the only way to get MMM to work in the cloud.
If you are using MySQL for your web application, you might well consider Amazon's RDS solution. It is basically a somewhat managed MySQL instance.
Out-of-the-box so to speak it provides automated backups, with point-in-time-recovery, and a nice API tool to do just that (RestoreDBInstanceToPointInTime). It also supports snapshots in addition to the automatic backups. RDS also simplifies the database administration role by managing the patching of your database. Lastly, there is an optional Multi-AZ Deployment solution. Multi-AZ refers to multiple availability zones, and provides automatic replication of your data to separate datacenters or availability zones as Amazon calls them.
What's left for a DBA to do? Well luckily, you still have your fingers on a lot of the dials, and can tune the typical my.cnf settings which control memory usage, query cache, sort buffers, innodb buffer cache, and so on. Of course you still have to do query tuning, and put indexes in the all the right places. Having a few of the administrative tasks taken care of lets you focus on the real work of keeping your application running smoothly.
Amazon's RDS solution, like other virtual servers in EC2 has a few other handy features that you'll want to take note of. You can dynamically change an instance class, i.e. move from small, to medium to large to extra large server with a call to rds-modify-db-instance. Those instances support 1.7G of memory on the small size to 68G of memory on the largest large size! If that's not enough for you, you can also use the same command line tool to dynamically change storage, adding as your needs grow!
MySQL security is as important in the cloud as it is in traditional hosting environments. However, in the cloud you may not have an enterprise wide firewall as an additional layer of protection, so be doubly careful about which ports are open! Use Amazon Security groups to set policies on open ports, and source IPs. In the case of MySQL, if your various servers are all in the same security group, they can all see each other. So for instance, the webserver and database are both in the default security group, which opens port 22 for ssh to the world. So the webserver will automatically be able to hit the databases port 3306 without any additional configuration because they are in the same security group.
MySQL is increasingly becoming the database of choice for backing internet web applications. As those applications continue to move to the cloud, so will MySQL databases. Virtualization brings tremendous benefits in terms of flexibility and ease of deployments, but it also requires some rethinking. As always, a slow and tempered adoption with lots of testing will likely bring the best results.