Brief intro
In our previous
article, we looked at the clustering possibilities across two or more ESX
Servers. In this article, we will take a detailed look at the various possibilities
of building clusters across several physical and ESX hosts, since we weren't able to cover that in our
last article. In addition, we will take a quick look at upgrading clustered
Virtual Machines in all three scenarios. There is a very good chance that
you have an Oracle RAC test or development cluster on an ESX 2.5 version and
want to move over to the latest ESX 3.x version (the latest being ESX 3.0.2 as
of last week).
Clustering Oracle RAC Virtual Machines across physical and ESX hosts
I speak to several clients who are
running their production Oracle environments on VMware. The choice of running
Oracle RAC differs per organization, but I firmly believe that it is
possible to have a DSS (Decision Support System) running on Oracle RAC (which
normally has large transactions and fewer concurrent users) on
ESX Servers in production. In a typical OLTP environment, it might
not be smart to deploy RAC on ESX without careful planning, but a DSS can certainly run
fine. Moreover, there is no reason not to try it on your
ESX system. There are already plenty of test, development and staging deployments
running on ESX servers.
Now let's take a quick look at the tasks we
need to perform to build the cluster:
- Physical Node: It
must have two network adapters (NIC cards), it must have access to the same
storage (SAN LUN volumes) as the ESX server (this is for visibility:
both the guest Virtual Machines and the physical machine must be able to see the
shared volume), and the OS version and patches must be identical on all
platforms (virtual or physical). Also note that there shouldn't be
any multipathing software running on the physical node.
- Virtual Node:
Here the steps are pretty much the same. The ESX host must have at least two
physical NICs, although it's advisable to have three (one for the service console
and two teamed/bonded NICs for redundancy), and the VM must have two vNICs
(virtual NICs): one for outbound traffic and the other connected to a private VLAN
serving as the high-speed interconnect for cache fusion.
- Adding shared storage: Please follow the same steps as in our article on clustering VMs
across multiple ESX hosts. (On the physical node this is a simple
procedure: you assign your HBA to the mapped SAN LUN and you are done.)
On the Virtual Machine, click Add Storage and choose Mapped SAN LUN; the hard
disks point to the LUN using RDM (Raw Device Mapping). In the LUN selection,
choose the same LUN (Logical Unit Number) that is being accessed by the
physical node(s). Then select the virtual device node on a different SCSI
controller, thereby creating a new SCSI controller. Edit the new SCSI (1:0)
controller's properties and change the sharing mode to "physical". Carry out the same
steps for all the shared disks (OCR.vmdk, VOTINGDISK.vmdk, SPFILEASM.vmdk,
ASM01.vmdk and so on). Upon clicking Finish, you are done; the sketch after this list
shows what these settings look like in the VM's configuration file.
- The final step is obviously to install and
configure the Oracle RAC clusterware and database.
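For reference, the choices made in the VI client above end up as plain-text entries in the node's .vmx file. The snippet below is a minimal sketch of what those entries could look like for a single shared disk; the controller number, volume name (racvol) and file name (OCR.vmdk) are assumptions for illustration, not values the article prescribes:
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "physical"
scsi1:0.present = "TRUE"
scsi1:0.fileName = "/vmfs/volumes/racvol/RAC/OCR.vmdk"
scsi1:0.mode = "independent-persistent"
The scsi1.sharedBus = "physical" entry is the part that lets clustered nodes on different hosts, virtual or physical, open the same RDM-backed disk at the same time.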
Upgrading your RAC cluster
Upgrading your ESX server or your cluster
software is not an easy task. We will not go too deep into the ESX server upgrade
itself, as that is outside the scope of this article, but will concentrate on several
scenarios, such as upgrading clusters on one ESX server, across ESX hosts,
or on a typical heterogeneous cluster (physical and virtual nodes):
- Upgrading the cluster on one ESX host: Power off your VMs and let your system administrator upgrade the ESX server
from 2.5 to 3.x. Then upgrade your VMFS2 volumes to VMFS3: open the VI
client, select the volume and click "Upgrade to VMFS3". Upgrade the shared
RDM files if necessary, right-click each clustered VM in the inventory panel,
click "Upgrade Virtual Hardware" and restart the cluster. Should you
run into an error for any reason, try importing the backup vmdks like this:
vmkfstools -i /vmfs/volumes/vol1/<old-disk>.vmdk /vmfs/volumes/vol2/<RACDir>/<new-disk>.vmdk
Then rename the old-disk.vmdk, edit the .vmx file to point to the new-disk.vmdk
and restart the cluster.
Upgrading the cluster across ESX hosts: You can do this using shared pass-through RDMs or with shared
file systems.
- Using shared pass-through RDMs: Here you first upgrade your ESX server from 2.5 to 3.x. Via the VI
client, upgrade your shared pass-through RDM files from VMFS2 to VMFS3, then right-click
the cluster VM and select "Upgrade Virtual Hardware". Do the same for the
boot disk and you are done. Power on your cluster and verify the upgrade.
- Using files in shared (VMFS2) volumes: Do the following before upgrading to VMFS3:
vmkfstools -L lunreset vmhba<C:T:L>:0
vmkfstools -F public vmhba<C:T:L:P>
This makes the shared files public. Then perform the ESX host upgrades from ESX 2.5 to ESX 3.0.
Select the first upgraded node, open the Configuration tab and click "Storage":
upgrade the VMFS2 volumes in your cluster by clicking "Upgrade to VMFS3", create LUNs
for each of the shared RAC disks, create an RDM for each shared disk and import
the virtual disk into this RDM:
vmkfstools -i /vmfs/volumes/vol1/<old-disk>.vmdk /vmfs/volumes/vol2/<RACDir>/<rdm-for-vmrac01>/<myrdm.vmdk> -d rdmp:/vmfs/devices/disks/vmhbax.y.0
Here:
old-disk.vmdk: our RAC vmdk which is to be imported.
myrdm.vmdk: the new RDM for vmrac01 (our first node).
vmhbax.y.0: the LUN that backs the RDM.
Now edit the virtual machine's configuration file (vmrac01.vmx) to
point to the RDM instead of the shared file, so that the disk entry reads:
scsi<X>:<Y>.fileName = "<rdm-for-vmrac01>/myrdm.vmdk"
Restart the cluster and check that it is alive; the sketch after this list shows a quick way to do so.
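Once the nodes are back up, it is worth confirming that the clusterware and the database actually survived the upgrade. A minimal sketch, assuming Oracle 10g clusterware and a database that I'll call racdb (the name is illustrative), could look like this:
# verify that the clusterware stack (CSS, CRS, EVM) is healthy on this node
crsctl check crs
# show all registered cluster resources and their current state in tabular form
crs_stat -t
# confirm that every instance of the racdb database is up
srvctl status database -d racdb
Running crsctl check crs on every node and crs_stat -t once is usually enough to spot a node that did not rejoin the cluster cleanly.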
Conclusion
Although the VMware ESX server offers
several models of clustering and HA, we should not forget that mission-critical
application clustering like Oracle RAC cannot be fully replaced by OS-level or
even infrastructure-level clustering and high availability. The whole purpose
of demonstrating Oracle RAC on ESX is not only to strengthen the business
case for a consolidated setup for test and development purposes, but also
to show that you as an administrator can have "RAC running under your desk!" The fact
that we can set up, run, test and benchmark mission-critical
applications on our own premises gives us the power to stay on top of our
applications and businesses.