In our articles recently published on this forum, we have been covering high availability and disaster recovery features incorporated into Azure SQL Database. However, it is important to point out that Microsoft cloud platform also allows you to provide resiliency for your on-premises SQL Server deployments. In this article, we will present such functionality, available as part of the Azure Site Recovery solution.
Azure Site Recovery is a cloud-based service that facilitates the process of implementing disaster recovery plans. This is accomplished by delivering two sets of complementing features – replication and orchestration. Replication handles the synchronization of protected systems between a primary and secondary site. Orchestration manages the process of planned, unplanned, and test failover between the two sites through a sequence of steps (referred to as a recovery plan), controlling their order and, optionally, automating their execution (by taking advantage of Azure Automation runbooks or PowerShell scripts). The secondary site can be hosted in another on-premises location or it can be provisioned as a set of Azure Infrastructure as a Service (IaaS) virtual machines co-located on the same Azure virtual network. Azure Site Recovery is also capable of accommodating a variety of workloads and platforms. In particular, it supports both physical and virtual servers, with the latter being hosted on VMware ESX or Microsoft Hyper-V (with or without the System Center Virtual Machine Manager management layer) platforms.
While any of these recovery scenarios constitutes a viable option (depending primarily on your existing environment), we will focus here on the approach that leverages Azure IaaS deployment as the recovery site (we will explore others in our upcoming articles). In this case, there are several different design choices that allow you to deliver SQL Server multi-site resiliency:
- AlwaysOn Failover Cluster Instance (FCI) residing on-premises and hosting the primary replica of an AlwaysOn Availability Group with its secondary replica hosted in an Azure IaaS VM running a SQL Server instance. Note that the FCI must reside on-premises due to its dependency on shared storage, which is currently not supported in Azure virtual machines. In addition, keep in mind that AlwaysOn Availability Group requires SQL Server Enterprise, which has obvious license-related pricing implications. For Azure VMs, licensing requirements can be addressed either by deploying a gallery-based image (in which case the corresponding cost is included in per-minute compute charges) or by applying an existing license (BYOL), assuming that you have a Software Assurance agreement (which offers License Mobility). When leveraging your existing license, make sure to deploy a gallery Windows Server image (or a custom one) in order to avoid paying double (since gallery SQL Server images already include licensing charges). Finally, remember to choose the asynchronous commit for remote replica, to avoid negative performance impact due to the latency of your network connection to Azure.
- AlwaysOn Failover Cluster Instance (FCI) residing on-premises and hosting a local instance of a mirrored database, with its remote instance hosted in an Azure IaaS VM running a SQL Server instance. As we already pointed out, the FCI must reside on-premises due to its dependency on shared storage. However, in this case we no longer have the requirement to deploy SQL Server Enterprise, since Database Mirroring is also supported with Standard Edition, which significantly reduces licensing costs. While with SQL Server Standard there is the limit of two nodes per cluster, this is unlikely to be a significant constraint (since you are limited to a single primary and a single secondary). Note that Database Mirroring supports both synchronous (high safety) and asynchronous (high performance) modes, suited, respectively, for high availability and disaster recovery. Just as with AlwaysOn Availability Group, you should choose the latter to account for latency of your network connection to Azure.
- AlwaysOn Availability Group residing on-premises and consisting of two or more SQL Server instances replicating synchronously, with another remote instance hosted in an Azure IaaS VM running a SQL Server instance and configured with asynchronous commit. While this necessitates the use of the Enterprise Edition in both environments, it offers improved Recovery Time Objective (RTO) and Recovery Point Objective (RPO), as well as superior scalability (with support of up to 8 replicas, starting with SQL Server 2014). Keep in mind that this approach does not preclude the use of Failover Clustering Instance, although note that only one FCI partner can host a replica for a given availability group.
- Standalone SQL Server instance residing on-premises and replicating to an Azure Storage account by using Azure Site Recovery. The replication targets are page blobs containing the vhd files of Azure IaaS virtual machines hosting SQL Server instances that are brought online during failover.
Note that in this case, Azure Site Recovery handles not only the orchestration but also provides replication functionality.
In general, you can use a combination of SQL Server-specific high-availability and resiliency features along with the capabilities of Azure Site Recovery to design the solution that meets your business continuity objectives within the budget limits imposed by your financial goals. In addition, you need to take into consideration infrastructure prerequisites including, for example, Active Directory authentication and connectivity between on-premises and Azure. We will examine these requirements as well as describe the process of implementing more common resiliency scenarios in our upcoming articles.