Since its inception, the
Azure SQL Database offering has been evolving to accommodate a wide range of customer needs. This has resulted in two separate deployment models and two purchasing models, each with distinct service tiers representing unique performance and functionality characteristics.
Azure SQL Database Hyperscale is the latest manifestation of this trend, targeting scenarios where the storage requirements exceed the current database size limit of
4 TB. In this article, I will describe its architecture and primary benefits.
The initial implementation of
Azure SQL Database was geared towards minimizing management overhead, reflecting as closely as possible its
Platform as a Service designation. One of the implications of this approach was the abstraction of compute and storage capabilities, expressed in the form of
Database Transaction Units (
DTUs). Effectively, customers could simply specify the desired
DTU level, which, in turn, would translate into such parameters as the amount of processing power, disk
IOPS, and maximum database size. On the other hand, this imposed a tight coupling between compute and storage, constraining the ability to scale each of them independently. To address this challenge, Microsoft diversified its offering, which ultimately resulted in the following options:
- Deployment models:
Azure SQL Database logical servers (including
single databases) and
Azure SQL Database Managed Instances
- Purchasing models:
DTU-based and vCore-based

The DTU-based purchasing model is inherently tied to the
Azure SQL Database logical servers deployment model. It is available in three service tiers: Basic, Standard, and Premium.
These tiers represent combined performance and storage capabilities (and some distinguishing features, such as
In-Memory OLTP), further divided into individual
performance levels (such as
P15). There is limited support for storage scaling within higher-priced performance levels (starting with
S3): up to
1 TB with the
Standard service tier and up to
4 TB with the Premium service tier.
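As a minimal sketch of provisioning in the DTU-based model with the Azure CLI (the resource group, server, and database names below are placeholders, and the chosen performance level and size are illustrative):

```shell
# Create a Standard-tier database at the S3 performance level,
# scaling its storage beyond the default for that level.
# All resource names are placeholders for your own environment.
az sql db create \
  --resource-group my-resource-group \
  --server my-sql-server \
  --name my-database \
  --service-objective S3 \
  --max-size 1TB
```

Note that the performance level (`--service-objective`) determines compute, IO, and default storage together, which is precisely the coupling described above.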
The vCore-based purchasing model is supported with both the
Azure SQL Database logical servers and
Azure SQL Database Managed Instances deployment models. Until the introduction of
Azure SQL Database Hyperscale, it had been available in two service tiers:
General Purpose (geared towards standard business workloads) and
Business Critical (targeting highly demanding
OLTP workloads). It gives customers the flexibility to choose independently between a specific generation of hardware (
Gen4, based on the
Intel E5-2673 v3 Haswell 2.4 GHz processor, vs.
Gen5, based on the
Intel E5-2673 v4 Broadwell 2.3 GHz processor), the amount of compute resources (including virtual cores and memory), and the amount of storage, up to
4 TB (the upper limits for the number of virtual cores and the amount of memory differ between Gen4 and Gen5).
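In the vCore-based model, compute and storage are specified separately. A sketch with the Azure CLI (resource names and sizing are placeholders):

```shell
# Create a General Purpose database with 4 Gen5 virtual cores.
# Compute (--family/--capacity) and storage (--max-size) are
# chosen independently, unlike in the DTU-based model.
az sql db create \
  --resource-group my-resource-group \
  --server my-sql-server \
  --name my-database \
  --edition GeneralPurpose \
  --family Gen5 \
  --capacity 4 \
  --max-size 500GB
```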
Azure SQL Database Hyperscale is a new service tier (currently in public preview) of the
vCore-based purchasing model, available only for
single databases of the
Azure SQL Database logical servers deployment model. Its most evident benefit is support for database sizes of up to
100 TB (although databases in this tier do not actually have an explicitly set maximum size). However, it is important to note that this capability results from an innovative design, which delivers a wide range of additional benefits.
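Provisioning a Hyperscale database follows the same vCore-based pattern; a sketch with the Azure CLI (resource names and the vCore count are placeholders):

```shell
# Create a Hyperscale database. No --max-size is specified,
# since Hyperscale databases grow as needed rather than being
# capped at an explicitly set maximum size.
az sql db create \
  --resource-group my-resource-group \
  --server my-sql-server \
  --name my-database \
  --edition Hyperscale \
  --family Gen5 \
  --capacity 4
```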
The new design not only clearly delineates between compute and storage, but also includes a number of major changes that leverage this delineation in order to enhance performance and minimize the duration and impact of such operations as scaling, backups, or restores. The core components of the
Azure SQL Database Hyperscale architecture include:
compute nodes - host the optimized query processing engine, which is no longer responsible for storage management. In the preview, each deployment consists of two nodes, with the primary handling writes and the secondary available for reads and facilitating failover for high availability purposes. Both nodes implement
Resilient Buffer Pool Extension (
RBPEX) caching by utilizing local
SSD storage in order to minimize the dependency of
IO operations on remote storage.
log service node - relays log records generated by writes on the primary compute node to the secondary compute node and the page server nodes in order to facilitate database updates and ensure data consistency. This node is also responsible for persisting the transaction logs in a local cache and delivering them to
Azure Storage, which serves as a long-term repository.
page server nodes - host the partitioned storage engine, with up to
1 TB of database pages per node cached by using the
Resilient Buffer Pool Extension (
RBPEX). The engine on each node handles updates to its respective set of database pages, which are persisted by the storage nodes in Azure Storage.
storage nodes - provide long-term storage for database content, ensure data durability by performing intra-region and inter-region replication, and facilitate snapshot-based backup and restore operations.
The primary advantages resulting from this architectural design include:
- Practically instantaneous snapshot-based backups, regardless of database size and with no impact on compute operations. Using snapshots also minimizes the duration and impact of restore operations.
- The ability to offload read operations to the secondary compute node (it is possible to increase the number of secondary compute nodes by submitting a support request).
- Improved speed of vertical scaling of compute nodes, due to their separation from the storage nodes.
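Read offloading to the secondary compute node is driven by the client declaring read-only intent in its connection. A sketch using sqlcmd (the server, database, and credentials are placeholders):

```shell
# Connect with read-only application intent (-K ReadOnly) so the
# gateway routes the session to a readable secondary rather than
# the primary. The query reports whether the database is writable.
sqlcmd -S tcp:my-sql-server.database.windows.net,1433 \
       -d my-database -U my-user -P 'my-password' \
       -K ReadOnly \
       -Q "SELECT DATABASEPROPERTYEX(DB_NAME(), 'Updateability');"
```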
On the other hand, it is worth noting that some of the features available in the traditional
Azure SQL Database deployments, such as long-term backup retention or
Transparent Data Encryption (
TDE), are not available when using the
Hyperscale service tier. It is also not clear at this point what the effective
IO throughput delivered by the service will be. For preliminary pricing information, refer to the
Azure SQL Database pricing page.