One of the most important design decisions when planning an Azure Cosmos DB deployment is the logical partitioning of the data that will populate the target collections, graphs, or tables. The choice of partitioning model has both performance and pricing implications. In this article, we will explore the rationale behind these implications and review the partitioning options.
Azure Cosmos DB guarantees end-to-end latency at the 99th percentile for reads (under 10 ms) and writes (under 15 ms) within the same Azure region. It also offers a 99.99% availability SLA within a single region and 99.999% read availability for multi-region database deployments. Delivering these levels of service relies on the database partitioning model, which balances performance and availability requirements.
Before we focus on the partitioning aspects of Azure Cosmos DB, it is worth reviewing the physical and logical structure of its components. To start populating partitions with your data, you first need to create an Azure Cosmos DB account (and designate the type of API you will use to access its content), a database, and a container. A database serves as a logical grouping of one or more containers. Each container has specific storage capacity and throughput characteristics, which you designate at creation time and can adjust afterwards as needed. Your choices regarding capacity and throughput affect the physical and logical partitions of the container. Note that both types of partitions are administered automatically by the platform. On one hand, this eliminates the overhead associated with partition management. On the other, it underscores the importance of selecting the proper partitioning scheme, which dictates how partitions will be structured.
A physical partition is, in essence, a fixed amount of high-performance storage capable of delivering up to a specific amount of throughput, roughly equivalent to the performance of solid-state drives (SSDs). The platform also ensures high availability of physical partitions by creating multiple copies of their content and keeping those copies synchronized. The number of physical partitions created by the platform is determined by the requested performance, expressed in request units (RUs) per second that you specify. (A request unit is a Cosmos DB-specific measurement representing a combination of compute, memory, and IO resources dedicated to processing your requests.) A physical partition can contain one or more logical partitions; however, a logical partition cannot span multiple physical partitions. The maximum size of a logical partition is 10 GB. Because all data sharing the same partition key value belongs to the same logical partition, this limit makes the choice of the partition key critical.
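The relationship between partition keys, logical partitions, and physical partitions can be sketched in a few lines of Python. This is only an illustration under a stated assumption: it places logical partitions by hashing the partition key value, and the function name `physical_partition_for` and the choice of SHA-256 are hypothetical and are not the platform's actual placement algorithm.

```python
import hashlib


def physical_partition_for(partition_key: str, physical_partition_count: int) -> int:
    """Map a partition key value to the index of a physical partition.

    Hypothetical placement rule (an assumption for illustration): hash the
    key and take the result modulo the number of physical partitions.
    """
    if physical_partition_count < 1:
        raise ValueError("at least one physical partition is required")
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % physical_partition_count


# Every item with the same key value lands on the same physical partition,
# so a logical partition never spans two physical partitions.
first = physical_partition_for("customer-42", 4)
second = physical_partition_for("customer-42", 4)
```

Because the mapping depends only on the key value, several different keys may share one physical partition, while a single key can never be spread across more than one.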
The platform handles the distribution of logical partitions across physical partitions. How this distribution behaves depends on whether you designated the container (at creation time) as fixed or unlimited. Fixed containers have a maximum size equal to the size of a single logical partition (10 GB), are allocated to a single physical partition, and as a consequence are limited to 10,000 RU/s of throughput. This effectively limits their potential growth and performance capacity. Unlimited containers scale out automatically, but each of their logical partitions must still stay within the 10 GB limit. Automatic scaling works by splitting the content of a single physical partition that hosts multiple logical partitions into separate physical partitions once the storage limit of the underlying physical partition is reached. Following the split, each of the resulting physical partitions hosts a distinct subset of the logical partitions that originally resided together on the single physical partition.
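The split behavior described above can be modeled with a small Python sketch. The greedy balancing rule and the function name `split_physical_partition` are assumptions made for illustration; the platform chooses its own split points. What the sketch does preserve is the invariant from the text: logical partitions are moved whole and are never divided.

```python
from typing import Dict, Tuple

LOGICAL_PARTITION_LIMIT_GB = 10.0  # per-logical-partition limit cited in the article


def split_physical_partition(
    logical_sizes_gb: Dict[str, float],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Divide the logical partitions of one physical partition between two new ones."""
    for key, size in logical_sizes_gb.items():
        if size > LOGICAL_PARTITION_LIMIT_GB:
            raise ValueError(f"logical partition {key!r} exceeds the 10 GB limit")

    left: Dict[str, float] = {}
    right: Dict[str, float] = {}
    # Hypothetical balancing rule: place the largest logical partitions first,
    # each on whichever side currently holds less data.
    for key, size in sorted(logical_sizes_gb.items(), key=lambda kv: kv[1], reverse=True):
        target = left if sum(left.values()) <= sum(right.values()) else right
        target[key] = size
    return left, right


before = {"a": 8.0, "b": 6.0, "c": 5.0, "d": 1.0}
left, right = split_physical_partition(before)
```

After the call, every key from `before` appears in exactly one of `left` or `right`, which mirrors the statement that each new physical partition hosts a distinct subset of the original logical partitions.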
Creating an unlimited container requires allocating at least 1,000 RU/s and specifying the intended partition key. Alternatively, you can convert a fixed container into an unlimited one, as long as the container satisfied the same criteria at creation time. If it did not, transitioning to an unlimited container requires a data migration.
Beyond the maximum size of a logical partition, the choice of partition key is also of paramount significance for performance. The reason is that the RUs you specify are allocated at the container level, so the available throughput is distributed across all physical partitions. To fully utilize the allocated resources, each physical partition should be utilized at roughly the same level. Otherwise, you might experience degraded performance on more heavily utilized logical partitions due to throttling imposed by resource constraints.
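A short worked example makes the throttling effect concrete. The sketch below assumes that provisioned throughput is divided evenly among physical partitions; the function names and the demand figures are hypothetical and chosen only for illustration.

```python
from typing import List


def per_partition_budget(provisioned_rus: float, physical_partitions: int) -> float:
    """Throughput available to each physical partition, assuming an even split."""
    if physical_partitions < 1:
        raise ValueError("at least one physical partition is required")
    return provisioned_rus / physical_partitions


def throttled_rus(demand_by_partition: List[float], provisioned_rus: float) -> List[float]:
    """Demand on each physical partition that exceeds its share of the throughput."""
    budget = per_partition_budget(provisioned_rus, len(demand_by_partition))
    return [max(0.0, demand - budget) for demand in demand_by_partition]


# 10,000 RU/s spread over four physical partitions gives 2,500 RU/s each.
budget = per_partition_budget(10_000, 4)

# Total demand equals the provisioned 10,000 RU/s in both cases, but the
# skewed workload concentrates most of it on one partition.
skewed = throttled_rus([7_000, 1_000, 1_000, 1_000], 10_000)
even = throttled_rus([2_500, 2_500, 2_500, 2_500], 10_000)
```

In the skewed case, 4,500 RU/s of demand on the first partition exceeds its budget even though the container as a whole is not over-subscribed, while the evenly distributed workload fits entirely.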
To optimize performance while avoiding the partition size limits, the choice of partition key should reflect the data distribution and usage patterns. In other words, the volume of requests targeting data that shares the same partition key should remain within the RU limit associated with a single partition, and the volume of data that shares the same partition key should stay within the 10 GB limit.
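One way to apply these two criteria is to test candidate partition keys against a sample of your data before creating the container. The following Python sketch is a minimal illustration: the field names (`country`, `customer_id`, `size_gb`, `rus`), the sample values, and the per-partition RU limit of 2,500 are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping

SIZE_LIMIT_GB = 10.0  # logical partition size limit cited in the article


def summarize_by_key(items: Iterable[Mapping], key_field: str) -> Dict[str, Dict[str, float]]:
    """Total data size and request volume for each value of the candidate key."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: {"size_gb": 0.0, "rus": 0.0})
    for item in items:
        bucket = totals[item[key_field]]
        bucket["size_gb"] += item["size_gb"]
        bucket["rus"] += item["rus"]
    return dict(totals)


def keys_over_limit(items: Iterable[Mapping], key_field: str, ru_limit: float) -> List[str]:
    """Key values whose data size or request volume exceeds the given limits."""
    summary = summarize_by_key(items, key_field)
    return sorted(
        key
        for key, total in summary.items()
        if total["size_gb"] > SIZE_LIMIT_GB or total["rus"] > ru_limit
    )


# Hypothetical sample: per-customer data size and request volume.
SAMPLE_ITEMS = [
    {"country": "US", "customer_id": "c1", "size_gb": 6.0, "rus": 1500},
    {"country": "US", "customer_id": "c2", "size_gb": 5.0, "rus": 1200},
    {"country": "DE", "customer_id": "c3", "size_gb": 2.0, "rus": 300},
]

by_country = keys_over_limit(SAMPLE_ITEMS, "country", ru_limit=2500)
by_customer = keys_over_limit(SAMPLE_ITEMS, "customer_id", ru_limit=2500)
```

With this sample, partitioning by `country` concentrates 11 GB and 2,700 RUs under the value `US`, which breaks both limits, whereas partitioning by `customer_id` keeps every key value within them.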
See all articles by Marcin Policht