Very Large Databases and High Availability Evaluating Replication Options
November 8, 2006
Today's Database managers are more challenged than ever to find the appropriate tools and technologies for their mission-critical databases. When these databases are very large it requires a tremendous amount of effort to create a High Availability architecture that provides a continuously available solution that is robust and that works even during upgrades and maintenance with zero downtime. One of the challenges in designing a highly available Database infrastructure is examining and addressing all of the possible causes of downtime. Many of the world's critical databases that support mind-boggling transactions, whether it is EBay or Amazon, employ replicated database features, such that the failure of any component of a single database, including DISK, CPU, RAM or NETWORK, will not have a catastrophic failure on the entire system. In addition to replication, systems are developed to allow in-flight transactions to be automatically reconnected to another fully replicated database management system, which is most often located at an entirely different geographical location than the primary database. However, replication involves much more than just setup, as marriage involves much more than a honeymoon to survive, and besides designing the applications for in-flight transactions requires meticulous planning. This is where third party software will come into play. In this article, let us examine available replication options that are viable for your critical database.
Replication is the process of creating and maintaining a replicated version of database objects. Replication also improves performance by accessing a local database, rather than a remote server, to reduce the network traffic. Replication improves high availability of an application to function if parts of the distributed databases are down, since replicas of data might be accessible. Oracle server supports two forms of replication: Basic and Advanced replication. Basic replication is implemented using the standard create materialized view statement, which replicates only data; copies of this replicated data are read only. Oracle's advanced replication supports various configurations of updatable snapshot and update anywhere replication, but it is more difficult to configure. Advanced replication allows database objects like indexes and procedures, triggers, views and synonyms to be replicated; however, objects like sequences and data types like LOBS can't be replicated. In an asynchronous replicated environment, local updates are stored in a deferred transaction queue until the remote site becomes available and the transactions are applied when the remote site comes back up. Oracle also replicates the replication catalog or metadata that determines which objects are being replicated and how often their changes are propagated. This will ensure that there is no single point of failure in a replicated environment. If two sites change the same data within the data replication interval, you will have an update conflict. Oracle resolves the conflict by using the conflict resolution method configured when you first replicated that object. In addition, a DBA can resolve the conflict using DBMS_REPCAT package with procedures like ADD_UPDATE_RESOLUTION.
Oracle's Multi-master replication allows multiple sites, acting as peers, to maintain replicated master copies of objects. Each replicated site maintains a complete updateable copy of the replicated object. A DBA can configure participating sites as either Synchronous, or Non-Synchronous. These benefits do not come without a price. Configuring multimaster replication is a complex process. Its complexity is compounded by the need to implement conflict resolution processes, especially in the case of asynchronous propagation of changes. Larger installations could require a DBA to spend hundreds of hours configuring replication and may require a dedicated DBA to manage the environment.
Though the concept of Oracle replication is a good fit for certain applications (for example, to replicate a limited number of tables that are small in size), allowing many applications to depend on replication could be costly, due to less efficiency when large tables are replicated. Oracle's replication is relatively simple to setup, avoids 3rd party vendor complexities and it is free. The main disadvantage is that it does not support all data types, such as LOBs. Its complete refresh configuration for large tables (multi-million rows) is unacceptable. When configured for Fast Refresh, there is an increased risk of replication failure with large refresh volumes.
Therefore, in conclusion, Oracle's Replication may be a viable option only if used in a limited way, for a relatively smaller number of transactions. For Very Large Databases (VLDB) that require millions of records to be replicated in order to maintain a High Availability (HA) architecture, this may not be the ideal configuration. Let us discuss in the next phase if there any other ways to for a large number of tables to be replicated.
Oracle's Streams Replication:
Oracle 9i introduced a more flexible and efficient way of implementing replication using Streams. Contrary to what happens with Advanced Replication, there is no a Master Definition Site and all the sites play the same role. Every database involved in the Streams replication is in charge of capturing those changes involved in the replication and propagating them to the other sites. Oracle's Streams is a flexible, unified solution for information sharing across databases. Oracle Streams has three basic elements; capture, staging and consumption. These elements can be configured in different ways, depending on business requirements, so that replication of data takes place from one database to one or more databases even if the databases have different structure. One flexibility of streams is that it can handle database variations in hardware platforms or database releases or even charactesets easily.
In a nutshell, replication using Streams is implemented in the following way.
Another advantage of using streams for replication is that streams provide the ability to transform the stream. Oracle provides some easy to use declarative transformations, for the most common transformations such as changing the name of table at the destination. Besides, the data at the destination site can be subsetted based on the content. For example, you could implement a rule that only changes to the employee table, based on the department identifier column, be applied to particular table. Oracle Streams automatically manages these changes. However, main disadvantage of Streams is that it does not support all data types. If table has any single column with LONG or BLOB or ROWID datatypes, then Streams can't capture changes from that table. As a workaround, you can try creating a shadow table without the unsupported column at the source and replicate it. The DBA can populate it via triggers on the original table. Then set up Streams Replication between the shadow table and the destination table, possibly using rule-based transformations or DML handlers to resolve naming issues. Restrictions on Streams do not end there. There are many commands that Streams is not able to capture. Commands such as ALTER DATABASE, ALTER SYSTEM, SET ROLE, SET TRANSACTION, SET CONSTRAINT and some other commands that make changes to index-organized tables, changes involving user-defined types, are not captured. Oracle plans to extend the list of data types supported by Streams with each new release. Having discussed these limits in implementing Streams for Replication, it is worth discussing other third party replication products.
Veritas Database Replication:
With its focus on High Availability, Veritas Software makes storage management products that improve the availability of Oracle databases. One important approach Veritas employs for creating high availability systems is File System and volume-level replication. Veritas Software's replication products are VERITAS Storage Replicator for Volume Manager (SRVM) and the VERITAS Database Replication Option for Oracle (DRO). Although there are some functional differences between these products, they differ in both design and intent, and as such are suited for different replication applications. Let us discuss them briefly here, as implementation and troubleshooting of these products are not in the scope of this article. Please refer to Veritas knowledge base for installation.
The Database Replication Option for Oracle (DRO): It is used to replicate production data periodically to a secondary system at file system level. This replicated Database system can then be used for off-host processing, such as backups, data mining, reporting and decision support database systems. The Database Replication Option requires Veritas Database Edition for Oracle software packages to be installed on the primary and target (replicated) database servers. After an initial, full replication, DRO uses VERITAS' unique Block Level Incremental (BLI) technology to track and replicate only changed file system blocks between the primary and the secondary system. These replication operations occur at periodic intervals designated by the System Administrator On the secondary (replicated) system, database administrators can use Storage Checkpoint and Storage Rollback operations to make changes to the replicated data and then undo the changes. This means that you cannot only read the replicated data, you can write to it. This is very helpful for testing any production changes before we rollout them into the primary database server. Resynching the data is quick and does not downgrade the production Database server's performance. DRO supports occasional replication - it may be as frequent as several times a day. Another important aspect of this product is the DBA can update the replicated data (for testing or training purposes) then roll back to the previous image.
The Storage Replicator for Volume Manager (SRVM): This Veritas product is a real-time, ongoing data replication solution at Logical Volume level. This means that replication is completely transparent to the applications involved. Storage Replicator for Volume Manager is implemented as a Veritas Volume Manager (VxVM) module. Whenever data is written to the replicated volume, SRVM automatically sends it to one or more sites (depending on the configuration). SRVM works over any IP network, LAN or WAN, supporting up to 32 nodes in a replicated network. One of the advantages of SRVM is that the integrity of data on all systems is maintained through a variety of techniques, including write order fidelity and In-Band Control.
It is an ideal solution for applications requiring up-to-the-minute data replication.
Quest software has been providing an Oracle Data replication product since 1997 and it is called SharePlex. Many enterprises deploy this product. SharePlex uses the transaction details in the redo log files for the data replication. It does not use the database engine to replicate, there by reducing the load on the primary database server. As soon as the transactions are written to redo logs, the SharePlex module extracts them. Thus, it can update the changes at the destination site very quickly. Therefore, it is a real-time solution that replicates transactions as they occur without waiting for the commit. SharePlex software provides high-speed, log-based replication between Oracle instances. As stated earlier, SharePlex updates the target database in real-time, providing a reliable copy of the production instance that can be used for reporting, queries, extracts, backups, and high availability. If a transaction on the primary database is cancelled, Share-Plex replicates the rollback so that the target database is an accurate representation of the source database.
As changes are made to primary database, SharePlex continuously replicates them to an Oracle
Instance on the target (replicated) system. When target data is identical to source data, taking into account varying differences caused by delays in transport over a network, the two sets of data are synchronized. A source system can have many target systems. In addition, a source system can simultaneously serve as a target system, receiving data from other source systems. Likewise, a target system can receive data from many source systems, and it can simultaneously serve as a source system, sending data to other target systems.
The advantage of SharePlex for Oracle is its tolerance for outages. If the target Oracle database is unavailable, SharePlex queues data on the target system, allowing transactions to accumulate until an Oracle connection can be re-established. If the target system itself is unavailable, or if there are network problems, SharePlex stores the transactions on the source system until operations are restored. The DBA can control when the transactions are applied to a target system or the DBA can stop the post process. When ready for the transactions to be applied, the DBA can start the posting process again.
SharePlex enables a DBA to customize replication to specific needs. For example, the DBA can replicate an entire table, or a subset of its data (columns) beyond a firewall while protecting other, more sensitive data. The DBA can replicate different records to different locations and can configure SharePlex to interact with PL/SQL procedures that transform data before, or instead of, posting it to a target database.
How SharePlex works
SharePlex replicates data using a set of replication processes; Capture, Read, Export, Import and post. These processes start automatically, as they are needed, but they can be controlled with commands as well. Let us discuss these processes to understand how SharePlex replication actually works.
Capture Process: The Capture process reads the redo logs or archive logs on the source systems for changes to objects. Capture writes the data to the capture queue where it accumulates until the next SharePlex process is ready for it. There is a separate Capture process for each data source being replicated.
Read Process: The Read process operates on the source system to read data from the capture queue and add routing information to the data; after processing the data it sends it to the next queue. There is a separate Read process for each data source.
Export Process: The Export process operates on the source system to read data from the export queue and send it across the network to the target system. By default, a source system has one Export process for each of its target systems. If there are two target systems, there are two Export processes.
Import Process: The Import process is the second half of the Export/Import transport pair, operating on the target system to receive data and build a post queue. If there are two source systems replicating to a target system, there are two Import processes
Post Process: The Post process operates on the target system to read the
post queue, construct SQL statements for replicated operations, and apply them to
target objects. There is a Post process for each post queue on a target system, and
multiple Post processes can operate simultaneously on a system.
SharePlex determines which transactions/objects to replicate based on the information provided in the configuration file. SharePlex creates one or more messages that are sent from Capture to the Read process and from Read to all of the other replication processes in sequence. A message can reflect a SQL operation like INSERT, UPDATE, DELETE, and COMMIT, TRUNCATE. Large operations, like those on LONG or LOB columns, can require more than one message because a message has a size limitation. Other operations, such as array inserts of small records, have the inverse effect: There could be one record for numerous operations. For example, an array insert of 100,000 rows might be accommodated with only 1000 messages, depending on the data. On the target system, the Post process receives the messages and constructs standard SQL statements to post replicated changes to the target database.
Much of this process is controlled by internal tables that were created during the installation of SharePlex. The installation process of SharePlex on the source system and target system takes an hour or two and it is relatively simple. It is easy to administer and troubleshoot.
Customers looking for high-performance database replication for Oracle Database systems, requiring support for thousands of rows per second for load distribution, disaster recovery, migrations or for off-loading reporting should consider SharePlex for their mission critical database that support intensive transactions.
This article discusses various Replication technologies. IT managers need to plan how to use replication over the next several years before choosing a tool or developing their own tool using database features like streams.
Some factors to consider: How are replication needs expected to grow? Is your company considering using replication for situations such as customer service, running reports or database testing or Disaster Recovery? How often and how fast will you need to refresh the data in these alternate applications? Do your users agree to long down times while you upgrade the database server or migrate to different platform?
Issues to consider when choosing third party tools:
Replication is more than just backup, and replication management software is becoming more necessary in a world that's increasingly driven by information. Companies should examine whether this technology is the right choice for them.
The wise one once said: Replicate early and often... Infrastructure crashes can happen at any time for any reason!