by Alex Polishchuk
One of the issues companies face in their operations is migrating a database to new hardware, a different operating system platform, or even a new database vendor. There are many steps and factors involved. This article describes the overall process and draws on interviews with companies that have been through it.
According to a recent Bloor Research survey, the data migration market is estimated at over $5 billion and growing. Companies perform database migrations in many different ways, such as extract-transform-load (ETL), replication, and manual scripts. The task gets more complex as data volumes grow on one hand and available downtime shrinks on the other.
Statistics presented in Data Migration in the Global 2000 (Bloor Research, September 2007) show the following picture:
- 16 percent of the data migration portions of projects were successful
- 37 percent had budget overruns
- 64 percent were not delivered on time
Because many companies have locations worldwide and have to be operational 24x7, migration has to be carefully planned and executed to minimize production downtime and maintain data integrity and database performance.
Our interviews with people from a wide range of industries have shown that legacy systems are one of the top reasons companies switch to new systems. The total cost of ownership (TCO) of a new system is typically lower, as out-of-date systems are more expensive to maintain. Some companies migrate their databases because an equipment lease is ending and the hardware is being replaced by a newer system. Once migration is complete, the target system is up to date and supported by the hardware and software vendors. The new system will also offer a wider range of features and be more environmentally friendly.
Since databases are more difficult to migrate than a set of files, the migration process and resource allocation have to be carefully planned and customized to the specific environment to minimize production downtime.
There are various types of database migration such as:
- Database Upgrade
Installing the latest database release
- Hardware Upgrade
Moving to a newer hardware / software release on the same platform
- New Platform
Moving to a different hardware / software platform
- OS Upgrade
Upgrading OS release on the current system
All these types of data migration are independent of each other, but they are often combined to make the overall process more efficient. One of the most common migrations is the database upgrade to the latest release, usually done in conjunction with a hardware and / or OS upgrade. Any of these migration paths involves installing a new database server, usually the latest release, on the target system.
Most of the interviewees said that their source and target platforms are different. One representative of a global provider of enterprise-class solutions and technologies said their target system is Windows, while the source is a combination of Windows, Linux, and UNIX. Mohican Laine from Overstock.com mentioned that their target platform is 64-bit Linux, while the source is 32-bit. According to Daniel Suciu at Vodafone, the target and source systems had different operating systems and even different databases.
Regardless of the migration type, there are common steps to be performed to ensure a successful migration. These steps are:
- Planning
During the planning phase, the source and target systems are analyzed for the amount of data to be transferred, the time it takes, available system downtime to perform the migration, security, system availability, and rollback options. There may be other factors specific to the current business requirements, such as government and industry regulations. The plan needs to have as much detail as possible, down to the actual commands to be executed.
The time required to transfer data to the new system is often overlooked, yet it is one of the most time-consuming operations. The maximum transfer rate of the FTP protocol over a 100 Mb/sec network is no longer sufficient when database sizes reach hundreds of gigabytes or more. A gigabit or faster network with jumbo frame support is required to transfer large databases within the typical time allowed. Also, if the data is transferred over the SSH protocol, which is often the case these days due to security concerns, the transfer will be even slower because of encryption overhead.
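A back-of-the-envelope estimate of the transfer window can be sketched as follows; the efficiency factor is an illustrative assumption for protocol overhead, not a measured figure:

```python
# Rough transfer-time estimate for copying a database over the network.
# The `efficiency` factor is an illustrative assumption for protocol
# overhead; SSH encryption typically lowers it further.

def transfer_hours(db_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Hours to move db_gb gigabytes over a link of link_mbps megabits/sec."""
    bits = db_gb * 8 * 1000 ** 3                      # decimal GB -> bits
    seconds = bits / (link_mbps * 10 ** 6 * efficiency)
    return seconds / 3600

# 500 GB over 100 Mb/sec vs. a gigabit link:
print(f"{transfer_hours(500, 100):.1f} h")   # ~13.9 h
print(f"{transfer_hours(500, 1000):.1f} h")  # ~1.4 h
```

Even under these generous assumptions, a 500 GB database ties up a 100 Mb/sec link for more than half a day, which is why the gigabit network mentioned above is a practical prerequisite.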
Another important aspect is source and target database compatibility, which depends on the operating system platforms involved, the database vendors, and the release levels. For instance, migrating an Oracle database from release 9.2 to 10g on the same platform offers fewer options than migrating from 10g to 11g. Migrating across platforms or database vendors is even more limited and usually means exporting the data to a file and importing it on the target, or using replication or gateway technology.
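The export-to-file / import approach can be sketched as follows. This is a minimal illustration only: Python's built-in sqlite3 stands in for both the real source and target databases, and the table and column names are hypothetical.

```python
# Sketch of the export-to-file / import approach used for cross-platform
# or cross-vendor moves. sqlite3 stands in for the real source and target
# databases; the `customers` table is a hypothetical example.
import csv
import io
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.executemany("INSERT INTO customers VALUES (?, ?)",
                   [(1, "Acme"), (2, "Globex")])

# Export: dump the rows to a portable flat file (CSV here).
buf = io.StringIO()
writer = csv.writer(buf)
for row in source.execute("SELECT id, name FROM customers ORDER BY id"):
    writer.writerow(row)

# Import: load the flat file into the target schema.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
buf.seek(0)
for id_, name in csv.reader(buf):
    target.execute("INSERT INTO customers VALUES (?, ?)", (int(id_), name))

count = target.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(count)  # 2
```

Real vendor utilities (Oracle export/import, bulk loaders, gateways) replace the CSV step, but the shape of the work is the same: serialize to a neutral format on the source, then reload on the target.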
- Target preparation
The target system needs to be installed and configured to become the primary one after the migration is complete. All the necessary patches have to be applied. This is done according to the hardware and software specifications and any specific business requirements, such as security, backup, and others. Applications will have to be pointed to the new system, which has to be coordinated with network and system administrators.
- Testing
This step involves testing the data migration plan. Usually this is done on systems that are not part of the production environment, to prevent any business disruption; for instance, a development system on the source side and the new system on the target side. During this step, both systems are monitored for the time the migration takes and for resource utilization. Upon successful completion of the test, the target system is verified to ensure data integrity, accessibility, and system performance.
Ideally, the test data volume should be very close to that of the production environment, if not a backup of that environment. This is necessary for proper database performance optimization and tuning, because different database releases may exhibit rather different behavior in this respect. Data statistics must be collected on the target system at all levels to ensure proper resource utilization, since data transfer rates and access times will differ between the systems.
Any necessary corrections have to be reflected in the plan. Testing may have to be repeated several times due to the high system complexity.
- Production migration
The production data migration is scheduled and performed according to the plan once testing has been successfully completed. If the production data can be migrated within the allowed downtime, the whole process is much easier to manage; otherwise, additional steps have to be taken to ensure successful completion.
The rollback option has to be ready to go in case the new system fails for any reason and there is no time left to correct the problem.
Be sure to collect data statistics as mentioned before and set the necessary user and object level privileges. Both of these are among the most common issues with the target system.
- Verification
Similar to the test migration, data verification has to be done on the target system to ensure data integrity and optimal system performance. Both systems may have to run in parallel before the old one is decommissioned. In this case, a data flow has to be set up between the two systems to keep them identical. This is usually done with the target system configured as master and the old one as slave, so the data flows from the new system to the old; replication is the ideal tool for this scenario. In the case of bi-directional data flow (active-active / multi-master), mechanisms have to be in place for conflict detection and resolution.
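Data verification can be as simple as comparing row counts and checksums per table between the two systems. The sketch below assumes that pattern; sqlite3 in-memory databases stand in for the real source and target, and the `orders` table is a hypothetical example.

```python
# Hedged sketch of post-migration verification: compare a row count and a
# digest of the rows (in key order) for each table on both systems.
# sqlite3 stands in for the real databases; `orders` is a hypothetical table.
import hashlib
import sqlite3

def table_fingerprint(conn, table):
    """Return (row count, digest of rows in first-column order)."""
    h = hashlib.sha256()
    rows = conn.execute(f"SELECT * FROM {table} ORDER BY 1").fetchall()
    for row in rows:
        h.update(repr(row).encode())
    return len(rows), h.hexdigest()

src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")
for db in (src, tgt):
    db.execute("CREATE TABLE orders (id INTEGER, total REAL)")
    db.executemany("INSERT INTO orders VALUES (?, ?)",
                   [(1, 9.99), (2, 45.00)])

# After migration, the fingerprints on both systems should match.
assert table_fingerprint(src, "orders") == table_fingerprint(tgt, "orders")
print("orders: OK")
```

In practice the digest would be computed inside each database (so the rows are not pulled over the network twice) and compared per table, but the principle is the same.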
Additional tasks may have to be performed depending on the type of migration, such as creating the new database structure when moving to a different database platform, or data cleansing to remove obsolete or redundant information. Data cleansing and consistency were among the top challenges named by the respondents, along with compatibility and time constraints.
There are also third-party solutions, such as GoldenGate Software, whose Transactional Data Management (TDM) platform enables high volumes of transactional data to be moved continuously between enterprise systems with sub-second latency, preserved data integrity, and minimal impact on those systems.
Overstock.com selected GoldenGate Software to move customer data from the Oracle 9i databases supporting its retail site into a Teradata Warehouse running the Teradata V2R6 database. Mohican Laine said that they evaluated Quest's SharePlex software and also Oracle's embedded database replication: "We went with GoldenGate due to implementation simplicity and ease of use."
Another solution, the Data Migration Suite from Informatica, is a set of products that can be used in various scenarios and is available under either a perpetual license or a project-bound license limited to a term of 6, 12, or 18 months.
Celona offers data migration software that can be used in various migration scenarios, such as event based, incremental, bulk load and others.
Danes John said that while working for Vaman Technologies they analyzed most of the available migration tools, "so in some cases we can use [a] third party tool and [in] others we developed in-house."
There are other vendors that provide similar solutions that may be more applicable in your situation. These solutions can assist with various data migration scenarios and reduce downtime or eliminate it entirely.
Database migration is a complex project that has to be thoroughly analyzed. Determine the migration type and the tools and resources needed to complete it, and fully review and understand the various approaches before choosing one.
Migration can be done in-house or outsourced to a consulting practice. Consider using third-party solutions to avoid hand-coding migration scripts, which will be time-consuming and rather complex if data cleansing is involved. In addition, these solutions can reduce or eliminate downtime and facilitate replication between the systems. Outsourcing brings in best practices and the appropriate tools to complete the project.
About the author
Alex Polishchuk is the founder and president of Advanced Computer Consulting (www.advcomputerconsulting.com), which provides database consulting services. Alex has over fifteen years of professional experience administering, designing, developing, and implementing database applications in various industries, at companies ranging from small businesses to Fortune 50 corporations. Alex's primary areas of expertise are database security and performance optimization and tuning. He can be contacted at firstname.lastname@example.org.