How to Merge Records from Two Identical Access Databases

A while back, I replied to a post at the Comp.Databases.MS-Access
newsgroup with the subject line of "What’s the simplest way to merge two
databases?" The user had attempted to bring in the data using the File |
Get External Data menu option, but he discovered that all he accomplished was
to import his tables. That is a far cry from merging or combining data and
understandably so. A properly designed database includes primary and foreign
keys and oftentimes, auto generated record ID numbers. As we will see, you
cannot just append these records to a single table.

The process of merging this kind of relational data can get
quite complicated, especially when you have numerous core-data tables as well
as lookup tables. While every case has its own peculiarities, I will explain
the basic steps using an employee time tracking application table schema.

Imagine the program has been deployed at two separate sites
with two separate back end mdb data files. Some tables contain unique data,
such as tblEmployee and tblTimeRecord, and some store not so unique data such
as tblStatus.

Let’s assume that each table has an autonumber ID field
which starts counting from 1. Both sites will have their own employee and time
records from 1 to n.

If you merge Employee records from Site B into Site A, you
will have to renumber the Site B records so their values don’t collide with any
Site A ID numbers.

If you change the EmployeeID for Site B, you will have to
cascade those changes to other related Site B tables, such as tblTimeRecord.
How do you do this?

NOTE: Before beginning, back up your data and always work
with a copy of the actual production data!

Consider these necessary changes to Site B database…

  1. Remove the autonumber from EmployeeID field in tblEmployee
    for Site B.

  2. Store largest EmployeeID from Site A in variable named lngMaxSiteA_ID.

  3. Make sure that Cascade Updates are turned on for every
    relationship

    in the Site B database that has a relation on EmployeeID.

  4. Run a simple UPDATE query to add the value, lngMaxSiteA_ID,
    to EmployeeID for every record at Site B tblEmployee. The Cascade Updates
    should propagate those changes throughout your database.

  5. Append all Site B records from tblEmployee to the Site
    A tblEmployee.

    (Even though Site A still has an autonumber on EmployeeID,
    the records will go in without error and the correct ID numbers will be
    assigned.)

  6. Repeat this process for every table with an autonumber
    ID.

That was easy, now wasn’t it? However, we have not gotten
to the hard part yet.

The REAL problem comes in when dealing with ‘lookup’ tables,
like the one I named tblStatus. Let’s say that each task in tblTask has an
associated status.

If each database (Site A and Site B) is installed with a
preset list of statuses, then there is no problem, but if the end users can
enter their own values, you will end up with something like this:


Site A: tblStatus
——————-
1 Open
2 Completed
3 Cancelled
4 Needs Approval

Site B: tblStatus
——————-
1 Completed
2 Open
3 Cancelled
4 Pending

In this case, you will need to create a mapping table to
show how these two status tables relate to each other. This tblMapStatus is
created in the Site B database and looks like this.


OldID (B) NewID (A) Status
————————————-
1 2 Completed
2 1 Open
3 3 Cancelled
4 Null Pending

See the problem here?

Site A has a status that we do not see in Site B. That is
not a problem from one aspect, namely that Site B does not use the status
‘Needs Approval’ so we do not need to map to it. But we do need its ID number,
#4, for our ‘Pending’ status.

Therefore, you need to go back to tblStatus at Site A, add
the missing status (‘Pending’) and make note of its newly assigned ID number.
In our example, that number is 5, so we update our tblMapStatus accordingly.


OldID (B) NewID (A) Status
———————————
1 2 Completed
2 1 Open
3 3 Cancelled
4 5 Pending

Now we will have to update our database at Site B by running
an UPDATE query…


UPDATE tblStatus
INNER JOIN tblMapStatus
ON tblStatus.ID = tblMapStatus.ID
SET tblStatus.ID = tblMapStatus.NewID

Once again, this process must be repeated for all
helper/lookup tables and where necessary, cascade updates must be turned on so
that changes imposed by the mapping tables will be propagated throughout the
database. When finished, the mapping tables, as well as the lookup tables such
as tblStatus can be discarded since their counterpart in the Site A database
now contains all possible values.

Well, that describes the basic process but as you can see, it
is a lot of work, and a lot of custom coding. I created a tool for one client
that was specifically designed to help their customers merge copies of the
database. My utility automated the process for this particular application,
even providing an interface to assign new values for lookup tables like tblStatus
described above. While my utility will not merge your data, it may help you
develop your own tool to merge your databases.

If you would like to see my code, drop me a line at
datafast@comcast.net.

»


See All Articles by Columnist
Danny J. Lesandrini

Danny Lesandrini
Danny Lesandrini
Danny J. Lesandrini currently works as the IT Director for Pharmatech Oncology Inc. at http://www.pharmatechoncology.com/. He holds Microsoft Certifications in Access, Visual Basic and SQL Server and has been programming with Microsoft development tools since 1995.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles