dcsimg

Import XML Data Into Microsoft Access

February 20, 2004

Back in November 2003, I was assigned the task of creating an import routine that would refresh data in an Access 2000 database from an array of XML files.  For some reason, I thought it would be easy, as in "one touch import" or something.  As it turns out, it is not that easy.

I searched the Internet for suggestions, but found little.  I am not sure what I was hoping for, but I thought there must be some kind of example out there, or maybe even a third party utility to simplify the process.  I even asked our local Access MVP for a quick fix, but he assured me that I would have to write a custom import routine from scratch.

Well, I did slog through the code, all the while vowing to turn it into an article someday.  What follows is the product of that vow.  If you have been searching the Internet for an Import XML into Access solution, then congratulations.  If not, and you just stumbled upon this article by accident, then bookmark this page.  The way XML is being used these days, it is just a matter of time before you'll be needing it.

The Problem Defined

The project that inspired this article is the quintessential example of where XML is useful.  My client had outsourced their help desk management to a company using non-Microsoft technologies, hosted off-site, behind a firewall.  They wanted metrics reports, but the help desk web interface had no such provision.  As a compromise, an arrangement was made to have daily data dumps sent to my client so that they could create their own reports, using Microsoft Access.

Each night, help desk data files were zipped up and FTP'd to our site. One XML file would be provided for each PROBLEM ticket modified the previous day.  The Microsoft Access import utility I created would run on schedule, looking for any new zip files, unpacking them and processing each XML file in turn.  The associated download for this article contains only the code for importing the XML. If you are looking for code to unpack a zip file using VBA and a third party DLL, drop me a line and I will send it to you.

WARNING:  Each XML import will be unique.  The download for this article is an example, not an import utility.  At best, all you will be able to leverage from the downloaded code is some simple file looping routines and the general process idea.

One of the caveats of working with XML is that there needs to be agreement, a contract of sorts, between the parties exchanging data. The structure or schema of the data should not change, or else the import process will break. For the example considered in this article, the following schema was agreed upon:

XML file schema example 

You will notice in the schema image above that each child of the PROBLEM node contains child node(s) named DATA.  While this XML is well formed, the convention of naming all lower level child nodes DATA is a bad practice.  I loaded one of these XML files into Visual Studio .Net to see if I could create a schema against which incoming XML could be validated, but I received the following error message.
 
XML Schema error message 

I am sure that by labeling this a "bad practice" I am going to hear from non-Microsoft programmers, explaining why this is wholly justified.  Quite frankly, I will not be able to defend myself.  I do not know that much about XML.  What I do know is that I had wanted to use the DOM to validate incoming XML files against a predefined schema, but the construction of this XML made that difficult, if not impossible.  I guess my point is that what you see in this example is probably the worst-case scenario of what you should expect when importing XML data in to Access.

Additionally, it may not be obvious from the above illustration but each PROBLEM has one and only one parent record whose attributes are stored in the PROBLEM_DETAIL node. For each PROBLEM_DETAIL record, there are one or more child records in each of the following nodes:  CALL_NOTES, PROBLEM_NOTES, OUTAGES, RCA, PROBLEM_AUDIT_TRAIL and PROBLEM_HISTORY.

I started by creating the Microsoft Access tables and their relationships, with cascading deletes to all the foreign key tables.  The mapping between Node names and Table names was necessary to conform to my client's naming standards.  You could, of course, match table names to node names, as well as field names to child node names.  The code for this example aliases everything, which is a pain, but makes the solution more flexible.  Below is a summary of the XML node names, their corresponding Access table and the Access relationship diagram of the finished table schema. 

XML Node Name

Access Table Name

 PROBLEM_DETAIL

 PRBLMS

 CALL_NOTES

 PRBLM_CALL_NTS

 PROBLEM_NOTES

 PRBLM_NTS

 OUTAGES

 PRBLM_OTGS

 RCA

 PRBLM_RCAS

 PROBLEM_AUDIT_TRAIL

 PRBLM_ADT_TRL

 PROBLEM_HISTORY

 PRBLM_HSTY

Table relationship diagram of Access database

 
One last project requirement that influenced my code below is that incoming help desk records may or may not have a corresponding record already stored in my Access database.  For example, say a ticket was opened on Tuesday, handled on Wednesday and closed on Thursday.  For the sake of my project, this would mean that I would get an XML file for this issue on Wednesday, Thursday and Friday, each describing its status on the previous day.  My approach to this requirement was first to perform a delete to remove any existing record before inserting the latest values.  I found this easier than trying to identify which columns had changed and updating an existing record.  For my needs, this worked fine, but you will have to consider how you want to approach this question. 








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers