Data Formatting, Working with XML Data in MySQL

In the “Working with XML Data in MySQL” article, we explored some of
MySQL’s XML capabilities, including the –xml command
line option, which provides a means of formatting query output as XML, as well
as a couple of new XML-specific functions that
were introduced in MySQL 5.1.5. In today’s
article, we’ll continue to explore XML support in MySQL. Specifically, we’ll
look at a nicer way to format our resultset so that the proper headers appear
in the column headings, rather than the supplied expressions. We’ll also
expand the last example of the previous installment to retrieve all the records
from the XML document, using a stored procedure.

Adding Our Own Headers to ExtractValue’s Output

If you
recall, our final example of the last article imported the contents of an XML
file into a variable using the LOAD_FILE() function and then
selected some fields from it using the ExtractValue() function:

CREATE TEMPORARY TABLE client_citizenship (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
xml_data TEXT NOT NULL
);

SET @xml = LOAD_FILE(“c:\client_citizenships.xml”);

INSERT INTO client_citizenship VALUES (NULL, @xml);

SELECT xml_data FROM client_citizenship LIMIT 1 INTO @xml;

SELECT ExtractValue(@xml, ‘//row[2]/field[1]/@name’),
ExtractValue(@xml, ‘//row[2]/field[1]’);

The last
SELECT statement extracted the first <field> of the second <row>
node:


</row>

<row> (row[2])
<field name=”client_id”>2</field> (field[1])
<field name=”date_of_birth”>1944-01-15</field>

This
produced the following resultset:

+———————————————-+—————————————-+
|ExtractValue(@xml, “//row[2]/field[1]/@name”) |ExtractValue(@xml, ‘//row[2]/field[1]’) |
+———————————————-+—————————————-+
|client_id |2 |
+———————————————-+—————————————-+

Notice that MySQL uses the expressions as the column headers. To format
the resultset with the field name in the header, we need to use a Prepared Statement. A
Prepared Statement is a
special type of SQL statement that has been precompiled. This means that when
the Prepared Statement is
executed, the DBMS can just run the Prepared
Statement
SQL without having to compile it first. Therefore it
normally reduces execution time to use a Prepared
Statement
over a regular query when you want to execute the same
statement many times. In this instance, the advantage to using a Prepared Statement is that we can set
the header in one statement, and then concatenate it to our select string:


SET @header = SELECT ExtractValue(@xml, “//row[2]/field[1]/@name”);
SET @qry = CONCAT(“SELECT ExtractValue(@xml, “//row[2]/field[1]”) AS “, @header, “;”);
PREPARE stmt FROM @qry;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Now our
data is formatted how we’re used to seeing it:


+———–+
| client_id |
+———–+
| 2 |
+———–+

Using the ExtractValue() Function Within a Stored Procedure

One
obvious limitation to the above approach is that the rows are hard-coded so,
unless you’re after specific values, it won’t be of much use for more generic
tasks, such as importing an entire XML document into a table. For that, we
need to use a stored proc.

MySqlDump

The MySqlDump client is a backup program originally written by Igor
Romanenko. It can be used to dump a database or a collection of databases for
backup or transfer to another database (MySQL or other). The dump typically
contains SQL statements to create the table, populate it, or both. However,
MySqlDump can also be used to generate files in CSV, other delimited text, or
XML format, which is what we’ll be doing here.

Although
there is more than one way to run MySqlDump, the syntax for what we want to do
is as follows:


mysqldump –xml databasename [tables]

You can
also extract to a file by using the standard UNIX/DOS method of outputting to a
file. Here is the command that I used to convert my client table into an XML
file:


C:>mysqldump –xml temp_table_article client > c:\sqldump.xml

Looking a the sqldump.xml, we can see that the MySqlDump client
includes a lot more information than was produced by the –xml flag alone.
Whereas it simply outputed each row, field names and values, MySqlDump
separates the output into “table_structure” and “table_data” sections:


<?xml version=”1.0″?>
<mysqldump xmlns_xsi=”http://www.w3.org/2001/XMLSchema-instance”>
<database name=”temp_table_article”>
<table_structure name=”client”>
<field Field=”client_id” Type=”int(10) unsigned” Null=”NO” Key=”PRI” Extra=”auto_increment” />
<field Field=”date_of_birth” Type=”date” Null=”YES” Key=”” Extra=”” />
<field Field=”gender” Type=”char(1)” Null=”YES” Key=”” Extra=”” />
<field Field=”logical_delete_indicator” Type=”tinyint(1)” Null=”NO” Key=”” Default=”0″ Extra=”” />
<key Table=”client” Non_unique=”0″ Key_name=”PRIMARY” Seq_in_index=”1″ Column_name=”client_id”
Collation=”A” Cardinality=”4″ Null=”” Index_type=”BTREE” Comment=”” />
<key Table=”client” Non_unique=”0″ Key_name=”client_id” Seq_in_index=”1″ Column_name=”client_id”
Collation=”A” Cardinality=”4″ Null=”” Index_type=”BTREE” Comment=”” />
<options Name=”client” Engine=”InnoDB” Version=”10″ Row_format=”Compact” Rows=”4″ Avg_row_length=”4096″
Data_length=”16384″ Max_data_length=”0″ Index_length=”16384″ Data_free=”10485760″
Auto_increment=”5″
Create_time=”2009-10-05 17:51:34″ Collation=”latin1_swedish_ci” Create_options=””
Comment=”” />
</table_structure>
<table_data name=”client”>
<row>
<field name=”client_id”>1</field>
<field name=”date_of_birth”>1976-02-12</field>
<field name=”gender”>M</field>
<field name=”logical_delete_indicator”>1</field>
</row>
<row>
<field name=”client_id”>2</field>
<field name=”date_of_birth”>1944-01-15</field>
<field name=”gender”>F</field>
<field name=”logical_delete_indicator”>0</field>
</row>
<row>
<field name=”client_id”>3</field>
<field name=”date_of_birth”>1956-06-04</field>
<field name=”gender”>M</field>
<field name=”logical_delete_indicator”>1</field>
</row>
<row>
<field name=”client_id”>4</field>
<field name=”date_of_birth”>1938-11-19</field>
<field name=”gender”>F</field>
<field name=”logical_delete_indicator”>0</field>
</row>
</table_data>
</database>
</mysqldump>

The
reason that we used this tool to create the XML data structure is because there
are already stored procedures available to insert MySqlDump XML data into a
table. One such proc was written by MySQL developer Alexander
Barkov. His xmldump_load
procedure extracts data from the XML elements and attributes found in a file
created using mysqldump –xml,
and inserts this data into the columns of a MySQL table. Here is the full
source of this stored procedure:


DELIMITER |

DROP PROCEDURE IF EXISTS xmldump_load |

CREATE PROCEDURE xmldump_load( file_name VARCHAR(128),
database_name VARCHAR(128),
table_name VARCHAR(128))
BEGIN
DECLARE xml TEXT;

DECLARE nrows INT;
DECLARE rownum INT DEFAULT 1;

DECLARE ncols INT;
DECLARE colnum INT DEFAULT 1;

DECLARE ins_list TEXT DEFAULT ”;
DECLARE val_list TEXT DEFAULT ”;

DECLARE tmp VARCHAR(255);

# load the XML file’s contents into a string
SET xml = LOAD_FILE(file_name);

# get the number of <row>s in this table
SET nrows = ExtractValue(xml,
‘count(/mysqldump/database[@name=$database_name]/table_data[@name=$table_name]/row)’);

# get the number of <field>s (columns) in this table
SET ncols = ExtractValue(xml,
‘count(/mysqldump/database[@name=$database_name]/table_data[@name=$table_name]/row[1]/field)’);

# for each <row>
WHILE rownum <= nrows DO

# for each <field> (column)
WHILE colnum <= ncols DO
SET tmp = ExtractValue(xml,
‘/mysqldump/database[@name=$database_name]/table_data[@name=$table_name]/row[$rownum]/field[$colnum]/@name’);
SET ins_list = CONCAT(ins_list, tmp, IF(colnum<ncols, ‘,’, ”));
SET tmp = ExtractValue(xml,
‘/mysqldump/database[@name=$database_name]/table_data[@name=$table_name]/row[$rownum]/field[$colnum]’);
SET val_list = CONCAT(val_list, ””, tmp ,””, IF(colnum<ncols, ‘,’, ”));
SET colnum = colnum + 1;
END WHILE;

SET @ins_text = CONCAT(‘INSERT INTO t1 (‘, ins_list, ‘) VALUES (‘, val_list, ‘)’);

SET ins_list = ”;
SET val_list = ”;

PREPARE stmt FROM @ins_text;
EXECUTE stmt;

SET rownum = rownum + 1;
SET colnum = 1;
END WHILE;
END |

DELIMITER ;

Examine
the code carefully and you’ll see that the xmldump_load proc uses the same XML
tools as we’ve used so far, such as the LOAD_FILE() and ExtractValue()
functions. What it adds are two nested While
loops to iterate through every row and column.

Here are
the steps that one would take to import the sqldump.xml file that we created
earlier:

The
first thing we need to do is import the xmldump_load proc into MySQL if this is
the first time that we use it:


C:> mysql db_name < text_file

C:> mysql temp_table_article < c:xmldump_load.sql

You can
also load the proc from within the MySql client using the source or . commands:


mysql> source c:xmldump_load.sql;

OR

mysql> . c:xmldump_load.sql;

The proc
accepts the three following input parameters:

the XML
input file name

the
database name to write to

the
table name to create

Now we
can supply the sqldump.xml file that we created earlier, along with a database
and table name:


mysql> call xmldump_load(‘c:sqldump.xml ‘, ‘client_info’, ‘client’);

That will insert the contents of the sqldump.xml file into the client
table of the client_info database.

Coming Soon…

A new SQL statement has been accepted for MySQL 6.0, and will be
available in version 6.0.3. LOAD XML greatly simplifies the task of importing data from an XML file into a
MySQL table, without having to use the stored procedure discussed previously. At
the time of this writing, version 5.4 is in Beta, so you’ll have to wait a bit
for that.

»


See All Articles by Columnist
Rob Gravelle

Robert Gravelle
Robert Gravelle
Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles