XML was created to facilitate the
transportation of data between systems and businesses in a language nonspecific
way. When first introduced, a script or application process had to parse the
XML data and convert it into valid format for the database and underlying
system. Later, databases started to support XML so that no conversion was
necessary. Today, well look at XML support in MySQL and learn how to import
and export data in XML format.
Exporting Data as XML Using the --xml Option
Prior to version 5.0, MySQL XML support was
limited to exporting data in XML format using the MySQL command line client. Executing
a command or query using the --xml or -X option told the MySQL client utility to produce the output as XML data.
For instance, the following line would output all of the database variables
that begin with the string version:
C:\>mysql -u <userid> -p<password> -e
"SHOW VARIABLES LIKE '%version%'" -xml
The resulting XML contains the following
standard format:
-
The entire dataset is contained within the
<resultset> node.
-
Each row is contained within a <row> node.
-
The columns are contained within a <field>
node.
-
The column name is stored in the name attribute.
-
The column value is the <field> text node.
<?xml version="1.0"?>
<resultset statement="SHOW VARIABLES LIKE '%version%'"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<field name="Variable_name">protocol_version</field>
<field name="Value">10</field>
</row>
<row>
<field name="Variable_name">version</field>
<field name="Value">5.1.30-community</field>
</row>
<row>
<field name="Variable_name">version_comment</field>
<field name="Value">MySQL Community Server (GPL)</field>
</row>
<row>
<field name="Variable_name">version_compile_machine</field>
<field name="Value">ia32</field>
</row>
<row>
<field name="Variable_name">version_compile_os</field>
<field name="Value">Win32</field>
</row>
</resultset>
For comparison, here is the above output again
as a tabular resultset:
+-------------------------+---------------------+
| Variable_name | Value |
+-------------------------+---------------------+
| protocol_version | 10 |
| version | 5.1.22-beta-debug |
| version_comment | Source distribution |
| version_compile_machine | x86_64 |
| version_compile_os | suse-linux-gnu |
+-------------------------+---------------------+
The same syntax can be applied to any SELECT
statement. In my last article, entitled Make Your MySQL Queries More
Efficient with Temporary Tables, the initial query populated the
client_citizenship temporary table with records whose logical_delete_indicator
was zero:
INSERT INTO client_citizenship
SELECT cl.client_id,
cl.date_of_birth,
cl.gender,
cit.citizenship_id,
cit.country_code,
cit.primary_citizenship
FROM temp_table_article.client AS cl,
temp_table_article.citizenship AS cit,
temp_table_article.client_citizenship_rel AS rel
WHERE cl.client_id = rel.client_id
AND cit.citizenship_id = rel.citizenship_id
AND cit.logical_delete_indicator = 0
AND cl.logical_delete_indicator = 0;
Performing a SELECT * on the temp table
returned the following records:
client_id, date_of_birth, gender, citizenship_id, country_code, primary_citizenship
2, 1944-01-15, F, 4, 20, 0
2, 1944-01-15, F, 7, 77, 1
As long as we logged in using the --xml flag, a
SELECT * query will return the following XML-formatted results:
<?xml version="1.0"?>
<resultset statement="SELECT * FROM client_citizenship"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<field name="client_id">2</field>
<field name="date_of_birth">1944-01-15</field>
<field name="gender">F</field>
<field name="citizenship_id">4</field>
<field name="country_code">20</field>
<field name="primary_citizenship">0</field>
</row>
<row>
<field name="client_id">2</field>
<field name="date_of_birth">1944-01-15</field>
<field name="gender">F</field>
<field name="citizenship_id">7</field>
<field name="country_code">77</field>
<field name="primary_citizenship">1</field>
</row>
</resultset>
One disadvantage of the --xml flag is that the
mapping from relational data to XML is fixed. Hence, there is no way to modify
the output. If youre willing to do some string manipulation, you can dispense
with the --xml flag entirely and produce your own XML code.
Continuing with the previous query, lets see
how we could alter the output XML so that it conforms to the following format:
<client>
<client_id>2</client_id>
<date_of_birth>1944-01-15</date_of_birth>
<gender>F</gender>
<citizenship_id>7</citizenship_id>
<country_code>77</country_code>
<primary_citizenship>1</primary_citizenship>
</client>
One way to do it would be to use the CONCAT() and GROUP_CONCAT() string functions:
mysql>SELECT CONCAT('\n<client>\n',
->GROUP_CONCAT('<client_id>', client_id, '</client_id>\n' SEPARATOR ''),
->GROUP_CONCAT('<date_of_birth>',date_of_birth,'</date_of_birth>\n' SEPARATOR ''),
->GROUP_CONCAT('<gender>',gender,'</gender>\n' SEPARATOR ''),
->GROUP_CONCAT('<citizenship_id>',citizenship_id,'</citizenship_id>\n' SEPARATOR ''),
->GROUP_CONCAT('<country_cd>',country_cd,'</country_cd>\n' SEPARATOR ''),
->GROUP_CONCAT('<primary_citizenship>',primary_citizenship,'</primary_citizenship>\n' SEPARATOR ''),
->'</client>') AS xmldoc
->FROM client_citizenship\G
Although we could get the job done that way,
its not necesarily easier than using a scripting or programming language to
perform the formatting, depending of course on your familiarity with
programming syntax.
Importing XML Data Using the Load_File() Function
MySQLs XML capabilities were expanded as of
version 5.1.5 to include two new functions:
ExtractValue(): Extracts
a value from an XML string using XPath notation.
UpdateXML(): Returns
a replaced XML fragment.
The most common way to store XML in MySQL is
to use the LOAD_FILE()
function to open an entire XML document, store it in a variable, and insert the
variable into a table column. Here is the client_citizenship table again, with
a simplified structure. This time, it contains only two fields: an
auto-incrementing id and and an xml_data column to store the entire xml
document. The TEXT data type is ideal for this purpose because it can
accomodate extremely long strings:
CREATE TEMPORARY TABLE client_citizenship (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
xml_data TEXT NOT NULL
);
SET @xml = LOAD_FILE("c:\\client_citizenships.xml");
INSERT INTO client_citizenship VALUES (NULL, @xml);
From there we can use the ExtractValue()
function to retrieve variables from the xml_data field. ExtractValue() takes
two arguments. The first is the XML fragment to be tested; the second is the
XPath expression to be matched. Xpath is a query language
that was specifically created for selecting nodes from an XML document. You can
learn more about XPath on the www.w3.org
site. Although we wont be needing it, the XPath argument
may contain the colon character (:)
in element selectors, in order to support namespaces. Retrieveing data is a
two step process. First, we get the XML from the client_citizenship
table and place into a variable:
SELECT xml_data FROM client_citizenship LIMIT 1 INTO @xml;
Now we use ExtractValue() with @xml
as the first argument and an XPath string for the second:
SELECT ExtractValue(@xml, '//row[2]/field[1]/@name'),
ExtractValue(@xml, '//row[2]/field[1]');
The XPath expression in the first call to
ExtractValue() retrieves the value for the name attribute of the first <field> element of the second
<row>. If you recall, the name attribute contains the column headers.
The second Xpath expression fetches the text node, which is of course the field
value:
...
</row>
<row> (row[2])
<field name="client_id">2</field> (field[1])
<field name="date_of_birth">1944-01-15</field>
...
Here is the output produced by the above
query:
+----------------------------------------------+----------------------------------------+
|ExtractValue(@xml, "//row[2]/field[1]/@name") |ExtractValue(@xml, '//row[2]/field[1]') |
+----------------------------------------------+----------------------------------------+
|client_id |2 |
+----------------------------------------------+----------------------------------------+
In the next installment, well look at a nicer
way to format our resultset so that the proper headers appear in the column
headings, rather than the supplied expressions. Well also expand the above
example to retrieve all the records from the XML document, using a stored
procedure. Finally, well learn more about the UpdateXML() function.
»
See All Articles by Columnist Rob Gravelle