Working with XML Data in MySQLNovember 4, 2009 XML was created to facilitate the transportation of data between systems and businesses in a language nonspecific way. When first introduced, a script or application process had to parse the XML data and convert it into valid format for the database and underlying system. Later, databases started to support XML so that no conversion was necessary. Today, well look at XML support in MySQL and learn how to import and export data in XML format. Exporting Data as XML Using the --xml OptionPrior to version 5.0, MySQL XML support was limited to exporting data in XML format using the MySQL command line client. Executing a command or query using the --xml or -X option told the MySQL client utility to produce the output as XML data. For instance, the following line would output all of the database variables that begin with the string version: C:\>mysql -u <userid> -p<password> -e "SHOW VARIABLES LIKE '%version%'" -xml The resulting XML contains the following standard format:
<?xml version="1.0"?>
<resultset statement="SHOW VARIABLES LIKE '%version%'"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<field name="Variable_name">protocol_version</field>
<field name="Value">10</field>
</row>
<row>
<field name="Variable_name">version</field>
<field name="Value">5.1.30-community</field>
</row>
<row>
<field name="Variable_name">version_comment</field>
<field name="Value">MySQL Community Server (GPL)</field>
</row>
<row>
<field name="Variable_name">version_compile_machine</field>
<field name="Value">ia32</field>
</row>
<row>
<field name="Variable_name">version_compile_os</field>
<field name="Value">Win32</field>
</row>
</resultset>
For comparison, here is the above output again as a tabular resultset: +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | protocol_version | 10 | | version | 5.1.22-beta-debug | | version_comment | Source distribution | | version_compile_machine | x86_64 | | version_compile_os | suse-linux-gnu | +-------------------------+---------------------+ The same syntax can be applied to any SELECT statement. In my last article, entitled Make Your MySQL Queries More Efficient with Temporary Tables, the initial query populated the client_citizenship temporary table with records whose logical_delete_indicator was zero: INSERT INTO client_citizenship SELECT cl.client_id, cl.date_of_birth, cl.gender, cit.citizenship_id, cit.country_code, cit.primary_citizenship FROM temp_table_article.client AS cl, temp_table_article.citizenship AS cit, temp_table_article.client_citizenship_rel AS rel WHERE cl.client_id = rel.client_id AND cit.citizenship_id = rel.citizenship_id AND cit.logical_delete_indicator = 0 AND cl.logical_delete_indicator = 0; Performing a SELECT * on the temp table returned the following records: client_id, date_of_birth, gender, citizenship_id, country_code, primary_citizenship 2, 1944-01-15, F, 4, 20, 0 2, 1944-01-15, F, 7, 77, 1 As long as we logged in using the --xml flag, a SELECT * query will return the following XML-formatted results:
<?xml version="1.0"?>
<resultset statement="SELECT * FROM client_citizenship"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<field name="client_id">2</field>
<field name="date_of_birth">1944-01-15</field>
<field name="gender">F</field>
<field name="citizenship_id">4</field>
<field name="country_code">20</field>
<field name="primary_citizenship">0</field>
</row>
<row>
<field name="client_id">2</field>
<field name="date_of_birth">1944-01-15</field>
<field name="gender">F</field>
<field name="citizenship_id">7</field>
<field name="country_code">77</field>
<field name="primary_citizenship">1</field>
</row>
</resultset>
One disadvantage of the --xml flag is that the mapping from relational data to XML is fixed. Hence, there is no way to modify the output. If youre willing to do some string manipulation, you can dispense with the --xml flag entirely and produce your own XML code. Continuing with the previous query, lets see how we could alter the output XML so that it conforms to the following format:
<client>
<client_id>2</client_id>
<date_of_birth>1944-01-15</date_of_birth>
<gender>F</gender>
<citizenship_id>7</citizenship_id>
<country_code>77</country_code>
<primary_citizenship>1</primary_citizenship>
</client>
One way to do it would be to use the
mysql>SELECT CONCAT('\n<client>\n',
->GROUP_CONCAT('<client_id>', client_id, '</client_id>\n' SEPARATOR ''),
->GROUP_CONCAT('<date_of_birth>',date_of_birth,'</date_of_birth>\n' SEPARATOR ''),
->GROUP_CONCAT('<gender>',gender,'</gender>\n' SEPARATOR ''),
->GROUP_CONCAT('<citizenship_id>',citizenship_id,'</citizenship_id>\n' SEPARATOR ''),
->GROUP_CONCAT('<country_cd>',country_cd,'</country_cd>\n' SEPARATOR ''),
->GROUP_CONCAT('<primary_citizenship>',primary_citizenship,'</primary_citizenship>\n' SEPARATOR ''),
->'</client>') AS xmldoc
->FROM client_citizenship\G
Although we could get the job done that way, its not necesarily easier than using a scripting or programming language to perform the formatting, depending of course on your familiarity with programming syntax. Importing XML Data Using the Load_File() FunctionMySQLs XML capabilities were expanded as of version 5.1.5 to include two new functions: ExtractValue(): Extracts a value from an XML string using XPath notation. UpdateXML(): Returns a replaced XML fragment. The most common way to store XML in MySQL is
to use the
CREATE TEMPORARY TABLE client_citizenship (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
xml_data TEXT NOT NULL
);
SET @xml = LOAD_FILE("c:\\client_citizenships.xml");
INSERT INTO client_citizenship VALUES (NULL, @xml);
From there we can use the ExtractValue() function to retrieve variables from the xml_data field. ExtractValue() takes two arguments. The first is the XML fragment to be tested; the second is the XPath expression to be matched. Xpath is a query language that was specifically created for selecting nodes from an XML document. You can learn more about XPath on the www.w3.org site. Although we wont be needing it, the XPath argument may contain the colon character (:) in element selectors, in order to support namespaces. Retrieveing data is a two step process. First, we get the XML from the client_citizenship table and place into a variable: SELECT xml_data FROM client_citizenship LIMIT 1 INTO @xml; Now we use SELECT ExtractValue(@xml, '//row[2]/field[1]/@name'), ExtractValue(@xml, '//row[2]/field[1]'); The XPath expression in the first call to
ExtractValue() retrieves the value for the name attribute of the first
...
</row>
<row> (row[2])
<field name="client_id">2</field> (field[1])
<field name="date_of_birth">1944-01-15</field>
...
Here is the output produced by the above query: +----------------------------------------------+----------------------------------------+ |ExtractValue(@xml, "//row[2]/field[1]/@name") |ExtractValue(@xml, '//row[2]/field[1]') | +----------------------------------------------+----------------------------------------+ |client_id |2 | +----------------------------------------------+----------------------------------------+ In the next installment, well look at a nicer way to format our resultset so that the proper headers appear in the column headings, rather than the supplied expressions. Well also expand the above example to retrieve all the records from the XML document, using a stored procedure. Finally, well learn more about the UpdateXML() function. |