XML is, it seems, everywhere. SQL Server has ever-improving functionality that helps us peek into, shred, store, manipulate and otherwise utilize XML. This article covers XML variables, XML columns, typed vs. untyped XML, and the xml data type methods.
Introduction
XML is, it seems, everywhere.
Businesses use it to coordinate, collaborate, and exchange data with other
businesses. SQL Server has ever-improving functionality that helps us peek
into, shred, store, manipulate and otherwise utilize XML. This article covers
XML variables, XML columns, typed vs. untyped XML, and the xml data type
methods.
xml Data Type
The much anticipated (at least
by me) xml data type was introduced into SQL Server 2005. The new data type
allows variables and columns to be created to specifically hold xml data,
whether it be an entire XML document or just some XML content.
XML Variables
Prior to SQL Server 2005, your
options for storing XML data were limited to text or varchar columns or
variables. Let's look at a couple now and then scenarios.
Scenario:
You need to create a stored
procedure to process incoming XML from a partner.
Then (SQL 2000):
You would have created a
procedure with a text or varchar parameter and loaded the value of that
parameter into memory with sp_xml_preparedocument. You would then use SELECT
statements with OPENXML to get data out utilizing XPath 1.0 expressions.
Finally, you execute sp_xml_removedocument to get the XML out of memory.
Now (SQL 2005, 2008)
You now simply create a
procedure with an xml parameter. Use new xml data type query() and nodes()
methods to get the data out. (We'll dive into these shortly.)
Scenario:
You need to store off XML data
you've received from a third party so you can join it with your system's data
in a report.
Then (SQL 2000)
Prior to the xml data type,
large XML could be stored in a text column, and you could read it into a temp
table that had a text column in it. However, you could not then use OPENXML
with that text column of the temporary table to examine the XML.
Now (SQL 2005, 2008)
Today if you want to store the
large XML, you would store it in an XML column and simply read it back into an
xml data type variable for processing. You could then proceed to examine and
manipulate it utilizing the xml data type methods discussed below.
XML Columns
SQL Server tables can now
contain xml data type columns. These columns can contain XML content and XML
documents. You must specify this at create time. The default is to create the
column to hold XML content. The difference being that XML content can have multiple
top-level elements and the XML document has to be an XML 1.0 document. In both
cases, the XML will be checked to determine if it is well-formed.
Typed vs Untyped xml data types
There are two types of XML
data, untyped and typed. For untyped XML data, no schema is defined and no
validation is required. Conversely, schemas are defined and validation
information is a concern for typed XML data. Additionally, the schema must be
stored in SQL Server prior to adding the typed XML column to a table. (See MSDN for
information on creating schema collections.) Below is first an example of
creating an untyped XML column and an example of creating a typed XML column
follows.
CREATE TABLE myTable(
IDCol BIGINT IDENTITY,
MyXMLCol XML NULL)
CREATE TABLE myTable(
IDCol BIGINT IDENTITY,
MyXMLCol XML NULL (myDBSchema.mySchemaCollection))
xml data type methods
The xml data type comes with
four methods that assist with querying and manipulating the XML data stored in
the column or variable. Those methods are query(), value(), nodes(), and
modify().
query()
The query() method of the xml
data type uses the XQuery Language to search inside xml data types. You can
find out more about XQuery support in SQL Server 2008 at MSDN.
The following example shows
creating some sample XML in an XML variable and then querying into that data.
The first query gets the first exercise element. The second query gets the data
for the Reps attribute.
DECLARE @myDoc xml
SET @myDoc = '
< Location="Chest">
Reps = "15">Exercises in order.
Bench Press
Flies
'
-- query the first exercise in the xml
SELECT @myDoc.query('/Workout/BodyPart/Exercise[1]')
--query the attribute Reps
SELECT @myDoc.query('data(/Workout/BodyPart[1]/@Reps)')
First Result:
Bench Press
Second Result:
15
Exist()
If you needed to determine if
the XML data contains a certain node, you could use the exist() method of the
xml data type. In the example below, the exist() method is used in the where
clause to allow only those nodes containing an Exercise element to be queried.
DECLARE @myDoc xml
SET @myDoc = '
< Location="Chest">
Reps = "15">Exercises in order.
Bench Press
Flies
< Location="Triceps">
Reps = "12">Exercises in order.
'
-- query the first exercise in the xml
SELECT @myDoc.query('/Workout/BodyPart/*')
WHERE @myDoc.exist('/Workout/BodyPart/Exercise') = 1
Result:
Bench Press
Flies
Value()
The method value() uses XQuery
to examine the XML and return back a scalar value of the specified SQL type.
The SQL type cannot be image, text, ntext, xml, a CLR type, or sql_variant .
This example uses the value() method to return the value inside the first
exercise node. The result is returned as a VARCHAR(20).
DECLARE @myDoc xml
SET @myDoc = '
< Location="Chest">
Reps = "15">Exercises in order.
Bench Press
Flies
< Location="Triceps">
Reps = "12">Exercises in order.
'
-- query the first exercise in the xml
SELECT @myDoc.value('(/Workout/BodyPart/Exercise)[1]', 'VARCHAR(20)')
Result:
Bench Press
nodes()
nodes() is used to shred the
XML nodes of the XML data into relational columns. This is similar to what
OPENXML was used for in SQL Server 2000. Using the same sample XML, this query
uses nodes() to first create a rowset from the XQuery provided. In this case,
the rowset was for the first exercise node and the select merely queried for
the entire rowset of the exercise element.
DECLARE @myDoc xml
SET @myDoc = '
< Location="Chest">
Reps = "15">Exercises in order.
Bench Press
Flies
< Location="Triceps">
Reps = "12">Exercises in order.
'
-- query the first exercise in the xml
SELECT T.exercise.query('.')
FROM @myDoc.nodes('(/Workout/BodyPart/Exercise)[1]') as T(exercise)
Result:
Bench Press
modify()
To change or remove a portion
of the content in an XML variable or column, the modify() method can be used.
XML Data Manipulation Language (DML) is used to specify the action within the
modify() method. With XML DML you can insert nodes, change nodes, and delete
nodes. More information on XML DML can be found at MSDN.
Here is an example of modifying the sample XML to include an additional
exercise prior to the one specified in the XQuery statement.
DECLARE @myDoc xml
SET @myDoc = '
Exercises in order.
Flies
'
-- update text in the first exercise
SET @myDoc.modify('
insert Decline Press
before (/Workout/BodyPart/Exercise)[1]')
SELECT @myDoc
Result:
Exercises in order.
Decline Press
Flies
Conclusion
Over the past decade, Microsoft
has been evolving the capabilities of SQL Server with regards to XML
processing. With this latest version, the syntax used to parse through and
manipulate XML has been greatly simplified. SQL Server now has much more
powerful and sophisticated XML tools at your finger tips.
For More Information
Working with the XML Data Type of SQL Server
MSDN XML (Transact-SQL)
MSDN xml Data Type Methods
MSDN XML DML
»
See All Articles by Columnist
Deanna Dicken