Understanding EXPLICIT XML Formatting
 
In the article titled Understanding and Using SELECT ... FOR XML, we saw the various types of XML formatting provided by SQL Server. In all the examples that were shown, we saw that the XML either contained elements or attributes. But what if we want a combination of both?? That's where EXPLICIT steps in. The EXPLICIT option gives you complete control over how the XML document should look like (thus making it more complex to code for). When EXPLICIT mode is used, the query can be excessively big as more complex joins are made, but conceptually, it is quite easy to use, provided you work backwards.

One of the core concepts in the EXPLICIT mode is something called the "universal table". This table, which is created in memory by SQL Server contains one column for each table column that you want to appear in the XML. It also has 2 special columns called "tag" and "parent", which are used to determine the hierarchy of the XML. The columns that correspond to the various table columns have a special notation as shown below. This notation provides information to the SQL XML parser regarding the output that is required.


[Element Tag!Tag!Attribute!Directive]

This article explores the various usage of XML EXPLICIT and the meaning of the above syntax. Before we get into the nuances of this command, let's take an example:

Assume that you have the following SELECT ... FOR XML clause and its associated output (results formatted):


SELECT TOP 2 title_id, title, type
	FROM titles FOR XML AUTO, ELEMENTS

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
-----------------------------------------------------------------------------------
<titles>
	<title_id>BU1032</title_id>
	<title>The Busy Executive&apos;s Database Guide</title>
	<type>business    </type>
</titles>
<titles>
	<title_id>BU1111</title_id>
	<title>Cooking with Computers: Surreptitious Balance Sheets</title>
	<type>business    </type>
</titles>

In the above output, we see that each column that we requested for in the SELECT clause has been rendered as an element. Now, let's assume that we want to have the "title_id" as an attribute and the "title" and "type" as elements. How do we do it?? Here is the EXPLICIT query for the same:


SELECT TOP 2
	1 AS Tag, NULL AS Parent,
	title_id AS [titles!1!title_id],
	title AS [titles!1!title!element],
	type AS [titles!1!type!element]
FROM
	titles
FOR XML EXPLICIT

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
-----------------------------------------------------------------------------------
<titles title_id="BU1032">
	<title>The Busy Executive&apos;s Database Guide</title>
	<type>business    </type>
</titles>
<titles title_id="BU1111">
	<title>Cooking with Computers: Surreptitious Balance Sheets</title>
	<type>business    </type>
</titles>

Ok, we got the output that we want, now, let's dissect the query.

  • First, since we working with only table, we will require only one level of nodes. Therefore, we specify "tag" as 1 and NULL for "parent", since no replationship is being portrayed.
  • Identify all the columns that we will need to be in the XML (for our example, we need title_id, title and type).
  • Specify that we want "title_id" as an attribute, using the syntax: [titles!1!title_id]. Comparing this syntax with the generic one provided above, we see that the element tag is called "titles", the tag is "1" and the attribute name is "title_id". The directive portion is optional.
  • Specify that we want "title" as an element, using the syntax: [titles!1!title!element]. The only difference between this and (3) is that we have now specified a directive as "element". This means that we want the "title" field as an element inside the "titles" tag.
  • Specify that we want "type" as an element, using the syntax: [titles!1!type!element]. Refer to (4) for details on the syntax.
  • Finally specify FOR XML EXPLICIT to generate the XML that we want.

In the output above, notice that the XML is encoded (there is an &apos;) in the output. If we do not want this encoding to happen, we can specify a different directive in the SELECT statement as shown:


SELECT TOP 2
	1 AS Tag, NULL AS Parent,
	title_id AS [titles!1!title_id],
	title AS [titles!1!title!xml],
	type AS [titles!1!type!element]
FROM
	titles
FOR XML EXPLICIT

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
-----------------------------------------------------------------------------------
<titles title_id="BU1032">
	<title>The Busy Executive's Database Guide</title>
	<type>business    </type>
</titles>
<titles title_id="BU1111">
	<title>Cooking with Computers: Surreptitious Balance Sheets</title>
	<type>business    </type>
</titles>

Now, you can see that the encoding does not happen. Notice the usage of the xml directive.

EXPLICIT gets interesting when we want to simulate relationships. For example, assume that we want to display all the titles of an author in an XML that has the au_id, au_fname, au_lname as attributes of the "authors" element and each title enclosed in a "titles" element with the title_id and "title" as attributes. Here is the query that does this and the associated output (results abridged and formatted):


SELECT
	1 AS Tag, NULL AS Parent,
	a.au_id AS [authors!1!au_id],
	a.au_fname AS [authors!1!au_fname],
	a.au_lname AS [authors!1!au_fname],
	NULL AS [titles!2!title_id],
	NULL AS [titles!2!title]
FROM
	authors a
UNION
SELECT
	2 AS Tag, 1 AS Parent,
	authors.au_id,
	NULL,
	NULL,
	t.title_id,
	t.title
FROM
	titles t
		INNER JOIN titleauthor ON t.title_id = titleauthor.title_id
		INNER JOIN authors ON titleauthor.au_id = authors.au_id
ORDER BY [authors!1!au_id], [titles!2!title_id]
FOR XML EXPLICIT

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
----------------------------------------------------------------------------------------
<authors au_id="172-32-1176" au_fname="Johnson" au_fname="White">
	<titles title_id="PS3333" title="Prolonged Data Deprivation: Four Case Studies"/>
</authors>
<authors au_id="213-46-8915" au_fname="Marjorie" au_fname="Green">
	<titles title_id="BU1032" title="The Busy Executive&apos;s Database Guide"/>
	<titles title_id="BU2075" title="You Can Combat Computer Stress!"/>
</authors>

This query does lots of interesting things.

  • Notice how we establish the relationship between authors and titles using the "tag" and "parent" columns.
  • The first portion of the SELECT decides the various columns that will be needed along with their layout. Data is fetched only from the master table. Notice the NULL for the child tables. The NULL indicates place holder columns which will be filled by the second query.
  • The second SELECT gets the data for the child table by making appropriate joins. Notice how we specify the joining column in the SELECT (au_id) and leave the other columns as NULL.
  • Notice the ORDER BY clause that orders the result by the au_id and then by title_id within each author.
  • Finally we specify the XML EXPLICIT option.

In the example above, suppose that we want to sort by title_id, but not include it in the XML, we use a different directive as shown below:


SELECT
	1 AS Tag, NULL AS Parent,
	a.au_id AS [authors!1!au_id],
	a.au_fname AS [authors!1!au_fname],
	a.au_lname AS [authors!1!au_fname],
	NULL AS [titles!2!title_id!hide],
	NULL AS [titles!2!title]
FROM
	authors a
UNION
SELECT
	2 AS Tag, 1 AS Parent,
	authors.au_id,
	NULL,
	NULL,
	t.title_id,
	t.title
FROM
	titles t
		INNER JOIN titleauthor ON t.title_id = titleauthor.title_id
		INNER JOIN authors ON titleauthor.au_id = authors.au_id
ORDER BY [authors!1!au_id], [titles!2!title_id!hide]
FOR XML EXPLICIT

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
----------------------------------------------------------------------------------------
<authors au_id="172-32-1176" au_fname="Johnson" au_fname="White">
	<titles title="Prolonged Data Deprivation: Four Case Studies"/>
</authors>
<authors au_id="213-46-8915" au_fname="Marjorie" au_fname="Green">
	<titles title="The Busy Executive&apos;s Database Guide"/>
	<titles title="You Can Combat Computer Stress!"/>
</authors>

Notice that we have used a directive called "hide". This directive is used to hide elements from the output, but include them for other purposes (like sorting).

Another set of directives that are of interest is id, idref and idrefs. These directives can be used to represent relational data in an XML document. This approach works only when there is a schema definition for the data. Here is a simple query:


SELECT TOP 2
	1 AS Tag, NULL AS Parent,
	a.au_id AS [authors!1!au_id!id],
	a.au_fname AS [authors!1!au_fname],
	NULL AS [titles!2!title_id],
	NULL AS [titles!2!au_id!idref]
FROM
	authors a
UNION
SELECT
	2 AS Tag, NULL AS Parent,
	NULL,
	NULL,
	t.title_id,
	t.au_id
FROM
	titleauthor t
FOR XML EXPLICIT, XMLDATA

The usage of the XMLDATA option generates a schema for the data. Note the usage of the id and idref attributes.

Well, there are a couple of directives that I have not covered like: cdata, xmltext. These are quite simple to use and understand and I leave it to the readers to explore them in more detail. You can refer to the SQL Server BOL for more details.

The XML EXPLICIT option is probably the slowest of all XML options, since it has to work thorugh a lot of options. So use it with care :-)

Related article:

Understanding and Using SELECT ... FOR XML

Home