Understanding and Using SELECT ... FOR XML
 
The SELECT statement is the most fundamental data access query in SQL Server. It allows you to retrieve and aggregate information from tables in a database and return the results as a rowset. In SQL Server 2000, the FOR XML clause was added as an extension to SELECT, which transforms the rowset output to an XML stream. This is a very powerful extension to the SELECT command. The XML stream thus formed can have the following formants: RAW, AUTO or EXPLICIT. The new syntax of the SELECT statement is as follows:

SELECT
	column list
FROM
	table list
WHERE
	filter criteria
FOR XML RAW | AUTO | EXPLICIT [, XMLDATA] [, ELEMENTS] [, BINARY BASE64]

In this article we will see examples of these variations except EXPLICIT. I would like to take this option as a seperate article because of its complexity and options.

The XML RAW is the most simplest of the forms and probably the most fastest of all the options. This format provides the output of each row within a generic node called "row". All the columns of data are transformed as attributes. Here is a sample command and the results (the result is abridged and formatted).


SELECT title_id, title FROM titles FOR XML RAW

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
---------------------------------------------------------------------------------------------------
<row title_id="PC1035" title="But Is It User Friendly?"/>
<row title_id="PS1372" title="Computer Phobic AND Non-Phobic Individuals: Behavior Variations"/>
<row title_id="BU1111" title="Cooking with Computers: Surreptitious Balance Sheets"/>
<row title_id="PS7777" title="Emotional Security: A New Algorithm"/>
<row title_id="TC4203" title="Fifty Years in Buckingham Palace Kitchens"/>
<row title_id="PS2091" title="Is Anger the Enemy?"/>

One of the most common questions that I see people asking in the newsgroups is whether the strange number that appears as the column header in the output can be modified. The answer is no. SQL Server generates the XML output as a stream and the name provided is random name. You will never have to worry about this name.

The XML AUTO method is the next possible form and it provides more control over the XML fragment than the RAW mode. Before we get into the details, let's run this query to see the difference.


SELECT title_id, title FROM titles FOR XML AUTO

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
-----------------------------------------------------------------------------------------------------
<titles title_id="PC1035" title="But Is It User Friendly?"/>
<titles title_id="PS1372" title="Computer Phobic AND Non-Phobic Individuals: Behavior Variations"/>
<titles title_id="BU1111" title="Cooking with Computers: Surreptitious Balance Sheets"/>
<titles title_id="PS7777" title="Emotional Security: A New Algorithm"/>
<titles title_id="TC4203" title="Fifty Years in Buckingham Palace Kitchens"/>
<titles title_id="PS2091" title="Is Anger the Enemy?"/>

Notice that each row is named after the table from where the row came. Note that all columns are still rendered as attributes. The main difference between RAW and AUTO lies in the fact in how they handle joins. Let's see an example of this difference (results formatted):


-- Query using joins and XML RAW
SELECT
	TOP 1 authors.au_id, titles.title_id
FROM
	authors
		INNER JOIN titleauthor ON authors.au_id = titleauthor.au_id
		INNER JOIN titles ON titles.title_id = titleauthor.title_id
FOR XML RAW

-- Query using joins and XML AUTO
SELECT
	TOP 1 authors.au_id, titles.title_id
FROM
	authors
		INNER JOIN titleauthor ON authors.au_id = titleauthor.au_id
		INNER JOIN titles ON titles.title_id = titleauthor.title_id
FOR XML AUTO

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
-----------------------------------------------------------
<row au_id="409-56-7008" title_id="BU1032"/>

XML_F52E2B61-18A1-11d1-B105-00805F49916B
-----------------------------------------------------------
<authors au_id="409-56-7008">
	<titles title_id="BU1032"/>
</authors>

The difference is immediately apparent. The RAW format does not show anything in the output that indicates that a join has happened. All the columns are rendered as attributes. In the AUTO format however, notice how the output is nested within the "authors" element, which provides a clean understanding of the relatioship between the tables involved in the join.

In all the outputs thus far, we have seen the columns of the table being rendered as attributes. The ELEMENTS option renders each column as an element as shown in the following example (results formatted):


SELECT
	TOP 1 authors.au_id, titles.title_id
FROM
	authors
		INNER JOIN titleauthor ON authors.au_id = titleauthor.au_id
		INNER JOIN titles ON titles.title_id = titleauthor.title_id
FOR XML AUTO, ELEMENTS

Output:
XML_F52E2B61-18A1-11d1-B105-00805F49916B
-------------------------------------------------------
<authors>
	<au_id>409-56-7008</au_id>
	<titles>
		<title_id>BU1032</title_id>
	</titles>
</authors>

Well, this was a lengthy article :-) but it contained some of the core concepts about FOR XML and how the output varies depending on the options used.

Home