How do I remove redundant namespace in nested query when using FOR XML PATH

UPDATE: I've discovered there is a Microsoft Connect item raised for this issue here

When using FOR XML PATH and WITH XMLNAMESPACES to declare a default namespace, I will get the namespace declaration duplicated in any top level nodes for nested queries that use FOR XML, I've stumbled across a few solutions on-line, but I'm not totally convinced...

Here's an Complete Example

/*
drop table t1
drop table t2
*/
create table t1 ( c1 int, c2 varchar(50))
create table t2 ( c1 int, c2 int, c3 varchar(50))
insert t1 values 
(1, 'Mouse'),
(2, 'Chicken'),
(3, 'Snake');
insert t2 values
(1, 1, 'Front Right'),
(2, 1, 'Front Left'),
(3, 1, 'Back Right'),
(4, 1, 'Back Left'),
(5, 2, 'Right'),
(6, 2, 'Left')



;with XmlNamespaces( default 'uri:animal')
select 
    a.c2 as "@species"
    , (select l.c3 as "text()" 
       from t2 l where l.c2 = a.c1 
       for xml path('leg'), type) as "legs"
from t1 a
for xml path('animal'), root('zoo')

What's the best solution?


Solution 1:

After hours of desperation and hundreds of trials & errors, I've come up with the solution below.

I had the same issue, when I wanted just one xmlns attribute, on the root node only. But I also had a very difficult query with lot's of subqueries and FOR XML EXPLICIT method alone was just too cumbersome. So yes, I wanted the convenience of FOR XML PATH in the subqueries and also to set my own xmlns.

I kindly borrowed the code of 8kb's answer, because it was so nice. I tweaked it a bit for better understanding. Here is the code:

DECLARE @Order TABLE (OrderID INT, OrderDate DATETIME)    
DECLARE @OrderDetail TABLE (OrderID INT, ItemID VARCHAR(1), Name VARCHAR(50), Qty INT)    
INSERT @Order VALUES (1, '2010-01-01'), (2, '2010-01-02')    
INSERT @OrderDetail VALUES (1, 'A', 'Drink',  5),
                           (1, 'B', 'Cup',    2),
                           (2, 'A', 'Drink',  2),
                           (2, 'C', 'Straw',  1),
                           (2, 'D', 'Napkin', 1)

-- Your ordinary FOR XML PATH query
DECLARE @xml XML = (SELECT OrderID AS "@OrderID",
                        (SELECT ItemID AS "@ItemID", 
                                Name AS "data()" 
                         FROM @OrderDetail 
                         WHERE OrderID = o.OrderID 
                         FOR XML PATH ('Item'), TYPE)
                    FROM @Order o 
                    FOR XML PATH ('Order'), ROOT('dummyTag'), TYPE)

-- Magic happens here!       
SELECT 1 AS Tag
      ,NULL AS Parent
      ,@xml AS [xml!1!!xmltext]
      ,'http://test.com/order' AS [xml!1!xmlns]
FOR XML EXPLICIT

Result:

<xml xmlns="http://test.com/order">
  <Order OrderID="1">
    <Item ItemID="A">Drink</Item>
    <Item ItemID="B">Cup</Item>
  </Order>
  <Order OrderID="2">
    <Item ItemID="A">Drink</Item>
    <Item ItemID="C">Straw</Item>
    <Item ItemID="D">Napkin</Item>
  </Order>
</xml>

If you selected @xml alone, you would see that it contains root node dummyTag. We don't need it, so we remove it by using directive xmltext in FOR XML EXPLICIT query:

,@xml AS [xml!1!!xmltext]

Although the explanation in MSDN sounds more sophisticated, but practically it tells the parser to select the contents of XML root node.

Not sure how fast the query is, yet currently I am relaxing and drinking Scotch like a gent while peacefully looking at the code...

Solution 2:

If I have understood correctly, you are referring to the behavior that you might see in a query like this:

DECLARE @Order TABLE (
  OrderID INT, 
  OrderDate DATETIME)

DECLARE @OrderDetail TABLE (
  OrderID INT, 
  ItemID VARCHAR(1), 
  ItemName VARCHAR(50), 
  Qty INT)

INSERT @Order 
VALUES 
(1, '2010-01-01'),
(2, '2010-01-02')

INSERT @OrderDetail 
VALUES 
(1, 'A', 'Drink',  5),
(1, 'B', 'Cup',    2),
(2, 'A', 'Drink',  2),
(2, 'C', 'Straw',  1),
(2, 'D', 'Napkin', 1)

;WITH XMLNAMESPACES('http://test.com/order' AS od) 
SELECT
  OrderID AS "@OrderID",
  (SELECT 
     ItemID AS "@od:ItemID", 
     ItemName AS "data()" 
   FROM @OrderDetail 
   WHERE OrderID = o.OrderID 
   FOR XML PATH ('od.Item'), TYPE)
FROM @Order o 
FOR XML PATH ('od.Order'), TYPE, ROOT('xml')

Which gives the following results:

<xml xmlns:od="http://test.com/order">
  <od.Order OrderID="1">
    <od.Item xmlns:od="http://test.com/order" od:ItemID="A">Drink</od.Item>
    <od.Item xmlns:od="http://test.com/order" od:ItemID="B">Cup</od.Item>
  </od.Order>
  <od.Order OrderID="2">
    <od.Item xmlns:od="http://test.com/order" od:ItemID="A">Drink</od.Item>
    <od.Item xmlns:od="http://test.com/order" od:ItemID="C">Straw</od.Item>
    <od.Item xmlns:od="http://test.com/order" od:ItemID="D">Napkin</od.Item>
  </od.Order>
</xml>

As you said, the namespace is repeated in the results of the subqueries.

This behavior is a feature according to a conversation on devnetnewsgroup (website now defunct) although there is the option to vote on changing it.

My proposed solution is to revert back to FOR XML EXPLICIT:

SELECT
  1 AS Tag,
  NULL AS Parent,
  'http://test.com/order' AS [xml!1!xmlns:od],
  NULL AS [od:Order!2],
  NULL AS [od:Order!2!OrderID],
  NULL AS [od:Item!3],
  NULL AS [od:Item!3!ItemID]
UNION ALL
SELECT 
  2 AS Tag,
  1 AS Parent,
  'http://test.com/order' AS [xml!1!xmlns:od],
  NULL AS [od:Order!2],
  OrderID AS [od:Order!2!OrderID],
  NULL AS [od:Item!3],
  NULL [od:Item!3!ItemID]
FROM @Order 
UNION ALL
SELECT
  3 AS Tag,
  2 AS Parent,
  'http://test.com/order' AS [xml!1!xmlns:od],
  NULL AS [od:Order!2],
  o.OrderID AS [od:Order!2!OrderID],
  d.ItemName AS [od:Item!3],
  d.ItemID AS [od:Item!3!ItemID]
FROM @Order o INNER JOIN @OrderDetail d ON o.OrderID = d.OrderID
ORDER BY [od:Order!2!OrderID], [od:Item!3!ItemID]
FOR XML EXPLICIT

And see these results:

<xml xmlns:od="http://test.com/order">
  <od:Order OrderID="1">
    <od:Item ItemID="A">Drink</od:Item>
    <od:Item ItemID="B">Cup</od:Item>
  </od:Order>
  <od:Order OrderID="2">
    <od:Item ItemID="A">Drink</od:Item>
    <od:Item ItemID="C">Straw</od:Item>
    <od:Item ItemID="D">Napkin</od:Item>
  </od:Order>
</xml>

Solution 3:

An alternative solution I've seen is to add the XMLNAMESPACES declaration after building the xml into a temporary variable:

declare @xml as xml;
select @xml = (
select 
    a.c2 as "@species"
    , (select l.c3 as "text()" 
       from t2 l where l.c2 = a.c1 
       for xml path('leg'), type) as "legs"
from t1 a
for xml path('animal'))

;with XmlNamespaces( 'uri:animal' as an)
select @xml for xml path('') , root('zoo');