Obtain attribute names from XML using VBA
I need to get the distinct attributes names from the XML using VBA.
this is my code.
sub test()
Dim XMLFile As Object
Dim XMLFileName As String
Set XMLFile = CreateObject("Microsoft.XMLDOM")
XMLFileName = "C:\Users\Input.xml"
XMLFile.async = False
XMLFile.Load (XMLFileName)
XMLFile.validateOnParse = False
Dim mainnode As Object
Dim node As Object
Set mainnode = XMLFile.SelectNodes("//Elements")
For Each node In mainnode
For Each child In node.ChildNodes
Debug.Print child.BaseName
Dim kiddo As Object
For Each kiddo In child.ChildNodes
Debug.Print kiddo.BaseName
Next kiddo
Next child
Next node
End sub
Here is the sample XML. I need to get the attribute name num
from the XML.
<Elements>
<Details>
<Name>ABC</Name>
<Address>123ABC</Address>
<College>
<collname>safasf</collname>
<collnumber/>
</College>
</Details>
<Dept num="123">
<Deptname>IT</Deptname>
<ID>A123</ID>
</Dept>
</Elements>
Expected Result:
Elements
Details
Name
Address
College
collname
collnumber
Dept
num
Deptname
ID
Actual Result from above code:
Elements
Details
Name
Address
College
collname
Dept
Deptname
ID
The "num" attribute and <collnumber/>
tag is not fetched by my code. Could someone let me know how to fetch the attribute names along with the tag names from XML using VBA
Displaying XML structures including attributes via recursive function calls
My example code demonstrates a way to
-
[1]
assign the entire XML structure to a 2-dim array using XMLDOM methods and -
[2]
optionally write it back to a sheet.
Amplifying hints:
I added these ► structured hints to offer more help than by displaying code only, as I remarked that many of these points lead to repeated questions by other users, too:
- Trying to list
XML
structures you lose good view with increasing hierarchy depth of your node elements (type constant 1NODE_ELEMENT
), so I urgently recommend the use of ► recursive calls as used in this example code. - Furthermore you might have not considered the special construction of node text (type constant 3
NODE_TEXT
) being the first child of a name giving parent element - c.f. sections A. and B. in main functionlistChildNodes
. Your loops through child nodes would not distinguish between the mentioned types. Just study the comments in the cited function for details. - I suppose your XML file starts with a needed processing instruction like e.g.
<?xml version="1.0" encoding="utf-8"?>
, so that it can be actually identified as XML file. - The calling procedure
DisplayXML()
uses late binding instead of early bound reference to MS XML similar to your post, but uses the recommended MSXML2 version 6.0. It calls the main function via itsDocumentElement
<Elements>
(BTW a single node element) and a second argument referring to a predefined 2-dim arrayv
. -
Versioning: If you would set your
XMLFILE
object to memory withSet XDoc = CreateObject("MSXML2.DOMDocument")
generally you are getting the older default Version (3.0), so in most cases it's preferrable to use explicitlySet XDoc = CreateObject("MSXML2.DOMDocument.6.0")
instead (including XPath automatically). - If you don't use the
Load
function to get aTrue
(file loaded successfully) orFalse
(load error) back, it is not necessary to set the file name into brackets (). - The XPath operator
//
in search strings would return any occurences at any level (c.f.XMLFile.SelectNodes("//Elements")
in your OP). - Consider also the use of XSLT, a special-purpose language designed to tranform XML files into all kind of end-use formats.
Calling procedure DisplayXML
Hint: It would sufficient to dimension the array 's row count only with an estimated number of items in the calling procedure (e.g. 1000), as the main function executes a ReDim
(including a double transposition) automatically if needed. Nevertheless I added the exact items count here from the start via XPath/XMLDOM expression XMLFile.SelectNodes("//*").Length
counting any item in the entire file.
Option Explicit ' declaration head of your code module
Sub DisplayXML()
Dim XMLFile As Object
Dim XMLFileName As String
'Set XMLFile = CreateObject("Microsoft.XMLDOM") ' former style not recommended
Set XMLFile = CreateObject("MSXML2.DOMDocument.6.0")
XMLFileName = "C:\Users\Input.xml" ' << change to your xml file name
XMLFile.Async = False
XMLFile.ValidateOnParse = False
Debug.Print XMLFile.XML
If XMLFile.Load(XMLFileName) Then
' [1] write xml info to array with exact or assumed items count
Dim v As Variant: ReDim v(1 To XMLFile.SelectNodes("//*").Length, 1 To 2)
listChildNodes XMLFile.DocumentElement, v ' call helper function
' [2] write results to sheet "Dump" ' change to your sheet name
With ThisWorkbook.Worksheets("Dump")
.Range("A:B") = "" ' clear result range
.Range("A1:B1") = Split("XML Tag,Node Value", ",") ' titles
.Range("A2").Resize(UBound(v), UBound(v, 2)) = v ' get 2-dim info array
End With
Else
MsgBox "Load Error " & XMLFileName
End If
Set XMLFile = Nothing
End Sub
Structured results displayed in sheet
Hint: If you don't want the level indentation or enumerated Level hierarchy, you can easily adapt the main function listChildNodes()
below.
+----+---------------------+-----------------+
| | A | B |
+----+---------------------+-----------------+
|1 | XML Tag | Node Value |
+----+---------------------+-----------------+
|2 | 0 Elements | |
+----+---------------------+-----------------+
|3 | 1 Details | |
+----+---------------------+-----------------+
|4 | 2 Name | ABC |
+----+---------------------+-----------------+
|5 | 2 Address | 123ABC |
+----+---------------------+-----------------+
|6 | 2 College | |
+----+---------------------+-----------------+
|7 | 3 collname | safasf |
+----+---------------------+-----------------+
|8 | 3 collnumber | |
+----+---------------------+-----------------+
|9 | 1 Dept[@num="123"]| |
+----+---------------------+-----------------+
|10 | 2 Deptname | IT |
+----+---------------------+-----------------+
|11 | 2 ID | A123 |
+----+---------------------+-----------------+
It is also possible to refer to a precise node element, e.g. via
listChildNodes XMLFile.DocumentElement.SelectSingleNode("Dept[@num=""123""]"),v, 1, 1 ' starting from item no 1 and Level no 1
This would list the indicated node set alone:
+----+---------------------+-----------------+
| | A | B |
+----+---------------------+-----------------+
|1 | XML Tag | Node Value |
+----+---------------------+-----------------+
|2 | 1 Dept[@num="123"]| |
+----+---------------------+-----------------+
|3 | 2 Deptname | IT |
+----+---------------------+-----------------+
|4 | 2 ID | A123 |
+----+---------------------+-----------------+
Recursive main function listChildNodes()
Looping through childnode collections this function calls itself (i.e. the current node object) repeatedly ("recursively") and assigns the entire XML structure to a given 2-dim array (2nd argument). Furthermore it allows indendation and indicates the hierarchy levels. Note that the array in this example has to be 1-based.
Edit 20/8 2018
includes an automatic increase of array size if the items counter i
exceeds the current array's upper boundary (UBound(v)
, i.e. in its first dimension = items count). Technical note: As such a ReDim
isn't possible in a minor (here 1st) dimension, an intermediate transposition changing 'rows' (dim 1) to 'columns' (dim 2) is necessary.
Function listChildNodes(oCurrNode As Object, _
ByRef v As Variant, _
Optional ByRef i As Long = 1, _
Optional iLvl As Integer = 0 _
) As Boolean
' Purpose: assign the complete node structure with contents to a 1-based 2-dim array
' Author: T.M.
' Note: Late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
' (1 ... NODE_ELEMENT, 2 ... NODE_ATTRIBUTE, 3 ... NODE_TEXT etc.)
' Escape
If oCurrNode Is Nothing Then Exit Function
If i < 1 Then i = 1 ' one based items Counter
' Edit 20/8 2018 - Automatic increase of array size if needed
If i >= UBound(v) Then ' change array size if needed
Dim tmp As Variant
tmp = Application.Transpose(v) ' change rows to columns
ReDim Preserve tmp(1 To 2, 1 To UBound(v) + 1000) ' increase row numbers
v = Application.Transpose(tmp) ' transpose back
Erase tmp
End If
Const NAMEColumn& = 1, VALUEColumn& = 2 ' constants for column 1 and 2
' Declare variables
Dim oChildNode As Object ' late bound node object
Dim bDisplay As Boolean
' ---------------------------------------------------------------------
' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
' ---------------------------------------------------------------------
If (oCurrNode.NodeType = 3) Then ' 3 ... NODE_TEXT
' display pure text content (NODE_TEXT) of parent elements
v(i, VALUEColumn) = oCurrNode.Text ' nodeValue of text node
' return
listChildNodes = True
ElseIf oCurrNode.NodeType = 1 Then ' 1 ... NODE_ELEMENT
' --------------------------------------------------------------
' B.1 NODE_ELEMENT WITHOUT text node immediately below,
' a) e.g. <Details> followed by node element <NAME>,
' (i.e. FirstChild.NodeType must not be of type NODE_TEXT = 3)
' b) e.g. <College> node element without any child node
' Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
' (see section A. getting the FirstChild of a NODE_ELEMENT)
' --------------------------------------------------------------
' a) display parent elements of other element nodes
If oCurrNode.HasChildNodes Then
If Not oCurrNode.FirstChild.NodeType = 3 Then ' <>3 ... not a NODE_TEXT
bDisplay = True
End If
' b) always display empty node elements
Else ' empty NODE_ELEMENT
bDisplay = True
End If
If bDisplay Then
v(i, NAMEColumn) = String(iLvl * 2, " ") & _
iLvl & " " & _
oCurrNode.nodename & getAtts(oCurrNode)
i = i + 1
End If
' --------------------------------------------------------------
' B.2 check child nodes
' --------------------------------------------------------------
For Each oChildNode In oCurrNode.ChildNodes
' ~~~~~~~~~~~~~~~~~
' recursive call <<
' ~~~~~~~~~~~~~~~~~
bDisplay = listChildNodes(oChildNode, v, i, iLvl + 1)
If bDisplay Then
v(i, NAMEColumn) = String(iLvl * 2, " ") & _
iLvl & " " & _
oCurrNode.nodename & getAtts(oCurrNode)
i = i + 1
End If
Next oChildNode
' return
listChildNodes = False
Else ' just to demonstrate the use of other xml types as e.g. <!-- comments -->
If oCurrNode.NodeType = 8 Then ' 8 ... NODE_COMMENT
v(i, VALUEColumn) = "<!-- " & oCurrNode.NodeValue & "-->"
i = i + 1
End If
' return
listChildNodes = False
End If
End Function
'Helper function getAtts()
This helper function called by the above function returns a string enumerating all attribute names and values of a given node similar to XPath
notation; the code can be easily adapted to your needs.
Function getAtts(ByRef node As Object) As String
' Purpose: return attribute(s) string in brackets, e.g. '[@num="123"]'
' Note: called by above function listChildNodes()
' Author: T.M.
Dim sAtts$, ii&
If node.Attributes.Length > 0 Then
ii = 0: sAtts = ""
For ii = 0 To node.Attributes.Length - 1
sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodename & "=""" & node.Attributes.Item(ii).NodeValue & """]"
Next ii
End If
' return
getAtts = sAtts
End Function