Get XML only immediate children elements by name
My question is: how can I get the elements directly under a specific parent element when there are other elements with the same name that are "grandchildren" of that parent?
I'm using the Java DOM library to parse XML elements and I'm running into trouble. Here's a small portion of the XML I'm using:
<notifications>
    <notification>
        <groups>
            <group name="zip-group.zip" zip="true">
                <file location="C:\valid\directory\" />
                <file location="C:\another\valid\file.doc" />
                <file location="C:\valid\file\here.txt" />
            </group>
        </groups>
        <file location="C:\valid\file.txt" />
        <file location="C:\valid\file.xml" />
        <file location="C:\valid\file.doc" />
    </notification>
</notifications>
As you can see, there are two places you can place the <file> element: either in groups or outside groups. I really want it structured this way because it's more user-friendly.
Now, whenever I call notificationElement.getElementsByTagName("file"); it gives me all the <file> elements, including those under the <group> element. I handle each of these kinds of files differently, so this functionality is not desirable.
I've thought of two solutions:
- Get the parent element of the file element and deal with it accordingly (depending on whether it's <notification> or <group>).
- Rename the second <file> element to avoid confusion.
Neither of those solutions is as desirable as just leaving things the way they are and getting only the <file> elements which are direct children of <notification> elements.
I'm open to IMPO comments and answers about the "best" way to do this, but I'm really interested in DOM solutions because that's what the rest of this project is using. Thanks.
I realise you found something of a solution to this in May, @kentcdodds, but I just had a fairly similar problem and I think I've now found a solution to it (perhaps for my use case, but not for yours).
A very simplistic example of my XML format is shown below:
<?xml version="1.0" encoding="utf-8"?>
<rels>
    <relationship num="1">
        <relationship num="2">
            <relationship num="2.1"/>
            <relationship num="2.2"/>
        </relationship>
    </relationship>
    <relationship num="1.1"/>
    <relationship num="1.2"/>
</rels>
As you can hopefully see from this snippet, the format I want can have N levels of nesting for <relationship> nodes, so the problem I had was that I was getting nodes from all levels of the hierarchy, without any hint as to node depth. (That is how getElementsByTagName() behaves. Node.getChildNodes() itself returns only immediate children, although it includes text nodes as well as elements.)
After looking at the API for a while, I noticed there are two other methods that might be of some use:
- Node.getFirstChild()
- Node.getNextSibling()
Together, these two methods seemed to offer everything required to get all of the immediate child elements of a Node. The following JSP code should give a fairly basic idea of how to implement this. Sorry for the JSP. I'm rolling this into a bean now but didn't have time to create a fully working version from picked-apart code.
<%@page import="javax.xml.parsers.DocumentBuilderFactory,
javax.xml.parsers.DocumentBuilder,
org.w3c.dom.Document,
org.w3c.dom.NodeList,
org.w3c.dom.Node,
org.w3c.dom.Element,
java.io.File" %><%
try {
    File fXmlFile = new File(application.getRealPath("/") + "/utils/forms-testbench/dom-test/test.xml");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(fXmlFile);
    doc.getDocumentElement().normalize();
    Element docEl = doc.getDocumentElement();
    // Walk the immediate children of the root. The loop tests the current node
    // rather than its next sibling, so the first child is not skipped and an
    // element with no children does not cause a NullPointerException.
    for (Node childNode = docEl.getFirstChild(); childNode != null; childNode = childNode.getNextSibling()) {
        // Skip whitespace text nodes, comments and other non-element nodes.
        if (childNode.getNodeType() == Node.ELEMENT_NODE) {
            Element childElement = (Element) childNode;
            out.println("NODE num:-" + childElement.getAttribute("num") + "<br/>\n" );
        }
    }
} catch (Exception e) {
    out.println("ERROR:- " + e.toString() + "<br/>\n");
}
%>
This code would give the following output, showing only direct child elements of the initial root node.
NODE num:-1
NODE num:-1.1
NODE num:-1.2
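For anyone not using JSP, the same sibling walk can be wrapped in a small plain-Java helper. This is only a sketch: the class name DirectChildren and the methods childElements and parse are names I made up for illustration, and the XML is a shortened version of the sample above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DirectChildren {

    // Collects the element children of `parent` whose tag name equals `name`.
    // Only immediate children are visited; deeper descendants are never reached.
    static List<Element> childElements(Node parent, String name) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Hypothetical convenience for the demo: parse XML held in a string.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<rels>"
            + "<relationship num=\"1\"><relationship num=\"2\"/></relationship>"
            + "<relationship num=\"1.1\"/>"
            + "<relationship num=\"1.2\"/>"
            + "</rels>");
        for (Element e : childElements(doc.getDocumentElement(), "relationship")) {
            System.out.println("NODE num:-" + e.getAttribute("num"));
        }
    }
}
```

Run against the shortened sample, this prints the nodes numbered 1, 1.1 and 1.2, and skips the nested 2.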
Hope this helps someone anyway. Cheers for the initial post.
You can use XPath for this, using two paths to select the two kinds of node and process them differently.
To get the <file> nodes that are direct children of <notification>, use //notification/file, and for the ones in <group> use //groups/group/file.
This is a simple sample:
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SO10689900 {
    public static void main(String[] args) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader("<notifications>\n" +
                " <notification>\n" +
                " <groups>\n" +
                " <group name=\"zip-group.zip\" zip=\"true\">\n" +
                " <file location=\"C:\\valid\\directory\\\" />\n" +
                " <file location=\"C:\\this\\file\\doesn't\\exist.grr\" />\n" +
                " <file location=\"C:\\valid\\file\\here.txt\" />\n" +
                " </group>\n" +
                " </groups>\n" +
                " <file location=\"C:\\valid\\file.txt\" />\n" +
                " <file location=\"C:\\valid\\file.xml\" />\n" +
                " <file location=\"C:\\valid\\file.doc\" />\n" +
                " </notification>\n" +
                "</notifications>")));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // <file> elements whose parent is a <notification>
        XPathExpression expr1 = xpath.compile("//notification/file");
        NodeList nodes = (NodeList)expr1.evaluate(doc, XPathConstants.NODESET);
        System.out.println("Files in //notification");
        printFiles(nodes);
        // <file> elements whose parent is a <group> inside <groups>
        XPathExpression expr2 = xpath.compile("//groups/group/file");
        NodeList nodes2 = (NodeList)expr2.evaluate(doc, XPathConstants.NODESET);
        System.out.println("Files in //groups/group");
        printFiles(nodes2);
    }

    // Prints the "location" attribute node of each element in the list.
    public static void printFiles(NodeList nodes) {
        for (int i = 0; i < nodes.getLength(); ++i) {
            Node file = nodes.item(i);
            System.out.println(file.getAttributes().getNamedItem("location"));
        }
    }
}
It should output:
Files in //notification
location="C:\valid\file.txt"
location="C:\valid\file.xml"
location="C:\valid\file.doc"
Files in //groups/group
location="C:\valid\directory\"
location="C:\this\file\doesn't\exist.grr"
location="C:\valid\file\here.txt"
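If you already hold the <notification> element, as in the question, a relative path evaluated against that element should also work. The sketch below rests on that assumption: the path "file" is shorthand for child::file, so it only matches immediate children of the context node. The class name RelativeXPath and the helpers directFiles and parse are made up for illustration, and the XML is a shortened version of the sample.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RelativeXPath {

    // Evaluates the relative path "file" with `notification` as the context node,
    // which selects only the <file> elements that are its immediate children.
    static NodeList directFiles(Element notification) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (NodeList) xpath.evaluate("file", notification, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical convenience for the demo: parse XML held in a string.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<notifications><notification>"
            + "<groups><group name=\"zip-group.zip\" zip=\"true\">"
            + "<file location=\"C:\\valid\\file\\here.txt\" />"
            + "</group></groups>"
            + "<file location=\"C:\\valid\\file.txt\" />"
            + "<file location=\"C:\\valid\\file.xml\" />"
            + "</notification></notifications>");
        Element notification = (Element) doc.getElementsByTagName("notification").item(0);
        NodeList files = directFiles(notification);
        for (int i = 0; i < files.getLength(); i++) {
            System.out.println(((Element) files.item(i)).getAttribute("location"));
        }
    }
}
```

With the shortened sample this prints C:\valid\file.txt and C:\valid\file.xml, and leaves out the file inside the group.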
Well, the DOM solution to this question is actually pretty simple, even if it's not too elegant.
When I iterate through filesNodeList, which is returned when I call notificationElement.getElementsByTagName("file"), I just check whether the parent node's name is "notification". If it isn't, I ignore it because it will be handled by the <group> element. Here's my code solution:
for (int j = 0; j < filesNodeList.getLength(); j++) {
    Element fileElement = (Element) filesNodeList.item(j);
    if (!fileElement.getParentNode().getNodeName().equals("notification")) {
        continue;
    }
    ...
}
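A possible variation on the same idea is to compare the parent node itself rather than its name. That would only matter if an element with the parent's name could be nested inside another one, which the XML in the question doesn't do, so treat this as a sketch. The class name ParentCheck and the helpers directChildrenByTag and parse are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParentCheck {

    // Filters the descendants returned by getElementsByTagName down to those
    // whose parent is exactly `parent`, whatever that parent's tag name is.
    static List<Element> directChildrenByTag(Element parent, String tag) {
        NodeList all = parent.getElementsByTagName(tag);
        List<Element> direct = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            Node n = all.item(i);
            if (parent.isSameNode(n.getParentNode())) {
                direct.add((Element) n);
            }
        }
        return direct;
    }

    // Hypothetical convenience for the demo: parse XML held in a string.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<notification>"
            + "<groups><group><file location=\"in-group\"/></group></groups>"
            + "<file location=\"direct\"/>"
            + "</notification>");
        for (Element f : directChildrenByTag(doc.getDocumentElement(), "file")) {
            System.out.println(f.getAttribute("location"));
        }
    }
}
```

The trade-off is the same as in the loop above: getElementsByTagName still walks the whole subtree, and the filter just discards everything that isn't an immediate child.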