XPath: How to select elements based on their value?
I am new to using XPath and this may be a basic question. Kindly bear with me and help me in resolving the issue. I have an XML file like this:
<RootNode>
<FirstChild>
<Element attribute1="abc" attribute2="xyz">Data</Element>
</FirstChild>
</RootNode>
I can validate the presence of an <Element> tag with:
//Element[@attribute1="abc" and @attribute2="xyz"]
Now I also want to check that the value of the tag is the string "Data". To do this, I was told to use:
//Element[@attribute1="abc" and @attribute2="xyz" and Data]
When I use the latter expression, I get the following error:
Assertion failure message: No Nodes Matched
//Element[@attribute1="abc" and @attribute2="xyz" and Data]
Please advise whether the XPath expression I have used is valid. If not, what would the valid XPath expression be?
Solution 1:
The condition below:
//Element[@attribute1="abc" and @attribute2="xyz" and Data]
checks for the existence of a child element named Data within Element, not for the text value "Data".
Instead, you can use:
//Element[@attribute1="abc" and @attribute2="xyz" and text()="Data"]
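To see the difference between the two predicates, here is a minimal sketch using Python's standard xml.etree.ElementTree. This is only an illustration under some assumptions: ElementTree implements a small XPath subset with no `and` operator and no text() test, so the attribute checks are chained as separate predicates and `.='Data'` stands in for the value check. The second document, which contains a <Data/> child, is made up for the comparison.

```python
import xml.etree.ElementTree as ET

# The document from the question: "Data" is the text of <Element>.
TEXT_DOC = """<RootNode>
  <FirstChild>
    <Element attribute1="abc" attribute2="xyz">Data</Element>
  </FirstChild>
</RootNode>"""

# Hypothetical variant: <Element> has a child element named Data.
CHILD_DOC = """<RootNode>
  <FirstChild>
    <Element attribute1="abc" attribute2="xyz"><Data/></Element>
  </FirstChild>
</RootNode>"""

# ElementTree has no "and", so each condition is its own predicate.
ATTRS = ".//Element[@attribute1='abc'][@attribute2='xyz']"


def count(xml_text, path):
    """Return how many elements the path selects in the document."""
    return len(ET.fromstring(xml_text).findall(path))


# [Data] asks "is there a child element called Data?"
child_pred_on_text_doc = count(TEXT_DOC, ATTRS + "[Data]")
child_pred_on_child_doc = count(CHILD_DOC, ATTRS + "[Data]")

# [.='Data'] asks "is the text content equal to 'Data'?"
value_pred_on_text_doc = count(TEXT_DOC, ATTRS + "[.='Data']")
value_pred_on_child_doc = count(CHILD_DOC, ATTRS + "[.='Data']")

print(child_pred_on_text_doc, child_pred_on_child_doc)
print(value_pred_on_text_doc, value_pred_on_child_doc)
```

The `[Data]` predicate finds nothing in the question's document, which is consistent with the "No Nodes Matched" failure, while the value predicate does match it.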
Solution 2:
//Element[@attribute1="abc" and @attribute2="xyz" and .="Data"]
The reason I am adding this answer is to explain the relationship between . and text().
The first thing to know is that a predicate [] accepts only two types of data:
- [number] to select a node from a node-set
- [bool] to filter a node-set from a node-set
In this case, the value is converted to a boolean by the boolean() function, and there is a rule:
Filters are always evaluated with respect to a context.
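As a rough illustration of the two predicate types, here is a small sketch with Python's standard xml.etree.ElementTree, which supports both a positional predicate and an attribute filter. The <Root>/<Item> document and the `kind` attribute are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with three sibling <Item> elements.
root = ET.fromstring(
    "<Root>"
    "<Item kind='x'>one</Item>"
    "<Item kind='y'>two</Item>"
    "<Item kind='x'>three</Item>"
    "</Root>"
)

# [number]: picks the node at that 1-based position among its siblings.
second = [e.text for e in root.findall("./Item[2]")]

# [bool]: keeps every node for which the condition is true.
kind_x = [e.text for e in root.findall("./Item[@kind='x']")]

print(second)
print(kind_x)
```

In both cases the predicate is evaluated once per candidate <Item>, with that element as the context.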
When you compare text() or . with the string "Data", the nodes are first converted to strings, as the string() function does, and the comparison then yields a boolean result.
There are two important rules about string():
- The string() function converts a node-set to a string by returning the string value of the first node in the node-set, which in some instances may yield unexpected results.
  text() is a relative path that returns a node-set containing all the text nodes of the current node (the context node), like ["Data"]. When it is evaluated by string(["Data"]), it returns the string value of the first node in the node-set, so you are guaranteed to get the whole text "Data" only when there is only one text node in the node-set.
- If you want the string() function to concatenate all child text, you must pass a single node instead of a node-set.
  For example, given the node-set ['a', 'b'], you can pass their parent node to string(parent), which returns 'ab'. And of course string(.) in your case returns the concatenated string "Data".
Both approaches give the same result only when the element contains a single text node.
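The "first text node" versus "all text concatenated" distinction can be sketched with Python's standard xml.etree.ElementTree. This is only an analogy, since ElementTree does not implement text() or string(): here `elem.text` plays the role of the first text node and `"".join(elem.itertext())` plays the role of string(.). The <Root>/<Sep/> document is a made-up example with two text nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical element with two text nodes, 'a' and 'b', split by <Sep/>.
root = ET.fromstring("<Root><Element>a<Sep/>b</Element></Root>")
elem = root.find("Element")

# Roughly what string(text()) sees: only the first text node.
first_text = elem.text

# Roughly what string(.) sees: every text node, concatenated.
all_text = "".join(elem.itertext())

# ElementTree's [.='...'] predicate compares against the full text content.
matches_ab = len(root.findall("./Element[.='ab']"))
matches_a = len(root.findall("./Element[.='a']"))

print(first_text, all_text, matches_ab, matches_a)
```

With a single text node, as in the question's <Element>, the two values coincide, which is why both answers work there.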