XPath: How to select elements based on their value?

I am new to using XPath and this may be a basic question. Kindly bear with me and help me in resolving the issue. I have an XML file like this:

<RootNode>
  <FirstChild>
    <Element attribute1="abc" attribute2="xyz">Data</Element>
  <FirstChild>
</RootNode>

I can validate the presence of an <Element> tag with:

//Element[@attribute1="abc" and @attribute2="xyz"]

Now I also want to check the value of the tag for string "Data". For achieving this I was told to use:

//Element[@attribute1="abc" and @attribute2="xyz" and Data]

When I use the later expression I get the following error:

Assertion failure message: No Nodes Matched //Element[@attribute1="abc" and @attribute2="xyz" and Data]

Kindly provide me with your advice whether the XPath expression I have used is valid. If not what will be the valid XPath expression?


Solution 1:

The condition below:

//Element[@attribute1="abc" and @attribute2="xyz" and Data]

checks for the existence of the element Data within Element and not for element value Data.

Instead you can use

//Element[@attribute1="abc" and @attribute2="xyz" and text()="Data"]

Solution 2:

//Element[@attribute1="abc" and @attribute2="xyz" and .="Data"]

The reason why I add this answer is that I want to explain the relationship of . and text() .

The first thing is when using [], there are only two types of data:

  1. [number] to select a node from node-set
  2. [bool] to filter a node-set from node-set

In this case, the value is evaluated to boolean by function boolean(), and there is a rule:

Filters are always evaluated with respect to a context.

When you need to compare text() or . with a string "Data", it first uses string() function to transform those to string type, than gets a boolean result.

There are two important rule about string():

  1. The string() function converts a node-set to a string by returning the string value of the first node in the node-set, which in some instances may yield unexpected results.

    text() is relative path that return a node-set contains all the text node of current node(context node), like ["Data"]. When it is evaluated by string(["Data"]), it will return the first node of node-set, so you get "Data" only when there is only one text node in the node-set.

  2. If you want the string() function to concatenate all child text, you must then pass a single node instead of a node-set.

    For example, we get a node-set ['a', 'b'], you can pass there parent node to string(parent), this will return 'ab', and of cause string(.) in you case will return an concatenated string "Data".

Both way will get same result only when there is a text node.