xpath expression to remove whitespace
I have this HTML:
<tr class="even expanded first>
<td class="score-time status">
<a href="/matches/2012/08/02/europe/uefa-cup/">
16 : 00
</a>
</td>
</tr>
I want to extract the (16 : 00) string without the extra whitespace. Is this possible?
Solution 1:
I. Use this single XPath expression:
translate(normalize-space(/tr/td/a), ' ', '')
Explanation:
normalize-space()
produces a new string from its argument, in which any leading or trailing white-space (space, tab, NL or CR characters) is deleted and any intermediary white-space is replaced by a single space character.translate()
takes the result produced bynormalize-space()
and produces a new string in which each of the remaining intermediary spaces is replaced by the empty string.
II. Alternatively:
translate(/tr/td/a, ' 	 
', '')
Solution 2:
Please try the below xpath expression :
//td[@class='score-time status']/a[normalize-space() = '16 : 00']
Solution 3:
You can use XPath's normalize-space() as in //a[normalize-space()="16 : 00"]
Solution 4:
I came across this thread when I was having my own issue similar to above.
HTML
<div class="d-flex">
<h4 class="flex-auto min-width-0 pr-2 pb-1 commit-title">
<a href="/nsomar/OAStackView/releases/tag/1.0.1">
1.0.1
</a>
XPath start command
tree.xpath('//div[@class="d-flex"]/h4/a/text()')
However this grabbed random whitespace and gave me the output of:
['\n ', '\n 1.0.1\n ']
Using normalize-space, it removed the first blank space node and left me with just what I wanted
tree.xpath('//div[@class="d-flex"]/h4/a/text()[normalize-space()]')
['\n 1.0.1\n ']
I could then grab the first element of the list, and use strip() to remove any further whitespace
XPath final command
tree.xpath('//div[@class="d-flex"]/h4/a/text()[normalize-space()]')[0].strip()
Which left me with exactly what I required:
1.0.1