XPath expression to remove whitespace

I have this HTML:

 <tr class="even  expanded first">
   <td class="score-time status">
     <a href="/matches/2012/08/02/europe/uefa-cup/">

            16 : 00

     </a>
    </td>        
  </tr>

I want to extract the (16 : 00) string without the extra whitespace. Is this possible?


Solution 1:

I. Use this single XPath expression:

translate(normalize-space(/tr/td/a), ' ', '')

Explanation:

  1. normalize-space() produces a new string from its argument, in which any leading or trailing white-space (space, tab, NL or CR characters) is deleted and any intermediary white-space is replaced by a single space character.

  2. translate() takes the result produced by normalize-space() and produces a new string in which each of the remaining intermediary spaces is replaced by the empty string.
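To make the two steps concrete, here is a minimal plain-Python sketch that mimics what the two XPath functions do to the string value of the a element. The helper names `normalize_space` and `translate` are illustrative stand-ins written for this example, not part of any XPath library, and the exact whitespace in `raw` is an assumption based on the HTML in the question.

```python
import re

# The four characters XPath 1.0 treats as whitespace.
XPATH_WS = " \t\n\r"

def normalize_space(s: str) -> str:
    """Mimic XPath normalize-space(): trim the ends, collapse inner runs."""
    return re.sub(r"[ \t\n\r]+", " ", s.strip(XPATH_WS))

def translate(s: str, src: str, dst: str) -> str:
    """Mimic XPath translate(): map src[i] to dst[i]; drop src chars with no partner."""
    table = {}
    for i, ch in enumerate(src):
        # As in XPath, the first occurrence of a character in src wins.
        table.setdefault(ord(ch), dst[i] if i < len(dst) else None)
    return s.translate(table)

# String value of the <a> element from the question (assumed whitespace layout).
raw = "\n\n            16 : 00\n\n     "

step1 = normalize_space(raw)       # "16 : 00"
step2 = translate(step1, " ", "")  # "16:00"
print(repr(step1), repr(step2))
```

If you want to keep the single spaces around the colon, stop after the first step and use normalize-space() on its own.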


II. Alternatively:

translate(/tr/td/a, ' &#9;&#10;&#13;', '')
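
This variant deletes all four whitespace characters (space, tab, NL, CR) in one pass, without normalizing first. A rough Python equivalent, assuming the same raw string value as in the question, might look like this. Note that the character references in the XPath expression are resolved by the XML host document (for example an XSLT stylesheet); in a Python string literal you would write the escapes directly.

```python
# String value of the <a> element from the question (assumed whitespace layout).
raw = "\n\n            16 : 00\n\n     "

# The third argument of str.maketrans lists characters to delete,
# which matches translate() with an empty replacement string.
delete_ws = str.maketrans("", "", " \t\n\r")

result = raw.translate(delete_ws)
print(repr(result))
```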

Solution 2:

Try the following XPath expression, which selects the a element whose whitespace-normalized text is '16 : 00':

//td[@class='score-time status']/a[normalize-space() = '16 : 00']

Solution 3:

You can use XPath's normalize-space() as in //a[normalize-space()="16 : 00"]

Solution 4:

I came across this thread while dealing with a similar issue of my own.

HTML

<div class="d-flex">
<h4 class="flex-auto min-width-0 pr-2 pb-1 commit-title">
  <a href="/nsomar/OAStackView/releases/tag/1.0.1">

    1.0.1
  </a>

XPath start command

tree.xpath('//div[@class="d-flex"]/h4/a/text()')

However, this also picked up a whitespace-only text node and gave me this output:

['\n          ', '\n        1.0.1\n      ']

Adding a normalize-space() predicate filtered out the whitespace-only text node and left me with just the one I wanted:

tree.xpath('//div[@class="d-flex"]/h4/a/text()[normalize-space()]')

['\n        1.0.1\n      ']

I could then grab the first element of the list and use strip() to remove the remaining whitespace:

XPath final command

tree.xpath('//div[@class="d-flex"]/h4/a/text()[normalize-space()]')[0].strip()

Which left me with exactly what I required:

1.0.1
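
The [normalize-space()] predicate works because XPath converts the predicate's string result to a boolean, and an empty string counts as false. So a text node is kept only if something is left after normalization. Here is a small plain-Python sketch of that same filter, applied to the list from the first output above; it only imitates the predicate and does not call any XPath engine.

```python
# Text nodes returned by the first XPath command, as shown above.
nodes = ['\n          ', '\n        1.0.1\n      ']

# Keep a node only if it is non-empty once XPath whitespace is trimmed,
# which is the condition the [normalize-space()] predicate tests.
kept = [t for t in nodes if t.strip(" \t\n\r")]

# Take the first surviving node and trim it, as in the final command.
version = kept[0].strip()
print(kept, version)
```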