Python XPath - find sibling of element containing Text A OR Text B
Fellows,
I'm scraping some download links. On the target website, each target download link sometimes appears after the text 'Application Proof (1st submission)' and sometimes after the text 'PHIP (1st revised proof)'. My current code only scrapes the links following 'Application Proof (1st submission)':
all_proofs = driver.find_elements_by_xpath("//span[contains(text(),'Application Proof (1st submission)')]/following-sibling::a[contains(.,'Full Version')]")
Is there any way to use an OR logic in this XPath to scrape in both scenarios and get a single list of links, according to the order in which they appear in the website's source code? The logic would be approximately:
all_proofs = driver.find_elements_by_xpath("//span[contains(text(),'Application Proof (1st submission)' OR 'PHIP (1st revised proof)')]/following-sibling::a[contains(.,'Full Version')]")
Unfortunately, I can't see a workaround that avoids this logic, because:
- I can't simply scrape all download links that contain 'Full Version', since some of those links do not fit my criteria, only the ones that follow the text 'Application Proof (1st submission)' OR 'PHIP (1st revised proof)' do.
- I can't scrape one list of links that follow the 'Application Proof (1st submission)' text, then scrape another list of links that follow the 'PHIP (1st revised proof)' and finally join them together, since I need this list to be in the exact same order as the links appear in the website's source code.
Would appreciate your help!
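Yes, you can use the or operator inside an XPath predicate. It must be written in lowercase, since XPath is case-sensitive, and each side of it needs its own complete contains(...) test.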
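Because this is a single query, the matching links come back in the order they appear in the page source. Your XPath expression could be something like this: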
all_proofs = driver.find_elements_by_xpath("//span[contains(text(),'Application Proof (1st submission)') or contains(text(),'PHIP (1st revised proof)')]/following-sibling::a[contains(.,'Full Version')]")
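If more label variants turn up later, the same expression can be generated from a list instead of being typed by hand. The following is only a minimal sketch: the helper name build_or_xpath and the LABELS list are hypothetical, and it assumes none of the labels contain a single quote, since that would need extra XPath quoting.

```python
def build_or_xpath(labels, link_text="Full Version"):
    """Build an XPath for <a> siblings that follow a <span> containing any label.

    Hypothetical helper for illustration; not part of Selenium.
    """
    if not labels:
        raise ValueError("at least one label is required")
    for value in (*labels, link_text):
        # Assumption: values are embedded in single-quoted XPath literals.
        if "'" in value:
            raise ValueError(f"single quote not supported in: {value!r}")
    # One contains(...) test per label, joined with XPath's lowercase 'or'.
    conditions = " or ".join(f"contains(text(),'{label}')" for label in labels)
    return f"//span[{conditions}]/following-sibling::a[contains(.,'{link_text}')]"


LABELS = ["Application Proof (1st submission)", "PHIP (1st revised proof)"]
xpath = build_or_xpath(LABELS)
print(xpath)

# With a live Selenium session (assumed to exist as `driver`):
# all_proofs = driver.find_elements_by_xpath(xpath)
# On Selenium 4 the equivalent call is:
# from selenium.webdriver.common.by import By
# all_proofs = driver.find_elements(By.XPATH, xpath)
```

With the two labels from the question, this produces the same expression as the one written out above, so it can be passed to the driver in exactly the same way.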