Get the URL from an href with BeautifulSoup, without the redirect link

Solution 1:

You can use the urllib.parse module from the standard library. The URL you are looking for is one of the query parameters of the /biz_redir link, so the first step is to extract the 'url' parameter from it.

from urllib.parse import urlparse, parse_qs

url = '/biz_redir?url=https%3A%2F%2Faceplumbingandrooter.com&' \
      'cachebuster=1642876680&website_link_type=website&' \
      'src_bizid=hqjCHBGnEj4nECnLJBvjQw&s=2caa69aa7350cca9ad00' \
      'f1fd1d5a6346f341dd43e1ede874aa2eaa94d6a3458f'

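# Split the href into its components, then decode the query string.
# parse_qs returns a dict mapping each parameter to a list of values.
parsed_url = urlparse(url)
print(parse_qs(parsed_url.query)['url'][0])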

This gives you the full URL, https://aceplumbingandrooter.com. You can then parse that URL again to get its netloc (the host name). Here is the complete code:

from urllib.parse import urlparse, parse_qs

url = '/biz_redir?url=https%3A%2F%2Faceplumbingandrooter.com&' \
      'cachebuster=1642876680&website_link_type=website&' \
      'src_bizid=hqjCHBGnEj4nECnLJBvjQw&s=2caa69aa7350cca9ad00' \
      'f1fd1d5a6346f341dd43e1ede874aa2eaa94d6a3458f'

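parsed_url = urlparse(url)
# parse_qs percent-decodes the value, so this is the real target URL
target = parse_qs(parsed_url.query)['url'][0]
# Parse the target URL itself to get just the host name
print(urlparse(target).netloc)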

Output:

aceplumbingandrooter.com
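
If you are looping over many href values, some of them may not be redirect links at all. A minimal sketch of a helper for that case follows. The function name extract_target, the shortened /biz_redir href, and the example.com link are assumptions made for illustration and do not come from the original page. It falls back to the href itself when no 'url' parameter is present.

```python
from urllib.parse import urlparse, parse_qs


def extract_target(href, param="url"):
    """Return the decoded target of a redirect-style href.

    If the href has no `param` query parameter, return it unchanged.
    (Hypothetical helper, not part of any library.)
    """
    query = parse_qs(urlparse(href).query)
    values = query.get(param)
    return values[0] if values else href


# Sample hrefs (assumed for illustration): one redirect link, one plain link
hrefs = [
    "/biz_redir?url=https%3A%2F%2Faceplumbingandrooter.com&cachebuster=1642876680",
    "https://example.com/about",
]

for href in hrefs:
    target = extract_target(href)
    print(urlparse(target).netloc)
```

With BeautifulSoup you would pass each tag's href attribute to the helper in the same way.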