How to read HTML from a URL in Python 3
Solution 1:
Note that in Python 3, read() returns the HTML as a bytes object, not a string, so you need to convert it to a string with decode().
import urllib.request
fp = urllib.request.urlopen("http://www.python.org")
mybytes = fp.read()
mystr = mybytes.decode("utf8")
fp.close()
print(mystr)
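Hardcoding "utf8" works for most pages but not all. A minimal sketch of the same approach that uses the charset declared in the Content-Type header when there is one; fetch_html and decode_body are hypothetical helper names, and the UTF-8 fallback and the 10 second timeout are assumptions, not something urllib picks for you:

```python
import urllib.request


def decode_body(raw, charset=None, fallback="utf-8"):
    """Decode response bytes, preferring the charset the server declared."""
    encoding = charset or fallback  # fallback is an assumption, not a standard
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising.
    return raw.decode(encoding, errors="replace")


def fetch_html(url, timeout=10.0):
    """Return the page at url as a str."""
    # The with-block closes the connection even if read() raises.
    with urllib.request.urlopen(url, timeout=timeout) as fp:
        # get_content_charset() returns None when the header names no charset.
        charset = fp.headers.get_content_charset()
        return decode_body(fp.read(), charset)
```

Usage would be print(fetch_html("http://www.python.org")). If the server declares an encoding name Python does not know, decode() raises LookupError, which this sketch does not handle.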
Solution 2:
Try the 'requests' module; it's much simpler.
# Install it first with: pip install requests
import requests
url = 'https://www.google.com/'
r = requests.get(url)
print(r.text)
More info here: http://docs.python-requests.org/en/master/
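If you use this in a script rather than the interactive prompt, it's worth adding a timeout and a status check. A rough sketch, where fetch_html_requests is a hypothetical helper name and the 10 second timeout is an arbitrary choice:

```python
import requests


def fetch_html_requests(url, timeout=10.0):
    """Return the page at url as a str, raising on HTTP errors."""
    # Without a timeout, requests can wait indefinitely on a stalled server.
    r = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of returning the error page.
    r.raise_for_status()
    # r.text is already a str; requests picks the encoding from the response.
    return r.text
```

Call it as print(fetch_html_requests('https://www.google.com/')).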
Solution 3:
urllib.request.urlopen(url).read()
returns the raw HTML page as bytes, not as a string. Call .decode("utf8") on the result to get a string. You need import urllib.request first.