Get webpage contents with Python?

Solution 1:

If your project can install packages from PyPI, the most common library for this is requests. It offers a lot of convenient, powerful features. Use it like this:

import requests
response = requests.get('http://hiscore.runescape.com/index_lite.ws?player=zezima')
print(response.status_code)  # HTTP status code, e.g. 200
print(response.content)      # raw response body as bytes

But if your project cannot install its own dependencies, i.e. it is limited to what is built into the standard library, then you should consult one of the other answers.

Solution 2:

Because you're using Python 3.1, you need to use the new Python 3.1 APIs.

Try:

import urllib.request
urllib.request.urlopen('http://www.python.org/')
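
The call above only opens the URL. To actually get the page contents you still have to read and decode the response. A minimal sketch for a current Python 3 (the `with` form may not work on 3.1 itself); the helper name `fetch_text`, the 10-second timeout, and the UTF-8 fallback are assumptions for illustration, not part of the answer:

```python
import urllib.request


def fetch_text(url, timeout=10):
    """Return the body of `url` decoded to text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # Prefer the charset declared in the Content-Type header;
        # falling back to UTF-8 is an assumption, not a rule.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Usage (needs network access):
# html = fetch_text('http://www.python.org/')
# print(html[:200])
```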

Alternatively, it looks like you're working from Python 2 examples. Write it in Python 2, then use the 2to3 tool to convert it. On Windows, 2to3.py is in \python31\tools\scripts. Can someone else point out where to find 2to3.py on other platforms?

Edit

These days, I write Python 2 and 3 compatible code by using six.

from six.moves import urllib
urllib.request.urlopen('http://www.python.org')

Assuming you have six installed, that runs on both Python 2 and Python 3.
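
If adding six as a dependency is not an option, a guarded import is a common alternative. This is only a sketch of that pattern, not something this answer originally proposed:

```python
try:
    # Python 3: urlopen lives in urllib.request
    from urllib.request import urlopen
except ImportError:
    # Python 2: fall back to urllib2
    from urllib2 import urlopen

# urlopen can now be called the same way on either version, e.g.:
# html = urlopen('http://www.python.org').read()
```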

Solution 3:

If you ask me, try this one (urllib2 exists only in Python 2):

import urllib2
resp = urllib2.urlopen('http://hiscore.runescape.com/index_lite.ws?player=zezima')

and read the response the normal way, i.e.

page = resp.read()

Good luck though

Solution 4:

Mechanize is a great package for "acting like a browser", if you want to handle cookie state, etc.

http://wwwsearch.sourceforge.net/mechanize/
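
For comparison, the standard library can also keep cookie state across requests, though without mechanize's form and link handling. A minimal Python 3 sketch; this is not mechanize's API, and `make_session_opener` is a hypothetical helper name:

```python
import http.cookiejar
import urllib.request


def make_session_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


# Usage (needs network access):
# opener, jar = make_session_opener()
# opener.open('http://www.python.org/').read()
# for cookie in jar:
#     print(cookie.name, cookie.value)
```

Every request made through the same opener shares the jar, so cookies set by one response are sent with later requests to the same site.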

Solution 5:

You can use urllib2 and parse the HTML yourself.

Or try Beautiful Soup to do some of the parsing for you.
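
To give an idea of what parsing the HTML yourself might look like, here is a minimal sketch using the standard library's html.parser on Python 3 (not Beautiful Soup). The names `LinkCollector` and `extract_links` are made up for illustration:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased;
        # attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Feed it the decoded page text you downloaded. For anything beyond simple extraction, a dedicated parser such as Beautiful Soup is likely the easier route.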