Python: Get HTTP headers from urllib2.urlopen call?

Use the response.info() method to get the headers.

From the urllib2 docs:

urllib2.urlopen(url[, data][, timeout])

...

This function returns a file-like object with two additional methods:

  • geturl() — return the URL of the resource retrieved, commonly used to determine if a redirect was followed
  • info() — return the meta-information of the page, such as headers, in the form of an httplib.HTTPMessage instance (see Quick Reference to HTTP Headers)

So, for your example, try stepping through response.info().headers, which is a list of the raw header lines, to find what you're looking for.
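
As a minimal sketch of the approach above, the function below opens a URL and returns the headers from response.info() as a plain dict. The name fetch_headers and the default timeout are assumptions made for illustration, not part of urllib2. It iterates .items() rather than .headers because .items() is also available on the Python 3 equivalent of this object.

```python
try:
    from urllib2 import urlopen  # Python 2, as used in this answer
except ImportError:
    from urllib.request import urlopen  # Python 3 equivalent


def fetch_headers(url, timeout=10):
    """Hypothetical helper: return the response headers for `url` as a dict.

    Header names are lower-cased so lookups behave the same on Python 2
    and Python 3. Repeated headers (e.g. Set-Cookie) collapse to one entry.
    """
    response = urlopen(url, timeout=timeout)
    try:
        # info() returns the message object that carries the headers
        return dict((name.lower(), value) for name, value in response.info().items())
    finally:
        response.close()


# Example call (needs network access, so it is left commented out):
# print(fetch_headers("http://www.example.com/").get("content-type"))
```

Note that this still performs a full GET; the body is simply never read before the response is closed.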

Note that the major caveat to using httplib.HTTPMessage is documented in Python issue 4773.


What about sending a HEAD request instead of a normal GET request? The following snippet (copied from a similar question) does exactly that.

>>> import httplib
>>> conn = httplib.HTTPConnection("www.google.com")
>>> conn.request("HEAD", "/index.html")
>>> res = conn.getresponse()
>>> print res.status, res.reason
200 OK
>>> print res.getheaders()
[('content-length', '0'), ('expires', '-1'), ('server', 'gws'), ('cache-control', 'private, max-age=0'), ('date', 'Sat, 20 Sep 2008 06:43:36 GMT'), ('content-type', 'text/html; charset=ISO-8859-1')]
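
If you would rather stay with urllib2 than drop down to httplib, one possible sketch is to subclass Request and override get_method() so that urlopen sends HEAD. The names HeadRequest and head_headers and the default timeout are hypothetical, chosen only for this example.

```python
try:
    from urllib2 import Request, urlopen  # Python 2
except ImportError:
    from urllib.request import Request, urlopen  # Python 3 equivalent


class HeadRequest(Request):
    """Hypothetical helper: a Request that is sent as HEAD instead of GET."""

    def get_method(self):
        return "HEAD"


def head_headers(url, timeout=10):
    """Send a HEAD request and return the response headers as a dict."""
    response = urlopen(HeadRequest(url), timeout=timeout)
    try:
        # Lower-case the names so lookups do not depend on server casing
        return dict((name.lower(), value) for name, value in response.info().items())
    finally:
        response.close()


# Example call (needs network access, so it is left commented out):
# print(head_headers("http://www.google.com/index.html"))
```

Overriding get_method() is used here because it works on Python 2's urllib2, where Request has no method argument.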