Are urllib2 and httplib thread safe?

Solution 1:

httplib and urllib2 are not thread-safe.

urllib2 does not provide serialized access to a global (shared) OpenerDirector object, which is used by urllib2.urlopen().

Similarly, httplib does not serialize access to HTTPConnection objects (e.g. through a thread-safe connection pool), so sharing an HTTPConnection object between threads is not safe.
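If you have to stay with the standard library, one common workaround is to give each thread its own connection instead of sharing one. Here is a minimal sketch, assuming Python 3 (where httplib was renamed to http.client); get_connection is a hypothetical helper name, not part of the library:

```python
import http.client
import threading

# Each thread sees its own independent attributes on this object.
_local = threading.local()


def get_connection(host, port=None):
    """Return an HTTPConnection that belongs only to the calling thread."""
    connections = getattr(_local, "connections", None)
    if connections is None:
        connections = _local.connections = {}
    key = (host, port)
    if key not in connections:
        # Creating the object does not open a socket; that happens on the
        # first request sent through it.
        connections[key] = http.client.HTTPConnection(host, port, timeout=10)
    return connections[key]
```

Because no HTTPConnection object ever crosses a thread boundary, no locking is needed around it. The cost is one connection per thread and host rather than a shared pool.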

I suggest using httplib2 or urllib3 as an alternative if thread-safety is required.

Generally, if a module's documentation does not mention thread-safety, I would assume it is not thread-safe. You can look at the module's source code for verification.

When browsing the source code to determine whether a module is thread-safe, you can start by looking for uses of thread synchronization primitives from the threading or multiprocessing modules, or use of Queue.Queue (queue.Queue in Python 3).
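That first pass over the source can be roughly automated. The sketch below assumes Python 3; find_sync_hints, scan_module, and the list of names in SYNC_PATTERN are my own illustrative choices, not an established tool. A hit only tells you where to start reading, and no hits does not prove anything either way.

```python
import inspect
import re

# Names that usually signal explicit synchronization.
# Illustrative, not exhaustive.
SYNC_PATTERN = re.compile(
    r"\b(threading|multiprocessing|Queue|RLock|Lock|Semaphore|Condition|Event)\b"
)


def find_sync_hints(source):
    """Return (line_number, stripped_line) pairs that mention a sync name."""
    hints = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SYNC_PATTERN.search(line):
            hints.append((number, line.strip()))
    return hints


def scan_module(module):
    """Scan a module's Python source; fails for modules without source."""
    return find_sync_hints(inspect.getsource(module))
```

For example, scan_module(queue) lists the lines where the queue module imports threading and creates its locks, while a module with no matches deserves a closer manual read before you share its objects between threads.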

UPDATE

Here is a relevant source code snippet from urllib2.py (Python 2.7.2):

_opener = None
def urlopen(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    global _opener
    if _opener is None:
        _opener = build_opener()
    return _opener.open(url, data, timeout)

def install_opener(opener):
    global _opener
    _opener = opener

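There is an obvious race condition when concurrent threads call install_opener() and urlopen(): the global _opener is read and reassigned without any lock, so a thread inside urlopen() may end up using either the old opener or the newly installed one.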
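One way to sidestep the shared module-level opener is to give each thread its own OpenerDirector and call its open() method directly. This is a minimal sketch, assuming Python 3 (where urllib2 became urllib.request); get_opener and fetch are hypothetical helper names:

```python
import threading
import urllib.request

# Each thread sees its own independent attributes on this object.
_local = threading.local()


def get_opener():
    """Return an OpenerDirector owned by the calling thread."""
    opener = getattr(_local, "opener", None)
    if opener is None:
        opener = urllib.request.build_opener()
        _local.opener = opener
    return opener


def fetch(url, timeout=10):
    """Fetch url without touching the global opener used by urlopen()."""
    with get_opener().open(url, timeout=timeout) as response:
        return response.read()
```

Since fetch() never reads the module-level _opener, a call to install_opener() in some other thread cannot change which opener it uses.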

Also, note that calling urlopen() with a Request object as the url parameter may mutate the Request object (see the source for OpenerDirector.open()), so it is not safe to concurrently call urlopen() with a shared Request object.

All told, urlopen() is thread-safe if the following conditions are met:

  • install_opener() is not called from another thread.
  • A non-shared Request object or a string is used as the url parameter.
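To illustrate the second condition, here is a minimal sketch, assuming Python 3 (urllib.request in place of urllib2), in which every call builds its own Request so that no Request is ever visible to more than one thread. build_request, fetch, fetch_all, and the User-Agent value are hypothetical names chosen for the example. It also assumes nothing else in the program calls install_opener() while the workers are running.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request


def build_request(url):
    """Create a new Request per call; it is never cached or shared."""
    return urllib.request.Request(url, headers={"User-Agent": "example-client"})


def fetch(url, timeout=10):
    """Fetch one URL with a Request that only this call can see."""
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, max_workers=4):
    """Fetch several URLs concurrently; only plain strings cross threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

Passing URL strings to the workers and constructing the Request inside fetch() means any mutation done by OpenerDirector.open() stays local to that call.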