Change the connection pool size for Python's "requests" module when in Threading
This should do the trick:
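import requests.adapters
session = requests.Session()
# pool_connections: number of per-host pools to cache; pool_maxsize: connections kept per pool
adapter = requests.adapters.HTTPAdapter(pool_connections=100, pool_maxsize=100)
session.mount('http://', adapter)
session.mount('https://', adapter)  # mount for HTTPS too, otherwise those URLs keep the default adapter
# requests needs an absolute URL; example.com is a placeholder for your own host
response = session.get("http://example.com/mypage")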
Note: Use this solution only if you cannot control the construction of the connection pool (as described in @Jahaja's answer).
The problem is that urllib3 creates the pools on demand. It calls the constructor of the urllib3.connectionpool.HTTPConnectionPool class without parameters. The classes are registered in urllib3.poolmanager.pool_classes_by_scheme. The trick is to replace them with your own classes that have different default parameters:
def patch_http_connection_pool(**constructor_kwargs):
    """
    Override the default parameters of the
    HTTPConnectionPool constructor.
    For example, to increase the pool size to fix problems
    with "HttpConnectionPool is full, discarding connection",
    call this function with maxsize=16 (or whatever size
    you want to give to the connection pool).
    """
    from urllib3 import connectionpool, poolmanager

    class MyHTTPConnectionPool(connectionpool.HTTPConnectionPool):
        def __init__(self, *args, **kwargs):
            # The patched defaults take precedence over whatever the caller passed.
            kwargs.update(constructor_kwargs)
            super(MyHTTPConnectionPool, self).__init__(*args, **kwargs)

    # Register the subclass so pools created on demand use it.
    poolmanager.pool_classes_by_scheme['http'] = MyHTTPConnectionPool
Then you can call it to set new default parameters. Make sure this is called before any connection is made.
patch_http_connection_pool(maxsize=16)
If you use https connections you can create a similar function:
def patch_https_connection_pool(**constructor_kwargs):
    """
    Override the default parameters of the
    HTTPSConnectionPool constructor.
    For example, to increase the pool size to fix problems
    with "HttpSConnectionPool is full, discarding connection",
    call this function with maxsize=16 (or whatever size
    you want to give to the connection pool).
    """
    from urllib3 import connectionpool, poolmanager

    class MyHTTPSConnectionPool(connectionpool.HTTPSConnectionPool):
        def __init__(self, *args, **kwargs):
            # The patched defaults take precedence over whatever the caller passed.
            kwargs.update(constructor_kwargs)
            super(MyHTTPSConnectionPool, self).__init__(*args, **kwargs)

    # Register the subclass so pools created on demand use it.
    poolmanager.pool_classes_by_scheme['https'] = MyHTTPSConnectionPool
Jahaja's answer already gives the recommended solution to your problem, but it does not answer what is going on or, as you asked, what this error means.
Some very detailed information about this is in the official documentation of urllib3, the package requests uses under the hood to actually perform its requests. Here are the relevant parts for your question, with a few notes of my own added and the code examples omitted, since requests has a different API:
The PoolManager class automatically handles creating ConnectionPool instances for each host as needed. By default, it will keep a maximum of 10 ConnectionPool instances [Note: That's pool_connections in requests.adapters.HTTPAdapter(), and it has the same default value of 10]. If you're making requests to many different hosts it might improve performance to increase this number. However, keep in mind that this does increase memory and socket consumption.
Similarly, the ConnectionPool class keeps a pool of individual HTTPConnection instances. These connections are used during an individual request and returned to the pool when the request is complete. By default only one connection will be saved for re-use [Note: That's pool_maxsize in HTTPAdapter(), and requests changes the default value from 1 to 10]. If you are making many requests to the same host simultaneously it might improve performance to increase this number.
The behavior of the pooling for ConnectionPool is different from PoolManager. By default, if a new request is made and there is no free connection in the pool then a new connection will be created. However, this connection will not be saved if more than maxsize connections exist. This means that maxsize does not determine the maximum number of connections that can be open to a particular host, just the maximum number of connections to keep in the pool. However, if you specify block=True [Note: Available as pool_block in HTTPAdapter()] then there can be at most maxsize connections open to a particular host.
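The pooling rules quoted above can be illustrated with a small toy model. This is only a sketch of the described behavior, not urllib3's actual implementation: the ToyPool class, its counters, and the "conn-N" strings standing in for connections are all made up for illustration.

```python
import queue


class ToyPool:
    """Toy model of the pooling rules quoted above (NOT urllib3's real code)."""

    def __init__(self, maxsize=1, block=False):
        self.block = block
        self.created = 0    # connections opened so far
        self.discarded = 0  # connections dropped because the pool was full
        self._slots = queue.LifoQueue(maxsize)
        # Pre-fill with empty slots; a slot holding None means
        # "no idle connection here yet, open a new one".
        for _ in range(maxsize):
            self._slots.put(None)

    def get(self, timeout=None):
        """Check out a connection, opening a new one when none is idle."""
        try:
            conn = self._slots.get(block=self.block, timeout=timeout)
        except queue.Empty:
            if self.block:
                # block=True: never exceed maxsize open connections.
                raise RuntimeError("pool exhausted: all connections are in use")
            conn = None  # block=False: fall through and open an extra one
        if conn is None:
            self.created += 1
            conn = f"conn-{self.created}"
        return conn

    def put(self, conn):
        """Return a connection; it is discarded if the pool is already full."""
        try:
            self._slots.put(conn, block=False)
        except queue.Full:
            self.discarded += 1


# 100 simultaneous "requests" against a pool that keeps at most 10 connections.
pool = ToyPool(maxsize=10)
conns = [pool.get() for _ in range(100)]
for conn in conns:
    pool.put(conn)
print(pool.created, pool.discarded)  # 100 90
```

With block=False all 100 connections are opened, but only 10 survive for reuse. With block=True the third get() on a ToyPool(maxsize=2, block=True) would wait (or fail after the timeout) instead of opening a third connection.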
Given that, here's what happened in your case:
- All pools mentioned are CLIENT pools. You (or requests) have no control over any server connection pools.
- That warning is about HttpConnectionPool, i.e., the number of simultaneous connections made to the same host, so you could increase pool_maxsize to match the number of workers/threads you're using to get rid of the warning.
- Note that requests is already opening as many simultaneous connections as you ask for, regardless of pool_maxsize. If you have 100 threads, it will open 100 connections. But with the default value only 10 of them will be kept in the pool for later reuse, and 90 will be discarded after completing the request.
- Thus, a larger pool_maxsize increases performance to a single host by reusing connections, not by increasing concurrency.
- If you're dealing with multiple hosts, then you might change pool_connections instead. The default is 10 already, so if all your requests are to the same target host, increasing it will not have any effect on performance (but it will increase the resources used, as noted in the documentation quoted above).
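Putting the points above together, here is a minimal sketch of a session whose pool_maxsize matches the number of worker threads. The helper names build_session and fetch_all are hypothetical, and the sketch assumes that sharing one Session across threads is acceptable for your use case and that all URLs are absolute.

```python
from concurrent.futures import ThreadPoolExecutor

import requests
from requests.adapters import HTTPAdapter


def build_session(workers):
    """Return a Session whose per-host pool can keep one connection per worker."""
    session = requests.Session()
    # pool_connections stays at its default of 10 (number of hosts cached);
    # pool_maxsize is raised so no connection is discarded after use.
    adapter = HTTPAdapter(pool_connections=10, pool_maxsize=workers)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def fetch_all(session, urls, workers):
    """Fetch the URLs concurrently and return their status codes in order."""
    def status(url):
        return session.get(url, timeout=10).status_code

    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(status, urls))
```

For example, fetch_all(build_session(100), urls, 100) would run 100 threads against the hosts in urls while keeping up to 100 connections per host for reuse, which should make the "pool is full" warning go away.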