Why did the author use EntityUtils.consume(httpEntity);?

I've come across EntityUtils.consume(httpEntity); and I'm not sure what it really does.

For example:

try {

    //... some code

    HttpEntity httpEntity = httpResponse.getEntity();
    BufferedReader br = new BufferedReader(new InputStreamReader(httpEntity.getContent()));
    String line;
    while ((line = br.readLine())!= null) {
        System.out.println(line);
    }
    EntityUtils.consume(httpEntity);
} catch (Exception e) {
    //code
} finally { 
    httpClient.getConnectionManager().shutdown();
}

Why did the author put in EntityUtils.consume(httpEntity); when the finally block will close the connection and the garbage collector will take care of httpEntity?


Solution 1:

It really boils down to being a "good citizen" (and really knowing the contracts of the HttpClient interfaces). EntityUtils.consume releases all resources held by the httpEntity, which essentially means closing any underlying stream and giving the connection back to its pool (if your connection manager is a multithreaded one) or freeing the connection manager so that it can process the next request.

If you do not consume the entity, what happens really depends on what "shutting down the connection manager" means in the finally clause. Will it close pending streams and connections that have not been returned to the pool? I'm not sure it is contractually required to (although implementation-wise I think it does). If it does not, you may be leaking system resources (sockets etc.). What happens can also depend on a possible finalization method of the entity object that may (if it gets executed at all) release its resources; again, I'm not sure it is in the entity's contract to do that.

Let's suppose for a minute that the ConnectionManager actually closes all pending resources gracefully when it shuts down. Would you still need to consume the entity? I say yes, because one month from now someone will modify your code and make a second HTTP call in the same try/finally block, and may be unable to do so because you have not freed resources the way you should have (e.g. if your client uses a single-connection pool, not freeing the first connection will make the second call fail).
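To make the single-connection scenario concrete, here is a toy model using only the JDK. It is not HttpClient code: the class name SingleConnectionPoolDemo and its lease/release methods are made up for illustration, and a Semaphore with one permit stands in for a pool that holds a single connection. In this sketch the second lease simply returns false; how a real connection manager reports the failure depends on its implementation.

```java
import java.util.concurrent.Semaphore;

// Toy model only: a "pool" that owns exactly one connection permit.
// This is NOT HttpClient code; it just mimics lease/release semantics.
final class SingleConnectionPoolDemo {
    private final Semaphore permits = new Semaphore(1);

    // Try to lease the only connection; false means it is still in use.
    boolean lease() {
        return permits.tryAcquire();
    }

    // Hand the connection back, which is what consuming the entity
    // is assumed to achieve in the real client.
    void release() {
        permits.release();
    }

    public static void main(String[] args) {
        SingleConnectionPoolDemo pool = new SingleConnectionPoolDemo();

        boolean first = pool.lease();                // first request gets the connection
        boolean secondWithoutRelease = pool.lease(); // entity not consumed: nothing left to lease
        pool.release();                              // "consume" the first entity
        boolean secondAfterRelease = pool.lease();   // now the second request can proceed

        // Prints: true false true
        System.out.println(first + " " + secondWithoutRelease + " " + secondAfterRelease);
    }
}
```

The point of the sketch is only the ordering: the second lease succeeds once the first one has been released, regardless of what the final shutdown does later.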

So my point is: entities are resources, and resources should be freed when they are no longer needed. Counting on others to free them for you at a later point may hurt you in the future. The original author may have thought along those lines.

As a side note, notice that the code you wrote reads the response all the way to the end of the underlying stream, so the consume call will effectively do nothing. In my opinion that is an implementation detail, though (off the top of my head, once a response stream has been completely read, the connection object is automatically released / sent back to the pool in HttpClient). Note also that all this consume logic is abstracted away from you if you use the ResponseHandler mechanism the API offers. Finally, the API does not guarantee that response.getEntity() will never return null, so you should check for that to avoid a NullPointerException.
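The "consume does nothing after a full read" behaviour can be modelled with plain java.io classes. This is a sketch under stated assumptions, not HttpClient's actual implementation: ReleasingStream and its consume helper are hypothetical names, and "releasing the connection" is reduced to flipping a flag and bumping a counter. It also shows the null check mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Toy model only (not HttpClient's implementation): a stream that
// "releases its connection" exactly once, on EOF or on close.
final class ReleasingStream extends FilterInputStream {
    private boolean released = false;
    private int releaseCount = 0;

    ReleasingStream(InputStream in) {
        super(in);
    }

    private void releaseOnce() {
        if (!released) {
            released = true;
            releaseCount++;
        }
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b == -1) releaseOnce(); // end of stream reached: release right away
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n == -1) releaseOnce();
        return n;
    }

    @Override
    public void close() throws IOException {
        releaseOnce(); // no effect if EOF already triggered the release
        super.close();
    }

    boolean isReleased() { return released; }
    int releaseCount() { return releaseCount; }

    // Hypothetical, null-safe stand-in for a consume call.
    static void consume(InputStream content) throws IOException {
        if (content != null) content.close();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);

        ReleasingStream full = new ReleasingStream(new ByteArrayInputStream(body));
        while (full.read() != -1) { /* read to the end, like the readLine loop */ }
        System.out.println("released before consume: " + full.isReleased());
        consume(full);
        System.out.println("release count after consume: " + full.releaseCount());

        ReleasingStream partial = new ReleasingStream(new ByteArrayInputStream(body));
        partial.read(); // stop early, e.g. after an exception or a break
        System.out.println("released before consume: " + partial.isReleased());
        consume(partial);
        System.out.println("released after consume: " + partial.isReleased());

        consume(null); // a missing entity body is simply skipped
    }
}
```

In this model the consume call only matters when reading stops early, which is exactly the case you cannot rule out once the loop body may throw or someone later adds an early exit.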