Why use async requests instead of using a larger threadpool?
Solution 1:
This is a very good question, and understanding it is key to understanding why asynchronous IO is so important. The async/await feature was added in C# 5.0 to simplify writing asynchronous code. Support for asynchronous processing on the server is not new, however; it has existed since ASP.NET 2.0.
As Steve showed you, with synchronous processing, each request in ASP.NET (and WCF) takes one thread from the thread pool. The issue he demoed is a well-known one called "thread pool starvation". If you perform synchronous IO on your server, the thread pool thread remains blocked (doing nothing) for the duration of the IO. Since the number of threads in the thread pool is limited, under load this can lead to a situation where all the thread pool threads are blocked waiting for IO and requests start being queued, which increases response time. Since all the threads are waiting for IO to complete, you will see CPU utilization close to 0% (even though response times go through the roof).
What you are asking (why can't we just use a bigger thread pool?) is a fair question. As a matter of fact, this is how most people have been solving the problem of thread pool starvation until now: just put more threads in the thread pool. Some Microsoft documentation even recommends this as a fix for situations where thread pool starvation may occur. This is an acceptable solution, and until C# 5.0 it was much easier to do that than to rewrite your code to be fully asynchronous.
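For reference, here is a minimal sketch of what "a bigger thread pool" can look like in code, assuming you resize the pool programmatically at startup with ThreadPool.SetMinThreads and ThreadPool.SetMaxThreads. The class name, method name, and the values you pass are illustrative assumptions, and in classic ASP.NET the limits are more commonly set through configuration, which is not shown here.

```csharp
using System;
using System.Threading;

public static class ThreadPoolSizing
{
    // Hypothetical helper: changes the worker-thread limits and leaves the
    // I/O completion-port thread limits as they currently are.
    // Returns false if the runtime rejected either value.
    public static bool Resize(int minWorkers, int maxWorkers)
    {
        ThreadPool.GetMinThreads(out _, out int minIo);
        ThreadPool.GetMaxThreads(out _, out int maxIo);

        // Set the maximum first so that the new minimum does not exceed it.
        bool maxOk = ThreadPool.SetMaxThreads(maxWorkers, maxIo);
        bool minOk = ThreadPool.SetMinThreads(minWorkers, minIo);
        return maxOk && minOk;
    }
}
```

A call such as ThreadPoolSizing.Resize(200, 1000) would correspond to the kind of tuning described above. Whatever numbers you choose are a guess about future IO latency, which is exactly the weakness of this approach.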
There are a few problems with the approach though:
There is no value that works in all situations: the number of thread pool threads you need grows linearly with both the duration of the IO and the load on your server. Unfortunately, IO latency is mostly unpredictable. Here is an example: let's say your ASP.NET application makes HTTP requests to a third-party web service, and they take about 2 seconds to complete. You encounter thread pool starvation, so you decide to increase the thread pool size to, let's say, 200 threads, and it starts working fine again. The problem is that maybe next week the web service will have technical problems that increase its response time to 10 seconds. All of a sudden, thread pool starvation is back: threads are blocked 5 times longer, so you now need 5 times as many, that is, 1,000 threads.
Scalability and performance: The second problem is that you still use one thread per request. Threads are an expensive resource. Each managed thread in .NET reserves 1 MB of memory for its stack by default. For a web page making IO that lasts 5 seconds, under a load of 500 requests per second, you need 2,500 threads in your thread pool, which means 2.5 GB of memory for the stacks of threads that sit doing nothing. Then there is context switching, which takes a heavy toll on the performance of the machine (affecting all the services on it, not just your web application). Even though Windows does a fairly good job of ignoring waiting threads, it is not designed to handle such a large number of threads. Remember that the highest efficiency is obtained when the number of running threads equals the number of logical CPUs on the machine (usually not more than 16).
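The arithmetic behind both problems can be written down as a small sketch. It assumes steady load, where the number of threads blocked at any moment is roughly the request rate multiplied by the time each request spends waiting on IO. The class and method names are made up for illustration, and the 1 MB per thread is the default stack figure used above.

```csharp
using System;

public static class ThreadEstimate
{
    // Threads blocked at any instant = arrival rate x time each request is blocked.
    public static int BlockedThreads(double requestsPerSecond, double ioSeconds)
    {
        return (int)Math.Ceiling(requestsPerSecond * ioSeconds);
    }

    // Stack memory in MB, assuming 1 MB per thread unless told otherwise.
    public static int StackMegabytes(int threads, int stackMbPerThread = 1)
    {
        return threads * stackMbPerThread;
    }
}
```

With the numbers above, ThreadEstimate.BlockedThreads(500, 5) gives 2,500 threads and ThreadEstimate.StackMegabytes(2500) gives 2,500 MB, or about 2.5 GB. For the first problem, if 200 threads were enough at 2 seconds of latency, the implied load is around 100 requests per second, and BlockedThreads(100, 10) gives the 1,000 threads needed once latency rises to 10 seconds.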
So increasing the size of the thread pool is a solution, and people have been doing it for a decade (even in Microsoft's own products). It is just less scalable and less efficient in terms of memory and CPU usage, and you are always at the mercy of a sudden increase in IO latency that would cause starvation. Up until C# 5.0, the complexity of asynchronous code wasn't worth the trouble for many people. async/await changes everything: you can now benefit from the scalability of asynchronous IO and write simple code at the same time.
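To make the difference concrete, here is a minimal sketch of the same remote call written both ways. The class name, method names, and the url parameter are placeholders I made up. It uses WebClient.DownloadString for the blocking version and HttpClient.GetStringAsync for the asynchronous one; your own code may use different APIs.

```csharp
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class RemoteCall
{
    // A single shared HttpClient for the asynchronous version.
    private static readonly HttpClient Http = new HttpClient();

    // Synchronous: the calling thread is blocked until the whole response has arrived.
    // (WebClient is marked obsolete in recent .NET versions but still shows the idea.)
    public static string FetchBlocking(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }

    // Asynchronous: if the response is not ready at the await, the thread is
    // released, and the method resumes when the response is available.
    public static async Task<string> FetchAsync(string url)
    {
        return await Http.GetStringAsync(url);
    }
}
```

The two methods read almost the same, which is the point: the asynchronous version no longer costs extra complexity, yet it does not hold a thread for the duration of the call.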
More details: http://msdn.microsoft.com/en-us/library/ff647787.aspx "Use asynchronous calls to invoke Web services or remote objects when there is an opportunity to perform additional parallel processing while the Web service call proceeds. Where possible, avoid synchronous (blocking) calls to Web services because outgoing Web service calls are made by using threads from the ASP.NET thread pool. Blocking calls reduce the number of available threads for processing other incoming requests."
Solution 2:
- Async/await is not based on threads; it is based on asynchronous processing. When you do an asynchronous wait in ASP.NET, the request thread is returned to the thread pool, so there are no threads servicing that request until the async operation completes. Since request overhead is lower than thread overhead, this means async/await can scale better than the thread pool.
- The request has a count of outstanding asynchronous operations. This count is managed by the ASP.NET implementation of SynchronizationContext. You can read more about SynchronizationContext in my MSDN article - it covers how ASP.NET's SynchronizationContext works and how await uses SynchronizationContext.
ASP.NET asynchronous processing was possible before async/await - you could use async pages, and use EAP components such as WebClient (the Event-based Asynchronous Pattern is a style of asynchronous programming based on SynchronizationContext). Async/await also uses SynchronizationContext, but has a much easier syntax.
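As a rough illustration of the syntax difference, the sketch below downloads a string with WebClient in both styles: the EAP way (DownloadStringAsync plus the DownloadStringCompleted event) and the async/await way (DownloadStringTaskAsync). The wrapper class, method names, and callback parameters are hypothetical and only there to keep the example self-contained.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;

public static class DownloadStyles
{
    // EAP style: start the operation, then handle the outcome in a completion event.
    public static void DownloadWithEvents(Uri address, Action<string> onDone, Action<Exception> onError)
    {
        var client = new WebClient();
        client.DownloadStringCompleted += (sender, e) =>
        {
            try
            {
                if (e.Error != null) onError(e.Error);
                else if (!e.Cancelled) onDone(e.Result);
            }
            finally
            {
                client.Dispose();
            }
        };
        client.DownloadStringAsync(address);
    }

    // async/await style: the same operation written as straight-line code.
    public static async Task<string> DownloadWithAwaitAsync(Uri address)
    {
        using (var client = new WebClient())
        {
            return await client.DownloadStringTaskAsync(address);
        }
    }
}
```

In the EAP version, the result, the error, and the cleanup all have to be routed through the event handler. In the await version, errors surface as ordinary exceptions and the using block handles cleanup.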
Solution 3:
Imagine the thread pool as a set of workers that you have employed to do your work. Your workers run fast CPU instructions for your code.
Now your work happens to depend on another slow guy's work; the slow guy being the disk or the network. For instance, your work can have two parts, one part that has to execute before the slow guy's work, and one part that has to execute after the slow guy's work.
How would you advise your workers to do your work? Would you say to each worker, "Do this first part, then wait until that slow guy is done, and then do your second part"? Would you increase the number of your workers because all of them seem to be waiting for that slow guy and you are not able to satisfy new customers? No!
You would instead ask each worker to do the first part and ask the slow guy to come back and drop a message in a queue when done. You would tell each worker (or perhaps a dedicated subset of workers) to look for done messages in the queue and do the second part of the work.
The smart kernel you are alluding to above is the operating system's ability to maintain such a queue for slow disk and network IO completion messages.
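The analogy can be tried out with a small sketch. It is only a simulation: Task.Delay stands in for the slow guy (it uses a timer, so no thread is blocked while waiting), and the class name, method names, and the work done in each part are invented for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class SlowGuyDemo
{
    // One unit of work: a quick first part, a slow wait, then a quick second part.
    private static async Task<int> HandleRequestAsync(int id, int ioMilliseconds)
    {
        int firstPart = id * 2;               // part one: quick CPU work
        await Task.Delay(ioMilliseconds);     // the "slow guy"; no worker waits here
        return firstPart + 1;                 // part two: runs after completion is signalled
    }

    // Starts all requests at once and returns the total elapsed time.
    public static async Task<TimeSpan> RunAsync(int requests, int ioMilliseconds)
    {
        var stopwatch = Stopwatch.StartNew();
        await Task.WhenAll(
            Enumerable.Range(0, requests).Select(i => HandleRequestAsync(i, ioMilliseconds)));
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Calling SlowGuyDemo.RunAsync(1000, 200) from a console program should take on the order of one 200 ms wait rather than a thousand of them, without enlarging the thread pool, because no worker is parked while the simulated IO is in flight.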