Why are web apps going crazy with await / async nowadays? [closed]

Solution 1:

I get what the pattern is for... to run long-running tasks in a separate thread.

This is absolutely not what this pattern is for.

Await does not put the operation on a new thread. Make sure that is very clear to you. Await schedules the remaining work as the continuation of the high latency operation.

Await does not make a synchronous operation into an asynchronous concurrent operation. Await enables programmers who are working with a model that is already asynchronous to write their logic to resemble synchronous workflows. Await neither creates nor destroys asynchrony; it manages existing asynchrony.

Spinning up a new thread is like hiring a worker. When you await a task, you are not hiring a worker to do that task. You are asking "is this task already done? If not, call me back when it's done so I can keep doing work that depends on that task. In the meanwhile, I'm going to go work on this other thing over here..."

If you're doing your taxes and you find you need a number from your work, and the mail hasn't arrived yet, you don't hire a worker to wait by the mailbox. You make a note of where you were in your taxes, go get other stuff done, and when the mail comes, you pick up where you left off. That's await. It's asynchronously waiting for a result.
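To put that analogy in code, here is a minimal sketch; GetNumberFromWorkAsync, DoOtherChores and FinishTaxes are hypothetical stand-ins for the mail, the other stuff, and the taxes:

async Task DoTaxesAsync()
{
    // The "mail" is on its way; nobody is hired to wait by the mailbox.
    Task<int> numberTask = GetNumberFromWorkAsync();

    // Get other stuff done in the meanwhile.
    DoOtherChores();

    // The mail has arrived (or we asynchronously wait until it does);
    // pick up the taxes where you left off.
    int number = await numberTask;
    FinishTaxes(number);
}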

Is this excessive use of await / async something you need for web dev or for something like Angular?

It's to manage latency.

How is making every single line async going to improve performance?

In two ways. First, by ensuring that applications remain responsive in a world with high-latency operations. That kind of performance is important to users who don't want their apps to hang. Second, by providing developers with tools for expressing the data dependency relationships in asynchronous workflows. By not blocking on high-latency operations, system resources are freed up to work on unblocked operations.

To me, it'll kill performance from spinning up all those threads, no?

There are no threads. Concurrency is a mechanism for achieving asynchrony; it is not the only one.

Ok, so if I write code like: await someMethod1(); await someMethod2(); await someMethod3(); that is magically going to make the app more responsive?

More responsive compared to what? Compared to calling those methods without awaiting them? No, of course not. Compared to synchronously waiting for the tasks to complete? Absolutely, yes.
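To make that comparison concrete, here is a hedged sketch; GetReportAsync is a hypothetical Task-returning method:

// Synchronously waiting: the calling thread is blocked until the task completes.
// On a UI thread the app hangs; on a server the thread serves no one else.
var report = GetReportAsync().Result;

versus

// Asynchronously waiting: the thread is released to do other work, and execution
// resumes here once the task completes.
var report = await GetReportAsync();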

That's what I'm not getting, I guess. If you awaited on all 3 at the end, then yeah, you're running the 3 methods in parallel.

No no no. Stop thinking about parallelism. There need not be any parallelism.

Think about it this way. You wish to make a fried egg sandwich. You have the following tasks:

  • Fry an egg
  • Toast some bread
  • Assemble a sandwich

Three tasks. The third task depends on the results of the first two, but the first two tasks do not depend on each other. So, here are some workflows:

  • Put an egg in the pan. While the egg is frying, stare at the egg.
  • Once the egg is done, put some toast in the toaster. Stare at the toaster.
  • Once the toast is done, put the egg on the toast.

The problem is that you could be putting the toast in the toaster while the egg is cooking. Alternative workflow:

  • Put an egg in the pan. Set an alarm that rings when the egg is done.
  • Put toast in the toaster. Set an alarm that rings when the toast is done.
  • Check your mail. Do your taxes. Polish the silverware. Whatever it is you need to do.
  • When both alarms have rung, grab the egg and the toast, put them together, and you have a sandwich.

Do you see why the asynchronous workflow is far more efficient? You get lots of stuff done while you're waiting for the high latency operation to complete. But you did not hire an egg chef and a toast chef. There are no new threads!

The workflow I proposed would be:

var eggtask = FryEggAsync();       // start frying; we get a task back immediately
var toasttask = MakeToastAsync();  // start toasting while the egg is still frying
var egg = await eggtask;           // asynchronously wait for the egg
var toast = await toasttask;       // asynchronously wait for the toast
return MakeSandwich(egg, toast);   // both are ready; assemble the sandwich

Now, compare that to:

var eggtask = FryEggAsync();       // start frying
var egg = await eggtask;           // wait for the egg before even starting the toast
var toasttask = MakeToastAsync();  // only now start toasting
var toast = await toasttask;       // wait for the toast
return MakeSandwich(egg, toast);

Do you see how that workflow differs? This workflow is:

  • Put an egg in the pan and set an alarm.
  • Go do other work until the alarm goes off.
  • Get the egg out of the pan; put the bread in the toaster. Set an alarm...
  • Go do other work until the alarm goes off.
  • When the alarm goes off, assemble the sandwich.

This workflow is less efficient because we have failed to capture the fact that the toast and egg tasks are high latency and independent. But it is surely a more efficient use of resources than doing nothing while you're waiting for the egg to cook.

The point of this whole thing is: threads are insanely expensive, so don't spin up new threads. Rather, make more efficient use of the thread you've got by putting it to work while you're doing high latency operations. Await is not about spinning up new threads; it is about getting more work done on one thread in a world with high latency computation.

Maybe that computation is being done on another thread, maybe it's blocked on disk, whatever. Doesn't matter. The point is, await is for managing that asynchrony, not creating it.

I'm having a difficult time understanding how asynchronous programming can be possible without using parallelism somewhere. Like, how do you tell the program to get started on the toast while waiting for the eggs without DoEggs() running concurrently, at least internally?

Go back to the analogy. You are making an egg sandwich, the eggs and toast are cooking, and so you start reading your mail. You get halfway through the mail when the eggs are done, so you put the mail aside and take the egg off the heat. Then you go back to the mail. Then the toast is done and you make the sandwich. Then you finish reading your mail after the sandwich is made. How did you do all that without hiring staff, one person to read the mail, one person to cook the egg, one to make the toast and one to assemble the sandwich? You did it all with a single worker.

How did you do that? By breaking tasks up into small pieces, noting which pieces have to be done in which order, and then cooperatively multitasking the pieces.

Kids today with their big flat virtual memory models and multithreaded processes think that this is how it's always been, but my memory stretches back to the days of Windows 3, which had none of that. If you wanted two things to happen "in parallel", that's what you did: you split the tasks up into small parts and took turns executing parts. The whole operating system was based on this concept.

Now, you might look at the analogy and say "OK, but some of the work, like actually toasting the toast, is being done by a machine", and that is the source of parallelism. Sure, I didn't have to hire a worker to toast the bread, but I achieved parallelism in hardware. And that is the right way to think of it. Hardware parallelism and thread parallelism are different. When you make an asynchronous request to the network subsystem to go find you a record from a database, there is no thread that is sitting there waiting for the result. The hardware achieves parallelism at a level far, far below that of operating system threads.
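For a concrete illustration (the file name is hypothetical, and the fragment is assumed to sit inside an async method), an asynchronous file read hands the request to the operating system and the disk controller; no managed thread sits blocked while the hardware does the work:

var buffer = new byte[4096];
using var stream = new FileStream("data.bin", FileMode.Open, FileAccess.Read,
    FileShare.Read, bufferSize: 4096, useAsync: true);               // request overlapped I/O
int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);    // no thread waits during the read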

If you want a more detailed explanation of how hardware works with the operating system to achieve asynchrony, read "There is no thread" by Stephen Cleary.

So when you see "async" do not think "parallel". Think "high latency operation split up into small pieces." If there are many such operations whose pieces do not depend on each other, then you can cooperatively interleave the execution of those pieces on one thread.

As you might imagine, it is very difficult to write control flows where you can abandon what you are doing right now, go do something else, and seamlessly pick up where you left off. That's why we make the compiler do that work! The point of "await" is that it lets you manage those asynchronous workflows by describing them as synchronous workflows. Everywhere that there is a point where you could put this task aside and come back to it later, write "await". The compiler will take care of turning your code into many tiny pieces that can each be scheduled in an asynchronous workflow.
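If you're curious what that compiler work looks like, here is a rough hand-written approximation of the state machine it produces for "var egg = await FryEggAsync(); ...". It is heavily simplified, the real generated code differs in many details, and Egg and FryEggAsync are the hypothetical names from the analogy:

class SandwichStateMachine
{
    private int state = 0;
    private TaskAwaiter<Egg> eggAwaiter;   // System.Runtime.CompilerServices.TaskAwaiter<T>

    public void MoveNext()
    {
        switch (state)
        {
            case 0:
                eggAwaiter = FryEggAsync().GetAwaiter();
                if (!eggAwaiter.IsCompleted)
                {
                    state = 1;
                    eggAwaiter.OnCompleted(MoveNext); // "set the alarm": run me again when the egg is done
                    return;                           // give the thread back so it can do other work
                }
                goto case 1;
            case 1:
                Egg egg = eggAwaiter.GetResult();     // resume here with the finished egg
                // ... the rest of the method becomes cases 2, 3, ... in the same way
                break;
        }
    }
}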

UPDATE:

In your last example, what would be the difference between this:

eggtask = FryEggAsync();
egg = await eggtask;
toasttask = MakeToastAsync();
toast = await toasttask;

and this?

egg = await FryEggAsync();
toast = await MakeToastAsync();

I assume it calls them synchronously but executes them asynchronously? I have to admit I've never even bothered to await the task separately before.

There is no difference.

When FryEggAsync is called, it is called regardless of whether await appears before it or not. await is an operator. It operates on the thing returned from the call to FryEggAsync. It's just like any other operator.

Let me say this again: await is an operator and its operand is a task. It is a very unusual operator, to be sure, but grammatically it is an operator, and it operates on a value just like any other operator.

Let me say it again: await is not magic dust that you put on a call site and suddenly that call site is remoted to another thread. The call happens when the call happens, the call returns a value, and that value is a reference to an object that is a legal operand to the await operator.

So yes,

var x = Foo();
var y = await x;

and

var y = await Foo();

are the same thing, the same as

var x = Foo();
var y = 1 + x;

and

var y = 1 + Foo();

are the same thing.

So let's go through this one more time, because you seem to believe the myth that await causes asynchrony. It does not.

async Task M() { 
   var eggtask = FryEggAsync(); 

Suppose M() is called. FryEggAsync is called. Synchronously. There is no such thing as an asynchronous call; you see a call, control passes to the callee until the callee returns. The callee returns a task which represents an egg to be made available in the future.

How does FryEggAsync do this? I don't know and I don't care. All I know is I call it, and I get an object back that represents a future value. Maybe that value is produced on a different thread. Maybe it is produced on this thread but in the future. Maybe it is produced by special-purpose hardware, like a disk controller or a network card. I don't care. I care that I get back a task.

  egg = await eggtask; 

Now we take that task, and await asks it "are you done?" If the answer is yes, then egg is given the value produced by the task. If the answer is no, then M() returns a Task representing "the work of M will be completed in the future". The remainder of M() is signed up as the continuation of eggtask, so when eggtask completes, it will call M() again and pick it up not from the beginning, but from the assignment to egg. M() is a method that can be resumed at any of its await points. The compiler does the necessary magic to make that happen.

So now we've returned. The thread keeps on doing whatever it does. At some point the egg is ready, so the continuation of eggtask is invoked, which causes M() to be called again. It resumes at the point where it left off: assigning the just-produced egg to egg. And now we keep on trucking:

toasttask = MakeToastAsync(); 

Again, the call returns a task, and we:

toast = await toasttask; 

check to see if the task is complete. If yes, we assign toast. If no, then we return from M() again, and the continuation of toasttask is the remainder of M().

And so on.

Eliminating the task variables does nothing germane. Storage for the values is allocated; it's just not given a name.
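For reference, here are the fragments of that walkthrough assembled into one method; a Sandwich return type is added so the result goes somewhere, and the helper methods are the same hypothetical ones as before:

async Task<Sandwich> M()
{
    var eggtask = FryEggAsync();      // called synchronously; returns a task immediately
    var egg = await eggtask;          // may suspend here; M resumes when the egg is ready
    var toasttask = MakeToastAsync();
    var toast = await toasttask;      // may suspend again; resumes when the toast is ready
    return MakeSandwich(egg, toast);  // both values in hand; finish up
}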

ANOTHER UPDATE:

is there a case to be made for calling Task-returning methods as early as possible but awaiting them as late as possible?

The example given is something like:

var task = FooAsync();
DoSomethingElse();
var foo = await task;
...

There is some case to be made for that. But let's take a step back here. The purpose of the await operator is to construct an asynchronous workflow using the coding conventions of a synchronous workflow. So the thing to think about is: what is that workflow? A workflow imposes an ordering upon a set of related tasks.

The easiest way to see the ordering required in a workflow is to examine the data dependence. You can't make the sandwich before the toast comes out of the toaster, so you're going to have to obtain the toast somewhere. Since await extracts the value from the completed task, there's got to be an await somewhere between the creation of the toaster task and the creation of the sandwich.
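In code, that constraint leaves some freedom about where each await goes, so long as it sits between starting its task and consuming its value; a hedged sketch, where DoUnrelatedPrepWork is a hypothetical method that needs neither result:

var eggtask = FryEggAsync();
var toasttask = MakeToastAsync();        // start the high-latency work early
DoUnrelatedPrepWork();                   // anything that needs neither the egg nor the toast
var egg = await eggtask;                 // each await must come before its value is needed...
var toast = await toasttask;
var sandwich = MakeSandwich(egg, toast); // ...and here both dependencies are satisfied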

You can also represent dependencies on side effects. For example, the user presses the button, so you want to play the siren sound, then wait three seconds, then open the door, then wait three seconds, then close the door:

DisableButton();
PlaySiren();
await Task.Delay(3000);
OpenDoor();
await Task.Delay(3000);
CloseDoor();
EnableButton();

It would make no sense at all to say

DisableButton();
PlaySiren();
var delay1 = Task.Delay(3000);
OpenDoor();
var delay2 = Task.Delay(3000);
CloseDoor();
EnableButton();
await delay1;
await delay2;

Because this is not the desired workflow.

So, the actual answer to your question is: deferring the await until the point where the value is actually needed is a pretty good practice, because it increases the opportunities for work to be scheduled efficiently. But you can go too far; make sure that the workflow that is implemented is the workflow you want.

Solution 2:

Generally this is because asynchronous functions play nicest with other async functions; otherwise you start losing the benefits of asynchronicity. As a result, functions calling async functions end up being async themselves, and the asynchrony spreads throughout the entire application, e.g. if you make your interactions with a data store async, then the things utilising that functionality tend to end up being made async as well.
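A hedged sketch of that spreading effect, with hypothetical LoadRecordAsync/BuildReportAsync/ShowReportAsync methods standing in for the data-store layer and its callers:

// The lowest layer does asynchronous I/O...
async Task<string> LoadRecordAsync(int id)
{
    using var client = new HttpClient();
    return await client.GetStringAsync($"https://example.com/records/{id}");
}

// ...so its caller must await, and therefore becomes async...
async Task<string> BuildReportAsync(int id)
{
    var record = await LoadRecordAsync(id);
    return $"Report: {record}";
}

// ...and so does the caller above that, all the way up.
async Task ShowReportAsync(int id)
{
    Console.WriteLine(await BuildReportAsync(id));
}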

As you convert synchronous code to asynchronous code, you’ll find that it works best if asynchronous code calls and is called by other asynchronous code—all the way down (or “up,” if you prefer). Others have also noticed the spreading behavior of asynchronous programming and have called it “contagious” or compared it to a zombie virus. Whether turtles or zombies, it’s definitely true that asynchronous code tends to drive surrounding code to also be asynchronous. This behavior is inherent in all types of asynchronous programming, not just the new async/await keywords.

Source: Async/Await - Best Practices in Asynchronous Programming