How does Parallel.ForEach handle cancellation, ThrowIfCancellationRequested() and exceptions?

How Parallel.ForEach handles cancellation

Your observations are correct, and everything behaves as expected. Since the ParallelOptions.CancellationToken property is set, Parallel.ForEach throws an OperationCanceledException once CancellationToken.IsCancellationRequested evaluates to true.

All framework classes that support cancellation behave like this (e.g. Task.Run). For efficiency, the framework checks the cancellation token several times during execution, in particular before any expensive (in memory or time) resource allocation. Parallel.ForEach, for example, has to perform many of these expensive allocations because of all the thread management. Before each allocation step (e.g. initialization, spawning worker threads or forking, applying the partitioner, invoking the action) CancellationToken.IsCancellationRequested is evaluated again.

The last internal Parallel.ForEach step is joining the threads before creating the ParallelLoopResult (the return value of Parallel.ForEach). Before this operation CancellationToken.IsCancellationRequested is evaluated once more. Since you canceled the execution of Parallel.ForEach while Thread.Sleep(5000) was executing, you have to wait up to 5 seconds until the framework rechecks this property and can throw the OperationCanceledException. You can test this: with Thread.Sleep(x), it takes up to x/1000 seconds until the MessageBox shows.

Another chance to cancel the Parallel.ForEach is delegated to the consumer. It is very likely that the consumer's action is long running and therefore requires cancellation before the end of the Parallel.ForEach is reached. As you know, the premature cancellation can be forced by (repeatedly) invoking CancellationToken.ThrowIfCancellationRequested(), which this time will make the CancellationToken throw the OperationCanceledException (and not the Parallel.ForEach).
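To illustrate the consumer-side check, here is a minimal sketch of an action that does its work in small slices and polls the token between them. It assumes a console program with C# top-level statements; the helper name RunLoop, the item names and the timings are made up for the illustration and are not from your code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs a loop over dummy items and reports whether it was canceled.
static bool RunLoop(CancellationTokenSource source, int cancelAfterMs)
{
    var options = new ParallelOptions { CancellationToken = source.Token };
    source.CancelAfter(cancelAfterMs); // stands in for the click on the cancel button

    try
    {
        Parallel.ForEach(new[] { "first", "second" }, options, item =>
        {
            // Simulate about one second of work in 20 ms slices
            // and poll the token after every slice.
            for (int i = 0; i < 50; i++)
            {
                Thread.Sleep(20);
                options.CancellationToken.ThrowIfCancellationRequested();
            }
        });
        return false; // loop ran to completion
    }
    catch (OperationCanceledException)
    {
        return true; // cancellation was observed
    }
}

using var demoSource = new CancellationTokenSource();
Console.WriteLine(RunLoop(demoSource, 100) ? "Canceled" : "Completed");
```

Because the action polls every 20 ms instead of once after a 5 second sleep, the loop reacts to the cancellation request almost immediately.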

To answer your last question, why you only see one MessageBox: in your special case you already noticed that you are too slow to click the cancel button before the code reaches CancellationToken.ThrowIfCancellationRequested(), but able to click it before the thread wakes up from sleep. Therefore Parallel.ForEach throws the exception (before joining the threads and creating the ParallelLoopResult). So one exception is thrown. But even if you are fast enough to cancel the loop before CancellationToken.ThrowIfCancellationRequested() is reached, there would still be only one MessageBox, since the loop stops all worker threads as soon as an uncaught exception is thrown. To allow each thread to throw an exception, you must catch each one and accumulate them, before throwing them wrapped in an AggregateException. See: Microsoft Docs: How to Handle Exceptions in Parallel Loops for more details.
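The accumulate-and-rethrow approach could look roughly like the following sketch, which follows the pattern from the linked Microsoft Docs article. The method name ProcessAll, the sample data and the "below the limit" rule are hypothetical and only serve the illustration (again assuming C# top-level statements).

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical worker: every value below 3 is treated as a failure.
static void ProcessAll(int[] data)
{
    // Thread-safe collection, because several iterations may fail concurrently.
    var exceptions = new ConcurrentQueue<Exception>();

    Parallel.ForEach(data, value =>
    {
        try
        {
            if (value < 3)
                throw new ArgumentException($"Value {value} is below the limit.");
        }
        catch (Exception e)
        {
            // Store the exception and let the loop continue with the other items.
            exceptions.Enqueue(e);
        }
    });

    // Throw everything that was collected, wrapped in a single AggregateException.
    if (!exceptions.IsEmpty)
        throw new AggregateException(exceptions);
}

try
{
    ProcessAll(new[] { 0, 1, 2, 3, 4 });
}
catch (AggregateException ae)
{
    foreach (var inner in ae.InnerExceptions)
        Console.WriteLine(inner.Message);
}
```

Since every iteration catches its own exception, no exception escapes the action and the loop is never stopped early; all failures are reported together once the loop has finished.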


Edit to answer follow-up question:

For Q2, I just realized each thread has its own stack, so it won't know that it is surrounded by a try catch block, that's why there is only one exception(thrown by primary thread), is my understanding correct?

You are right that each thread has its own dedicated call stack. But when you write code that is supposed to execute concurrently, a copy of all locals is created on the heap for each thread. This is also true for try-catch blocks. catch instructs the compiler to define a handler (instruction pointer), which the try instruction then registers in an exception handler table. The table is managed by the OS. The exception table maps each handler to an exception, and each exception is mapped to a call stack. So exceptions and catch handlers are bound to one specific call stack. Since the handler has access to thread-local memory, it must be a copy as well. This means each thread is 'aware' of its catch handlers.

Due to the dedicated call stacks and the exclusive mapping of exception to call stack and of catch handler to exception (and thus also to the call stack), an exception thrown in a thread's scope (call stack) can't be caught outside the scope of that thread (when using Thread). Scope here means the address space that is described by the call stack (with its call frames). If it is not caught in the thread itself, it will crash the application. A Task, in contrast, captures all exceptions thrown by its delegate: Task.Wait rethrows them wrapped in an AggregateException, while await rethrows the first captured exception directly.

An exception thrown by DoParallel() will not be caught:

try 
{
  Thread thread = new Thread(() => DoParallel());
  thread.Start();
}
catch (Exception ex)
{
  // Unreachable code
}

But in the following two examples, both catch handlers are invoked to handle the exception:

try 
{
  await Task.Run(() => DoParallel());
}
catch (Exception ex)
{
  // Reachable code. Note: await rethrows the original exception,
  // not an AggregateException wrapper (unlike Task.Wait below)
}

or

try 
{
  var task = new Task(() => DoParallel());
  task.Start();
  task.Wait();
}
catch (AggregateException ex)
{
  // Reachable code
}

The last two examples use the Task Parallel Library (TPL). A Task catches any exception thrown by its delegate, stores it on the Task object, and rethrows it on the thread that waits for the task, which is how exceptions propagate between threads. Since Parallel.ForEach uses Task.Wait() (TPL), it is able to catch the worker thread's exception (if you didn't already catch it inside your action), to perform some cleanup (canceling the other worker threads and disposing internal resources), and finally to propagate the OperationCanceledException to the outer scope.
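If you want to see which exception type actually arrives in the catch handler, a small experiment like the following sketch helps. DoWork, CaughtByWait and CaughtByAwait are hypothetical names introduced for the illustration, and the code assumes C# top-level statements.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for DoParallel(): it always fails.
static void DoWork() => throw new InvalidOperationException("boom");

// Blocking wait: Task.Wait wraps the task's exceptions in an AggregateException.
static string CaughtByWait()
{
    try
    {
        Task.Run(() => DoWork()).Wait();
        return "none";
    }
    catch (Exception ex)
    {
        return ex.GetType().Name;
    }
}

// await: the first captured exception is rethrown as-is, without the wrapper.
static async Task<string> CaughtByAwait()
{
    try
    {
        await Task.Run(() => DoWork());
        return "none";
    }
    catch (Exception ex)
    {
        return ex.GetType().Name;
    }
}

Console.WriteLine(CaughtByWait());        // AggregateException
Console.WriteLine(await CaughtByAwait()); // InvalidOperationException
```

In both cases the exception is thrown on a pool thread but handled on the waiting thread, which is exactly what a plain Thread cannot do for you.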

So because an exception is thrown,

  • The OS interrupts the application and checks the exception table for a potential handler that was mapped to this thread by the try directive.
  • It finds one and reconstructs the context to execute the catch handler (in your case, the next catch handler is the internal handler of Parallel.ForEach). The application is still halted and the other threads are still parked.
  • This Parallel.ForEach handler performs the cleanup and ends the other threads before the application continues, and therefore before any of the worker threads can throw additional exceptions themselves.
  • The application continues by executing the re-throw of the Parallel.ForEach catch handler.
  • The application halts again and looks for a catch handler in an outer scope (the consumer scope of Parallel.ForEach).
  • If none was registered using try, the application terminates with an error.

That's why there is always one exception thrown by Parallel.ForEach.


Edit to answer follow-up question Q3:

now I can press cancel button before worker thread reaches to ThrowIfCancellationRequested(), but I still get only one exception thrown by the primary thread. But I pressed the cancel button, token has been set to cancel, so when the secondary worker thread reaches to parOpts.CancellationToken.ThrowIfCancellationRequested();, shouldn't it throw an exception too? and this exception cannot be handled by the try catch in the primary thread (each thread has its own stack), so I should get an unhandled exception to halt the application, but it wasn't, I just get one exception thrown by primary thread, and is this exception thrown by primary thread or worker thread

for the following scenario:

try
{
   Parallel.ForEach(files, parOpts, currentFile =>
   {
      Thread.Sleep(5000);
      parOpts.CancellationToken.ThrowIfCancellationRequested(); 

   });
}
catch (OperationCanceledException ex)
{ 
   MessageBox.Show("Caught");
}

Since in this scenario you are able to cancel the Parallel.ForEach before it completes, the exception is generated on the worker thread (that executes your action delegate), the moment CancellationToken.ThrowIfCancellationRequested() is executed. Under the hood the CancellationToken.ThrowIfCancellationRequested() method simply looks like:

public void ThrowIfCancellationRequested()
{
  if (IsCancellationRequested) 
    ThrowOperationCanceledException();
}

// Throws an OCE; separated out to enable better inlining of ThrowIfCancellationRequested
private void ThrowOperationCanceledException()
{
  throw new OperationCanceledException(Environment.GetResourceString("OperationCanceled"), this);
}

As mentioned before, Parallel.ForEach uses Task and Task.Wait() (TPL) to manage threads. With the TPL, an exception thrown on a worker thread is captured by the task and is no longer confined to that thread (in contrast to Thread threads). This allows Parallel.ForEach to catch exceptions thrown by child threads.

This means, there are no unhandled exceptions inside the Parallel.ForEach, since, as you can read in the step-by-step explanation of the exception flow, Parallel.ForEach internally catches all exceptions (possible due to TPL) to do the clean up and disposal of allocated resources and then finally to re-throw the OperationCanceledException.

When checking the exception's call stack of your Q3 code example, you will see that the origin is the worker thread and not the 'primary' Parallel.ForEach thread. You just caught the exception in the primary thread, since it contains the catch handler closest to the origin - the worker thread. Because of this, the primary thread can complete without cancellation.


Parallel.ForEach and threads

I think your understanding is wrong:

...the primary thread is also executing the statements in Parallel.ForEach, isn't it? I have a typo in the post, there is only two active threads, not three. the string[] just have two elements, so primary thread takes "first" to process and one worker thread takes "two" to process...

This is not true. To make it clear: the array in your initial example contains two strings that are supposed to simulate the workload, right? The primary thread is the thread you've created to execute the Parallel.ForEach loop using Task.Factory.StartNew(() => ProcessFiles());. This is a common practice to keep the UI thread responsive during a long-running Parallel.ForEach. Parallel.ForEach therefore executes on the primary thread and might create two worker threads, one for each load (or string). "Might", because Parallel.ForEach actually uses tasks, which are backed by threads. The maximum thread count is limited by the processor count and the TaskScheduler. Due to performance optimizations performed by the framework, the actual number of tasks need not match the number of iterated items or the value of MaxDegreeOfParallelism.

The Parallel.ForEach method may use more tasks than threads over the lifetime of its execution, as existing tasks complete and are replaced by new tasks. This gives the underlying TaskScheduler object the chance to add, change, or remove threads that service the loop. (source: Microsoft Docs: Parallel.ForEach) The scheduler may therefore decide to execute the action delegates on fewer threads than MaxDegreeOfParallelism allows.
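You can observe this yourself with a small measurement like the sketch below. It records how many action delegates run at the same time and which managed threads served the loop. The helper name Measure, the item count and the sleep duration are assumptions made for the illustration (C# top-level statements again).

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: returns the peak number of concurrently running
// action delegates and the number of distinct threads that served the loop.
static (int peak, int distinctThreads) Measure(int maxDegree)
{
    var gate = new object();
    int current = 0, peak = 0;
    var threadIds = new ConcurrentDictionary<int, bool>();
    var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

    Parallel.ForEach(Enumerable.Range(0, 50), options, _ =>
    {
        lock (gate)
        {
            current++;
            if (current > peak) peak = current; // remember the highest concurrency seen
        }

        threadIds.TryAdd(Environment.CurrentManagedThreadId, true);
        Thread.Sleep(5); // simulated work

        lock (gate) { current--; }
    });

    return (peak, threadIds.Count);
}

var result = Measure(2);
Console.WriteLine($"Peak concurrency: {result.peak}, distinct threads: {result.distinctThreads}");
```

The peak concurrency never exceeds MaxDegreeOfParallelism, but it may be lower, and the number of distinct threads depends on how the TaskScheduler replaces tasks during the run.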


To generalize it and to sum it up

Assuming that the ParallelOptions.CancellationToken property is set, there are two possible scenarios:

First scenario: you invoked CancellationToken.ThrowIfCancellationRequested() in your action delegate after cancellation was requested, but before Parallel.ForEach internally evaluated CancellationToken.IsCancellationRequested. If you surround your action code with a try-catch, no exception leaves the worker thread. If there is no such try-catch, Parallel.ForEach internally catches this exception (to do some cleanup). This happens on the primary thread. The exception is then re-thrown after Parallel.ForEach has disposed the allocated resources. Because you invoked CancellationToken.ThrowIfCancellationRequested() on the worker, the origin is still this worker thread.

Second scenario: you don't explicitly call CancellationToken.ThrowIfCancellationRequested() in your action delegate, or the cancellation occurred after CancellationToken.ThrowIfCancellationRequested() was invoked. Then the next time Parallel.ForEach internally checks CancellationToken.IsCancellationRequested, the exception is thrown by Parallel.ForEach itself. Parallel.ForEach always evaluates CancellationToken.IsCancellationRequested before allocating any resources. Since Parallel.ForEach executes on the primary thread, the origin of this exception is of course the primary thread.

When the ParallelOptions.CancellationToken property is not set, the internal Parallel.ForEach evaluations of CancellationToken.IsCancellationRequested do not occur. When cancellation is requested via CancellationTokenSource.Cancel(), Parallel.ForEach cannot react and continues its resource-intensive work, unless an exception is thrown by an invocation of CancellationToken.ThrowIfCancellationRequested(). Besides a cancellation request, any exception can stop the execution of Parallel.ForEach at any time.
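The difference between a loop that knows the token and one that doesn't can be sketched as follows. The action never polls the token itself, so any reaction to the cancellation request comes from Parallel.ForEach alone. The helper name Run, the item count and the timings are hypothetical (C# top-level statements assumed).

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: the action never calls ThrowIfCancellationRequested().
static (bool canceled, int processed) Run(bool passTokenToLoop)
{
    using var cts = new CancellationTokenSource();
    var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };
    if (passTokenToLoop)
        options.CancellationToken = cts.Token; // only now can the loop observe the request

    int processed = 0;
    cts.CancelAfter(100); // request cancellation while the loop is still busy

    try
    {
        // 40 items, 20 ms each, at most 2 at a time: roughly 400 ms in total
        Parallel.ForEach(Enumerable.Range(0, 40), options, _ =>
        {
            Thread.Sleep(20);
            Interlocked.Increment(ref processed);
        });
        return (false, processed);
    }
    catch (OperationCanceledException)
    {
        return (true, processed);
    }
}

var withToken = Run(passTokenToLoop: true);
Console.WriteLine($"Token set:     canceled={withToken.canceled}, processed={withToken.processed}");

var withoutToken = Run(passTokenToLoop: false);
Console.WriteLine($"Token not set: canceled={withoutToken.canceled}, processed={withoutToken.processed}");
```

With the token set, the loop stops scheduling new iterations and throws the OperationCanceledException on the calling thread. Without it, all 40 items are processed although cancellation was requested.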