From "The Darkside", in his article Parallel Extensions to the .Net Framework, we have this Parallel Extensions version of quicksort:

(Edit: Since the link is now dead, interested readers may find an archive of it at the Wayback Machine)

private void QuicksortSequential<T>(T[] arr, int left, int right) 
where T : IComparable<T>
{
    if (right > left)
    {
        int pivot = Partition(arr, left, right);
        QuicksortSequential(arr, left, pivot - 1);
        QuicksortSequential(arr, pivot + 1, right);
    }
}

private void QuicksortParallelOptimised<T>(T[] arr, int left, int right) 
where T : IComparable<T>
{
    const int SEQUENTIAL_THRESHOLD = 2048;
    if (right > left)
    {
        if (right - left < SEQUENTIAL_THRESHOLD)
        {
            // Small range: the overhead of going parallel is not worth it.
            QuicksortSequential(arr, left, right);
        }
        else
        {
            int pivot = Partition(arr, left, right);
            // Sort both halves in parallel and wait for both to finish.
            Parallel.Invoke(
                () => QuicksortParallelOptimised(arr, left, pivot - 1),
                () => QuicksortParallelOptimised(arr, pivot + 1, right));
        }
    }
}

Notice that he reverts to a sequential sort once the number of items is less than 2048.

Update: I now achieve better than a 1.7x speedup on a dual-core machine.

I thought I would try writing a parallel sorter that works in .NET 2.0 (I think; check me on this) and that uses nothing other than the ThreadPool.

Here are the results of sorting a 2,000,000 element array:

Time Parallel    Time Sequential
-------------    ---------------
2854 ms          5052 ms
2846 ms          4947 ms
2794 ms          4940 ms
...              ...
2815 ms          4894 ms
2981 ms          4991 ms
2832 ms          5053 ms

Avg: 2818 ms     Avg: 4969 ms
Std: 66 ms       Std: 65 ms
Spd: 1.76x

I got a 1.76x speedup - pretty close to the optimal 2x I was hoping for - in this environment:

  1. 2,000,000 random Model objects
  2. Sorting objects by a comparison delegate that compares two DateTime properties.
  3. Mono JIT compiler version
  4. Mac OS X 10.5.8 on 2.4 GHz Intel Core 2 Duo
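To make item 2 concrete, here is a minimal sketch of what such a comparison could look like. The actual Model class isn't shown anywhere above, so the class body and the `Created` field name are assumptions made purely for illustration.

```csharp
using System;

// Hypothetical stand-in for the Model type used in the benchmark;
// the real class is not shown, and the field name is an assumption.
public class Model
{
    public DateTime Created;
}

public static class ModelComparisons
{
    // Orders two models by their DateTime values (earliest first).
    // The method signature matches Comparison<Model>.
    public static int ByCreated(Model x, Model y)
    {
        return x.Created.CompareTo(y.Created);
    }
}
```

A method like this can be passed anywhere a `Comparison<Model>` is expected, for example `Array.Sort(models, ModelComparisons.ByCreated)`. Calling through a delegate costs more per comparison than comparing plain ints, which is worth keeping in mind when reading the timings.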

This time I used Ben Watson's QuickSort in C#. I changed his QuickSort inner loop from:

    QuickSortSequential (beg, l - 1);
    QuickSortSequential (l + 1, end);

to:

    ManualResetEvent fin2 = new ManualResetEvent (false);
    ThreadPool.QueueUserWorkItem (delegate {
        QuickSortParallel (l + 1, end);
        fin2.Set ();
    });
    QuickSortParallel (beg, l - 1);
    fin2.WaitOne (1000000);
    fin2.Close ();

(Actually, in the code I do a little load balancing that does seem to help.)

I've found that running this parallel version only pays off when there are more than 25,000 items in an array (though a minimum of 50,000 seems to let my processor breathe more).
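For anyone who wants to play with that cutoff, here is a rough sketch of how a size check might be combined with the ThreadPool approach. This is not my actual source and it is not the load balancing mentioned earlier. The class name, the `ParallelThreshold` constant, the use of `int[]` and the fallback to `Array.Sort` are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Threading;

// Hypothetical sketch: ThreadPool-based quicksort that falls back to a
// sequential sort below an assumed cutoff. Uses only C# 2.0 features.
public static class ThreadPoolQuickSort
{
    // Assumed cutoff, taken from the 25,000-item observation.
    private const int ParallelThreshold = 25000;

    public static void Sort(int[] a)
    {
        SortParallel(a, 0, a.Length - 1);
    }

    private static void SortParallel(int[] a, int beg, int end)
    {
        if (end <= beg) return;

        if (end - beg + 1 < ParallelThreshold)
        {
            // Small range: queueing a work item costs more than it saves.
            Array.Sort(a, beg, end - beg + 1);
            return;
        }

        int l = Partition(a, beg, end);

        // Sort the right part on a pool thread and the left part on this thread.
        ManualResetEvent done = new ManualResetEvent(false);
        ThreadPool.QueueUserWorkItem(delegate
        {
            try { SortParallel(a, l + 1, end); }
            finally { done.Set(); }
        });
        SortParallel(a, beg, l - 1);
        done.WaitOne();
        done.Close();
    }

    // Partitions around the middle element and returns the pivot's final index.
    private static int Partition(int[] a, int low, int high)
    {
        Swap(a, low, low + (high - low) / 2);
        int pivot = a[low];
        int left = low;
        for (int i = low + 1; i <= high; i++)
        {
            if (a[i] < pivot)
            {
                left++;
                Swap(a, i, left);
            }
        }
        Swap(a, low, left);
        return left;
    }

    private static void Swap(int[] a, int i, int j)
    {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Call it with `ThreadPoolQuickSort.Sort(data)`. One caveat: each level of recursion blocks a thread in `WaitOne` while it waits for its sibling, so a cutoff that is too low can tie up many pool threads at once. That is one more reason to keep the threshold fairly large.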

I've made as many improvements as I can think of on my little dual-core machine. I would love to try some ideas on an 8-way monster. Also, this work was done on a little 13" MacBook running Mono. I'm curious how others fare on a normal .NET 2.0 install.

The source code in all its ugly glory is available here. I can clean it up if there's any interest.

For the record, here is a version without lambda expressions that will compile in C# 2 and .NET 2 + Parallel Extensions. This should also work with Mono, which has its own implementation of Parallel Extensions (from Google Summer of Code 2008):

/// <summary>
/// Parallel quicksort algorithm.
/// </summary>
public class ParallelSort
{
    #region Public Static Methods

    /// <summary>
    /// Sequential quicksort.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="arr"></param>
    public static void QuicksortSequential<T>(T [] arr) where T : IComparable<T>
    {
        QuicksortSequential(arr, 0, arr.Length - 1);
    }

    /// <summary>
    /// Parallel quicksort
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="arr"></param>
    public static void QuicksortParallel<T>(T[] arr) where T : IComparable<T>
    {
        QuicksortParallel(arr, 0, arr.Length - 1);
    }

    #endregion

    #region Private Static Methods

    private static void QuicksortSequential<T>(T[] arr, int left, int right) 
        where T : IComparable<T>
    {
        if (right > left)
        {
            int pivot = Partition(arr, left, right);
            QuicksortSequential(arr, left, pivot - 1);
            QuicksortSequential(arr, pivot + 1, right);
        }
    }

    private static void QuicksortParallel<T>(T[] arr, int left, int right) 
        where T : IComparable<T>
    {
        const int SEQUENTIAL_THRESHOLD = 2048;
        if (right > left)
        {
            if (right - left < SEQUENTIAL_THRESHOLD)
            {
                QuicksortSequential(arr, left, right);
            }
            else
            {
                int pivot = Partition(arr, left, right);
                Parallel.Invoke(new Action[] { delegate {QuicksortParallel(arr, left, pivot - 1);},
                                               delegate {QuicksortParallel(arr, pivot + 1, right);}
                });
            }
        }
    }

    private static void Swap<T>(T[] arr, int i, int j)
    {
        T tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }

    private static int Partition<T>(T[] arr, int low, int high) 
        where T : IComparable<T>
    {
        // Simple partitioning implementation
        int pivotPos = (high + low) / 2;
        T pivot = arr[pivotPos];
        Swap(arr, low, pivotPos);

        int left = low;
        for (int i = low + 1; i <= high; i++)
        {
            if (arr[i].CompareTo(pivot) < 0)
            {
                // Grow the "smaller than pivot" region before swapping into it.
                left++;
                Swap(arr, i, left);
            }
        }

        // Move the pivot into its final position.
        Swap(arr, low, left);
        return left;
    }

    #endregion
}
