Counting inversions in an array

I'm designing an algorithm to do the following: Given array A[1... n], for every i < j, find all inversion pairs such that A[i] > A[j]. I'm using merge sort and copying array A to array B and then comparing the two arrays, but I'm having a difficult time seeing how I can use this to find the number of inversions. Any hints or help would be greatly appreciated.


So here is O(n log n) solution in java.

long merge(int[] arr, int[] left, int[] right) {
    int i = 0, j = 0, count = 0;
    while (i < left.length || j < right.length) {
        if (i == left.length) {
            arr[i+j] = right[j];
            j++;
        } else if (j == right.length) {
            arr[i+j] = left[i];
            i++;
        } else if (left[i] <= right[j]) {
            arr[i+j] = left[i];
            i++;                
        } else {
            arr[i+j] = right[j];
            count += left.length-i;
            j++;
        }
    }
    return count;
}

long invCount(int[] arr) {
    if (arr.length < 2)
        return 0;

    int m = (arr.length + 1) / 2;
    int left[] = Arrays.copyOfRange(arr, 0, m);
    int right[] = Arrays.copyOfRange(arr, m, arr.length);

    return invCount(left) + invCount(right) + merge(arr, left, right);
}

This is almost normal merge sort, the whole magic is hidden in merge function. Note that while sorting algorithm remove inversions. While merging algorithm counts number of removed inversions (sorted out one might say).

The only moment when inversions are removed is when algorithm takes element from the right side of an array and merge it to the main array. The number of inversions removed by this operation is the number of elements left from the the left array to be merged. :)

Hope it's explanatory enough.


I've found it in O(n * log n) time by the following method.

  1. Merge sort array A and create a copy (array B)
  2. Take A[1] and find its position in sorted array B via a binary search. The number of inversions for this element will be one less than the index number of its position in B since every lower number that appears after the first element of A will be an inversion.

    2a. accumulate the number of inversions to counter variable num_inversions.

    2b. remove A[1] from array A and also from its corresponding position in array B

  3. rerun from step 2 until there are no more elements in A.

Here’s an example run of this algorithm. Original array A = (6, 9, 1, 14, 8, 12, 3, 2)

1: Merge sort and copy to array B

B = (1, 2, 3, 6, 8, 9, 12, 14)

2: Take A[1] and binary search to find it in array B

A[1] = 6

B = (1, 2, 3, 6, 8, 9, 12, 14)

6 is in the 4th position of array B, thus there are 3 inversions. We know this because 6 was in the first position in array A, thus any lower value element that subsequently appears in array A would have an index of j > i (since i in this case is 1).

2.b: Remove A[1] from array A and also from its corresponding position in array B (bold elements are removed).

A = (6, 9, 1, 14, 8, 12, 3, 2) = (9, 1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 6, 8, 9, 12, 14) = (1, 2, 3, 8, 9, 12, 14)

3: Rerun from step 2 on the new A and B arrays.

A[1] = 9

B = (1, 2, 3, 8, 9, 12, 14)

9 is now in the 5th position of array B, thus there are 4 inversions. We know this because 9 was in the first position in array A, thus any lower value element that subsequently appears would have an index of j > i (since i in this case is again 1). Remove A[1] from array A and also from its corresponding position in array B (bold elements are removed)

A = (9, 1, 14, 8, 12, 3, 2) = (1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 8, 9, 12, 14) = (1, 2, 3, 8, 12, 14)

Continuing in this vein will give us the total number of inversions for array A once the loop is complete.

Step 1 (merge sort) would take O(n * log n) to execute. Step 2 would execute n times and at each execution would perform a binary search that takes O(log n) to run for a total of O(n * log n). Total running time would thus be O(n * log n) + O(n * log n) = O(n * log n).

Thanks for your help. Writing out the sample arrays on a piece of paper really helped to visualize the problem.


I wonder why nobody mentioned binary-indexed trees yet. You can use one to maintain prefix sums on the values of your permutation elements. Then you can just proceed from right to left and count for every element the number of elements smaller than it to the right:

def count_inversions(a):
  res = 0
  counts = [0]*(len(a)+1)
  rank = { v : i+1 for i, v in enumerate(sorted(a)) }
  for x in reversed(a):
    i = rank[x] - 1
    while i:
      res += counts[i]
      i -= i & -i
    i = rank[x]
    while i <= len(a):
      counts[i] += 1
      i += i & -i
  return res

The complexity is O(n log n), and the constant factor is very low.