Compute the minimal number of swaps to order a sequence

I'm working on sorting an integer sequence with no repeated numbers (without loss of generality, assume the sequence is a permutation of 1,2,...,n) into its natural increasing order (i.e. 1,2,...,n). I want to do this by directly swapping elements, where any two elements may be swapped regardless of their positions, using the minimal number of swaps. The following may be a feasible solution:

Swap two elements under the constraint that at least one of them lands in its correct position, and repeat until every element is in its correct position.

But I don't know how to prove mathematically that the above solution is optimal. Can anyone help?


Solution 1:

I was able to prove this with graph theory. Might want to add that tag in :)

Create a graph with n vertices. Create an edge from node i to node j if the element in position i should be in position j in the correct ordering. You will now have a graph consisting of several disjoint cycles (an element that is already in place forms a cycle of size 1). I argue that the minimum number of swaps needed to order the sequence correctly is

M = sum over all cycles c of (size(c) - 1)

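Take a second to convince yourself of that... If two items are in a cycle, one swap takes care of them. If three items are in a cycle, you can swap a pair to put one in the right spot, and a two-cycle remains, etc. If n items are in a cycle, n-1 swaps suffice. (This holds even if you don't swap with immediate neighbors.) You also cannot do better: M equals the total number of elements minus the number of cycles, and a single swap either merges two cycles into one or splits one cycle into two, so it changes M by exactly 1 in one direction or the other. The sorted sequence has M = 0, so at least M swaps are needed.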
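The count M can be computed directly by walking each cycle once. The following is only a sketch: it assumes the sequence has been shifted to a 0-based permutation of 0..n-1, and the name `min_swaps_by_cycles` is made up for illustration.

```python
def min_swaps_by_cycles(perm):
    """Return the sum of (size(c) - 1) over the cycles c of perm.

    Assumes perm is a permutation of 0..n-1, so perm[i] is the position
    where the element currently at position i belongs.
    """
    n = len(perm)
    seen = [False] * n
    total = 0
    for start in range(n):
        if seen[start]:
            continue
        # Walk the cycle that contains `start` and measure its size.
        size = 0
        i = start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            size += 1
        total += size - 1
    return total


# Cycles here are (0 2 1) and (3 4), so M = 2 + 1.
print(min_swaps_by_cycles([2, 0, 1, 4, 3]))  # 3
```

The `seen` list is what keeps the walk linear in n: every position is visited exactly once.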

Given that, you may now be able to see why your algorithm is optimal. If a swap puts at least one item into its right position, it always reduces the value of M by 1. For any cycle of length n, consider swapping an element into its correct spot, which is occupied by its neighbor in the cycle. You now have a correctly placed element and a cycle of length n-1.

Since M is the minimum number of swaps, and your algorithm reduces M by 1 with every swap, it finishes in exactly M swaps and must be optimal.
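To see the argument in action, here is a rough sketch of one way to implement the questioner's rule, again assuming a 0-based permutation of 0..n-1. The function name `greedy_sort_swaps` is hypothetical, and sending the element at position i to its home position is just one valid choice of swap among several.

```python
def greedy_sort_swaps(perm):
    """Sort a copy of perm and return the list of swaps performed.

    Assumes perm is a permutation of 0..n-1. Every swap sends the element
    at position i to its home position, so at least one element becomes
    correctly placed per swap.
    """
    a = list(perm)
    swaps = []
    for i in range(len(a)):
        while a[i] != i:
            j = a[i]                 # home position of the element at i
            a[i], a[j] = a[j], a[i]  # afterwards a[j] == j
            swaps.append((i, j))
    return swaps


swaps = greedy_sort_swaps([2, 0, 1, 4, 3])
print(len(swaps), swaps)  # 3 [(0, 2), (0, 1), (3, 4)]
```

The input has a 3-cycle and a 2-cycle, so M = 2 + 1 = 3, which matches the number of swaps the sketch performs.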

Solution 2:

All the cycle counting is very difficult to keep in your head. There is a way that is much simpler to memorize.

First, let's go through a sample case manually.

  • Sequence: [7, 1, 3, 2, 4, 5, 6]
  • Enumerate it: [(0, 7), (1, 1), (2, 3), (3, 2), (4, 4), (5, 5), (6, 6)]
  • Sort the enumeration by value: [(1, 1), (3, 2), (2, 3), (4, 4), (5, 5), (6, 6), (0, 7)]
  • Start from the beginning. While the current position differs from the enumerated (original) index stored there, swap the entries at those two positions. Remember: swap(0,2);swap(0,3) is the same as swap(2,3);swap(0,2)
    • swap(0, 1) => [(3, 2), (1, 1), (2, 3), (4, 4), (5, 5), (6, 6), (0, 7)]
    • swap(0, 3) => [(4, 4), (1, 1), (2, 3), (3, 2), (5, 5), (6, 6), (0, 7)]
    • swap(0, 4) => [(5, 5), (1, 1), (2, 3), (3, 2), (4, 4), (6, 6), (0, 7)]
    • swap(0, 5) => [(6, 6), (1, 1), (2, 3), (3, 2), (4, 4), (5, 5), (0, 7)]
    • swap(0, 6) => [(0, 7), (1, 1), (2, 3), (3, 2), (4, 4), (5, 5), (6, 6)]

I.e., semantically you sort the elements and then figure out how to return them to their initial state by repeatedly swapping through the leftmost item that is out of place.

The Python implementation is as simple as this:

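def swap(arr, i, j):
    arr[i], arr[j] = arr[j], arr[i]


def minimum_swaps(arr):
    # Pair every value with its original index, then sort the pairs by value.
    annotated = list(enumerate(arr))
    annotated.sort(key=lambda it: it[1])

    count = 0

    i = 0
    while i < len(arr):
        # The pair at position i already carries original index i: move on.
        if annotated[i][0] == i:
            i += 1
            continue
        # Send the pair at position i to the position named by its original
        # index, then re-check position i without advancing.
        swap(annotated, i, annotated[i][0])
        count += 1

    return count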

Thus, you don't need to keep track of visited nodes or compute any cycle lengths.
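As a sanity check, the count can be compared against a brute-force search on small inputs. The sketch below restates `minimum_swaps` compactly so that it runs on its own. The helper `bfs_swap_distances` is a hypothetical name, and the breadth-first search is a verification aid that is only practical for small n, not part of the method itself.

```python
from collections import deque
from itertools import combinations


def minimum_swaps(arr):
    # Compact restatement of the function above, so this block runs alone.
    annotated = sorted(enumerate(arr), key=lambda it: it[1])
    count = 0
    i = 0
    while i < len(annotated):
        j = annotated[i][0]
        if j == i:
            i += 1
            continue
        annotated[i], annotated[j] = annotated[j], annotated[i]
        count += 1
    return count


def bfs_swap_distances(n):
    """Map every permutation of 1..n to its true minimum swap count."""
    start = tuple(range(1, n + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        # Try every possible swap of two positions.
        for i, j in combinations(range(n), 2):
            nxt = list(cur)
            nxt[i], nxt[j] = nxt[j], nxt[i]
            nxt = tuple(nxt)
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist


# Exhaustive comparison for all permutations of up to 6 elements.
for n in range(1, 7):
    dist = bfs_swap_distances(n)
    assert all(minimum_swaps(list(p)) == d for p, d in dist.items())

print(minimum_swaps([7, 1, 3, 2, 4, 5, 6]))  # 5
```

Because a swap undoes itself, the distance from the sorted sequence to a permutation equals the number of swaps needed to sort that permutation, which is why searching outward from the sorted state is enough.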