What is this odd sorting algorithm?
It's a correct but odd and inefficient insertion sort.
Let's first visualize it by printing A
after each complete inner loop iteration. Example:
before: [1, 12, 13, 8, 15, 18, 19, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[19, 1, 12, 8, 13, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 19, 12, 8, 13, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 12, 19, 8, 13, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 8, 12, 19, 13, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 8, 12, 13, 19, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 8, 12, 13, 15, 19, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 8, 12, 13, 15, 18, 19, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 8, 12, 13, 15, 16, 18, 19, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 7, 8, 12, 13, 15, 16, 18, 19, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 7, 8, 11, 12, 13, 15, 16, 18, 19, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 6, 7, 8, 11, 12, 13, 15, 16, 18, 19, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 6, 7, 8, 11, 12, 13, 14, 15, 16, 18, 19, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 3, 6, 7, 8, 11, 12, 13, 14, 15, 16, 18, 19, 2, 9, 5, 4, 0, 10, 17]
[1, 2, 3, 6, 7, 8, 11, 12, 13, 14, 15, 16, 18, 19, 9, 5, 4, 0, 10, 17]
[1, 2, 3, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 18, 19, 5, 4, 0, 10, 17]
[1, 2, 3, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 18, 19, 4, 0, 10, 17]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 18, 19, 0, 10, 17]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 18, 19, 10, 17]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 19, 17]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
after: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
Graphically, but updating after every swap, the red bar shows index i
(code here):
For i = 0
, it actually does more or less jumble the list, but it brings the largest value to A[0]
(or one largest value, if there are multiple). From then on, this value will act as a sentinel.
Due to this initial jumbling, it would be hard (and pointless) to state an invariant based on the initial state of the array. Instead, let's define A0
to be the state of the array after the outer loop for i = 0
:
Invariant: After the outer loop for some i
:
- The overall largest value is at
A[i]
-
A[0 to i]
containA0[0 to i]
in sorted order.
Proof by induction:
Base case i = 0
is trivial.
Cases i > 0
: Before the outer loop with this i
, we have the sentinel (overall largest value) at A[i-1]
, and A[0 to i-1]
contains A0[0 to i-1]
in sorted order. Now the inner loop goes over all elements and we swap A[i]
and A[j]
whenever A[j] > A[i]
. Let's look at an example row from above again:
[1, 7, 8, 12, 13, 15, 16, 18, 19, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
The 19
is the sentinel, the part up to it is sorted, and i
is the index of the 11. What happens when we go with j
from 0 to the end? The values 1, 7 and 8 are not larger than the 11, so nothing happens. The 12 is larger, so it'll get swapped with the 11:
[1, 7, 8, 11, 13, 15, 16, 18, 19, 12, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
Now it's the 12 that's at index i
. And next it gets compared to 13. Since 13 is larger, they're swapped:
[1, 7, 8, 11, 12, 15, 16, 18, 19, 13, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
This continues, always swapping, until we swap with the sentinel:
[1, 7, 8, 11, 12, 13, 16, 18, 19, 15, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 7, 8, 11, 12, 13, 15, 18, 19, 16, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 7, 8, 11, 12, 13, 15, 16, 19, 18, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 7, 8, 11, 12, 13, 15, 16, 18, 19, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
At this point the insertion of the 11 into the sorted portion is complete. And the sentinel moved to index i
, where it then prevents any further swaps during the inner loop (i.e., swaps of it with elements further right). In general, for a new value (like the 11), the numbers smaller than it stay where they are, and the numbers larger than it are swapped out and back in one place to the right.
When we're all done, we've had the outer loop with i = n-1
. The invariant then tells us that A[0 to n-1]
contain A0[0 to n-1]
in sorted order. I.e., the array indeed ends up correctly sorted.
Why I call it an insertion sort: After the jumbling i = 0
, if you look at the array after every full inner loop, it's indistinguishable from insertion sort (see my big visualization block at the top). For example, the 11 simply got inserted into the sorted part on its left. It just differs by how that insertion happens. Normal insertion sort "bubbles" the 11 leftwards until its correct place, not even looking at the even smaller numbers. This algorithm here instead searches the insertion point, starting from the very left, and then inserts the 11 there and "bubbles" the larger numbers up to the sentinel rightwards. And then continues with fruitless further comparisons against the sentinel now sitting at A[i]
.
Update: smusamashah shared their fantastic visualization tool, which nicely lets us compare this algorithm with the other algorithms that this was claimed to be. Click the checkmarks on the right for Bubble Sort, Insertion Sort and Selection Sort, then click Start. You'll again see our sort is very much like insertion sort, and not at all like the others. And if you instead do Exchange Sort (sadly not included) you'll see that's more like Selection Sort (but slower because it swaps more and the tool shows all swaps).
To prove that it's correct, you have to find some sort of invariant. Something that's true during every pass of the loop.
Looking at it, after the very first pass of the inner loop, the largest element of the list will actually be in the first position.
Now in the second pass of the inner loop, i = 1
, and the very first comparison is between i = 1
and j = 0
. So, the largest element was in position 0, and after this comparison, it will be swapped to position 1.
In general, then it's not hard to see that after each step of the outer loop, the largest element will have moved one to the right. So after the full steps, we know at least the largest element will be in the correct position.
What about all the rest? Let's say the second-largest element sits at position i
of the current loop. We know that the largest element sits at position i-1
as per the previous discussion. Counter j
starts at 0. So now we're looking for the first A[j]
such that it's A[j] > A[i]
. Well, the A[i]
is the second largest element, so the first time that happens is when j = i-1
, at the first largest element. Thus, they're adjacent and get swapped, and are now in the "right" order. Now A[i]
again points to the largest element, and hence for the rest of the inner loop no more swaps are performed.
So we can say: Once the outer loop index has moved past the location of the second largest element, the second and first largest elements will be in the right order. They will now slide up together, in every iteration of the outer loop, so we know that at the end of the algorithm both the first and second-largest elements will be in the right position.
What about the third-largest element? Well, we can use the same logic again: Once the outer loop counter i
is at the position of the third-largest element, it'll be swapped such that it'll be just below the second largest element (if we have found that one already!) or otherwise just below the first largest element.
Ah. And here we now have our invariant: After k
iterations of the outer loop, the k-length sequence of elements, ending at position k-1
, will be in sorted order:
After the 1st iteration, the 1-length sequence, at position 0, will be in the correct order. That's trivial.
After the 2nd iteration, we know the largest element is at position 1, so obviously the sequence A[0]
, A[1]
is in the correct order.
Now let's assume we're at step k
, so all the elements up to position k-1
will be in order. Now i = k
and we iterate over j
. What this does is basically find the position at which the new element needs to be slotted into the existing sorted sequence so that it'll be properly sorted. Once that happens, the rest of the elements "bubble one up" until now the largest element sits at position i = k
and no further swaps happen.
Thus finally at the end of step N
, all the elements up to position N-1
are in the correct order, QED.
I'm not too sure if the above algorithm has an explicit name, but from some quick output analysis it just looks like an inefficient implementation of insertion sort, where the sorted region is from indices 0
to i
inclusive after running iteration i
.
Print Debugging
This can be verified by inspection if we put a print statement right after the inner loop:
for i from 0 to n-1:
for j from 0 to n-1:
if A[j] > A[i]:
swap A[i] and A[j]
print(A) <- add here
A = [5, 5, 0, 9, 2]
0. [9, 5, 0, 5, 2]
1. [5, 9, 0, 5, 2]
2. [0, 5, 9, 5, 2]
3. [0, 5, 5, 9, 2]
4. [0, 2, 5, 5, 9]
Proof
We can prove this by induction on i
, the outer loop. After having run iteration i
, indices 0 to i
inclusive of A
, or A[0:i]
is sorted, with A[i] = max(A)
.
Base Case: i = 0
For i = 0
, the maximum of A
will be stored at index 0
. This pretty much follows by inspection of the algorithm.
Inductive Step: i > 0
Our inductive hypothesis is that A[0:i-1]
is sorted and that A[i - 1] = max(A)
. What happens in iteration i
? Basically, we're determining where A[i]
should be placed in the sorted region (handled by the inner loop), then readjusting it.
Subcase 1: A[i] < A[j]
for some 0 <= j <= i - 1
From the above algorithm, A[j]
will be swapped with Ap = A[i]
. Notice that from our hypothesis, A[0:i-1]
was sorted. So, it follows that for the rest of the indices from j + 1 <= i
we'll be reordering our sorted region after inserting Ap
. It follows that A[0:i]
will be sorted when j = i
.
Subcase 2: A[i] >= A[j]
for all 0 <= j <= i - 1
No swaps happen in this case, and it follows that A[0:i]
is sorted from the A[0:i-1]
being sorted and the fact that A[i] >= A[i - 1]
.
Other case: j > i
Notice that, after j
reaches index i
, the maximum of A
will be back at index i
. So, for the rest of the inner loop, no swaps will be made. So, it follows that A[0:i]
will be sorted.
Because the above holds for all i < n = len(A)
, we can conclude that running iteration n - 1
will effectively sort A[0:n-1] = A
.
Verification/Improvement
From the above proof, we saw that the check for j > i
was redundant. To make the algorithm more efficient and more in-tune with the usual insertion sort, we can run the below code that will also sort the array.
for i from 0 to n-1:
for j from 0 to i: <- claim this line can be changed
if A[j] > A[i]:
swap A[i] and A[j]
It's a rather strange sort -- definitely not a Bubble Sort. At the end of each for i in range(n)
iteration, the ith element will now contain the largest element of the list A
. That is, we swap element i
with element j
whenever element j is greater than element i. Clearly by the end of the algorithm the last element will be the largest.
The key point is this: At the end of iteration i
, every element to the left of position i
(lower values of i
) must have values that are less than or equal, i.e. A[j] <= A[j+1] for j < i.
The following program attempts to demonstrate this:
from random import shuffle
def assert_partial_sort_order(A, i):
"""
Assert A[j] <= A[j+1] for j < i
"""
for j in range(i):
assert A[j] <= A[j+1]
def sort(A):
n = len(A)
print('sort before:', A)
n_swaps = 0
for i in range(n):
print('i =', i)
for j in range(n):
if A[j] > A[i]:
A[i], A[j] = A[j], A[i]
print(' swapping for j =', j)
print(' A =', A)
assert_partial_sort_order(A, i)
print('sort after:', A, '\n')
n = 10
A = list(range(n))
shuffle(A)
sort(A)
Prints:
sort before: [7, 0, 4, 2, 8, 9, 5, 1, 3, 6]
i = 0
swapping for j = 4
A = [8, 0, 4, 2, 7, 9, 5, 1, 3, 6]
swapping for j = 5
A = [9, 0, 4, 2, 7, 8, 5, 1, 3, 6]
i = 1
swapping for j = 0
A = [0, 9, 4, 2, 7, 8, 5, 1, 3, 6]
i = 2
swapping for j = 1
A = [0, 4, 9, 2, 7, 8, 5, 1, 3, 6]
i = 3
swapping for j = 1
A = [0, 2, 9, 4, 7, 8, 5, 1, 3, 6]
swapping for j = 2
A = [0, 2, 4, 9, 7, 8, 5, 1, 3, 6]
i = 4
swapping for j = 3
A = [0, 2, 4, 7, 9, 8, 5, 1, 3, 6]
i = 5
swapping for j = 4
A = [0, 2, 4, 7, 8, 9, 5, 1, 3, 6]
i = 6
swapping for j = 3
A = [0, 2, 4, 5, 8, 9, 7, 1, 3, 6]
swapping for j = 4
A = [0, 2, 4, 5, 7, 9, 8, 1, 3, 6]
swapping for j = 5
A = [0, 2, 4, 5, 7, 8, 9, 1, 3, 6]
i = 7
swapping for j = 1
A = [0, 1, 4, 5, 7, 8, 9, 2, 3, 6]
swapping for j = 2
A = [0, 1, 2, 5, 7, 8, 9, 4, 3, 6]
swapping for j = 3
A = [0, 1, 2, 4, 7, 8, 9, 5, 3, 6]
swapping for j = 4
A = [0, 1, 2, 4, 5, 8, 9, 7, 3, 6]
swapping for j = 5
A = [0, 1, 2, 4, 5, 7, 9, 8, 3, 6]
swapping for j = 6
A = [0, 1, 2, 4, 5, 7, 8, 9, 3, 6]
i = 8
swapping for j = 3
A = [0, 1, 2, 3, 5, 7, 8, 9, 4, 6]
swapping for j = 4
A = [0, 1, 2, 3, 4, 7, 8, 9, 5, 6]
swapping for j = 5
A = [0, 1, 2, 3, 4, 5, 8, 9, 7, 6]
swapping for j = 6
A = [0, 1, 2, 3, 4, 5, 7, 9, 8, 6]
swapping for j = 7
A = [0, 1, 2, 3, 4, 5, 7, 8, 9, 6]
i = 9
swapping for j = 6
A = [0, 1, 2, 3, 4, 5, 6, 8, 9, 7]
swapping for j = 7
A = [0, 1, 2, 3, 4, 5, 6, 7, 9, 8]
swapping for j = 8
A = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
sort after: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]