A Proof of Correctness of Durstenfeld's Random Permutation Algorithm

Question: Does anyone have a precise mathematical proof of Durstenfeld's Algorithm?

The first $O(n)$ shuffle or random permutation generator was published by Richard Durstenfeld in 1964. This algorithm came to the notice of programmers because it was included in Knuth 1969, page 125 [2], as Algorithm P. A succinct statement of the algorithm is: \begin{equation} \text{for } k \leftarrow n, n-1, \ldots, 2 \ \text{ do } \begin{cases} \text{choose at random a number } r \in [1,k], \\ \text{interchange } \pi[r] \text{ and } \pi[k], \end{cases} \end{equation} where $\pi[1 \ldots n]$ is the array to be shuffled or randomly permuted.

This is a truly elegant algorithm, computationally and mathematically. Computationally, it handles the set operations by dividing the array $\pi$ into an unshuffled set $\pi[1,\ldots, k]$ and a shuffled set $\pi[k+1,\ldots, n]$. At each iteration $k$, it chooses a random element $\pi[r]$ from the unshuffled set $\pi[1,\ldots, k]$ and interchanges it with $\pi[k]$. This moves $\pi[r]$ into the new shuffled set $\pi[k,\ldots, n]$, where it remains, unmoved, for all subsequent iterations. Each such step takes $O(1)$ time and $O(1)$ extra space, so the algorithm shuffles the array in place in $O(n)$ time, which is optimal.

Mathematically, the algorithm is elegant because it implicitly uses the well-known lemma that every permutation $\pi$ is a product of transpositions. At each iteration $k$, the algorithm chooses a random number $r_k \in [1,\ldots,k]$ and performs a transposition $(\pi[k], \pi[r_k])$. Thus the random permutation produced can be written as

$$ \pi = (\pi[n],\pi[r_n])(\pi[n-1],\pi[r_{n-1}]) \cdots (\pi[k],\pi[r_k])\cdots (\pi[2],\pi[r_2]), $$

where $(\pi[k],\pi[r_k])$ is the transposition or interchange of $\pi[k]$ and $\pi[r_k]$.
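For a concrete illustration, take $n = 3$, start from $\pi = [1,2,3]$, and suppose the random draws happen to be $r_3 = 1$ and $r_2 = 2$ (any other draws would be traced the same way):

$$ [1,2,3] \;\xrightarrow{\;k=3,\ r_3=1\;}\; [3,2,1] \;\xrightarrow{\;k=2,\ r_2=2\;}\; [3,2,1], $$

so the permutation produced is the single transposition $(1\ 3)$; the second factor $(\pi[2],\pi[r_2])$ is the identity swap because $r_2 = 2$.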

Matlab Implementation of Durstenfeld's Algorithm

function p = GRPdurG(p)
% -------------------------------------------------------------
% Randomly permute the elements of p(1:n) using Durstenfeld's 
% Random Permutation Algorithm, CACM, Vol 7, No. 7, 1964. 
% See Knuth, Section 3.4.2, TAOCP, Vol 2, 3rd Ed.
% Derek O'Connor, 8 Dec 2010.  [email protected]
%
% USE: n = 10^7; p = 1:n; p = GRPdurG(p);
% -------------------------------------------------------------
n = length(p);
for k = n:-1:2    
    r = 1+floor(rand*k);    % random integer between 1 and k
    t    = p(k);
    p(k) = p(r);            % Swap(p(r),p(k)).
    p(r) = t;                  
end
%  GRPdurG
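A quick usage and timing check (my own sketch, using the function above; the timing comment is indicative only):

n = 10^7;
p = 1:n;
tic; p = GRPdurG(p); toc           % a few seconds on a standard PC
disp(isequal(sort(p), 1:n))        % sanity check: p is a permutation of 1:n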

A Combinatorial Conundrum

A random permutation of the integers $\{1,2, \ldots,10^7\}$ can be generated in a few seconds using the Durstenfeld Shuffle algorithm on a standard PC. The combinatorial space $\mathcal{C}$ in this case is the set of all permutations of $10^7$ elements. The size of this sample space is \begin{equation*} |\mathcal{C}| = n! = (10^7)! \approx 10^{65,657,059}. \end{equation*} This is a gigantic number, very far beyond the limits of any known random number generator. That is, $|\mathcal{C}| \ggg |\mathcal{G}|$, where $\mathcal{G}$ is the state space of the random number generator. For example, the period (the size of the state space) of the well-known Mersenne Twister RNG (MT) is about $10^{6000}$. Hence random permutation generators driven by MT must miss nearly all permutations in this space.
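These magnitudes are easy to reproduce in Matlab with gammaln, which returns $\ln\Gamma(n+1) = \ln n!$ without overflow (a small sketch of my own, not part of the original argument):

n = 10^7;
log10_nfact    = gammaln(n+1)/log(10);    % log10(n!),  about 6.5657e+07
log10_MTperiod = 19937*log10(2);          % MT period 2^19937 - 1, about 10^6001.6
fprintf('log10(n!) = %.0f, log10(MT period) = %.1f\n', log10_nfact, log10_MTperiod)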

Nonetheless, the Matlab function above produces random permutations of length $10^7$ in the correct statistical proportions, e.g., about $1/e$ are derangements and $1/n$ are $n$-cycles (a quick empirical check of the derangement fraction is sketched below), yet it must miss most of the possible permutations. Hence the conundrum. The most important question to me is:

Do random permutation generators sample uniformly from the space of permutations?

assuming they use the best random number generator available. I suspect that this is a difficult mathematical problem.
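For instance, the derangement fraction can be checked empirically with the function above (a sketch of my own; the values of n and trials are arbitrary):

trials = 1000; n = 10^5; count = 0;
for t = 1:trials
    p = GRPdurG(1:n);
    if all(p ~= 1:n)               % no fixed points, so p is a derangement
        count = count + 1;
    end
end
fprintf('Derangement fraction = %.4f, 1/e = %.4f\n', count/trials, exp(-1))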

See this Matlab discussion: What Does RANDPERM Miss?

Historical Note

I believe that Durstenfeld's Algorithm was incorrectly attributed to Fisher and Yates by Knuth in [2] below. See here: O'Connor Scribd Note on Fisher-Yates. The original Durstenfeld algorithm is here: CACM Algorithm.

References

  1. Durstenfeld, R. 1964. Algorithm 235: Random Permutation. Communications of the ACM, Vol. 7, No. 7.
  2. Knuth, D. 1969, 1998. Seminumerical Algorithms, 1st and 3rd Eds. The Art of Computer Programming, Vol. 2.

Solution 1:

I think the Wikipedia article contains the proof, but in any case it is easy. The proof of correctness of the Durstenfeld algorithm is the same as for the original procedure by Fisher and Yates: by induction on the number $n$ of elements to permute (the case $n=0$ is trivial). One starts with the items in any order; since they are going to be randomly permuted, this order is of no importance. To get a random permutation of $n$ distinct items, the first item must be chosen uniformly at random. Once this element is chosen, the remaining elements, which we can arrange in any order (the only difference between Fisher-Yates and Durstenfeld is which order we start out with for the remainder), must be permuted uniformly at random, which is done by applying the algorithm recursively to the list of those remaining items.
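Written out, the induction step is just a product of probabilities: for any fixed target permutation $\sigma$ of the $n$ items, the element placed first (position $n$ in Durstenfeld's version) matches $\sigma$ with probability $1/n$, and by the induction hypothesis the recursive shuffle of the remaining $n-1$ items matches with probability $1/(n-1)!$, so

$$ \Pr[\text{the algorithm outputs } \sigma] = \frac{1}{n}\cdot\frac{1}{(n-1)!} = \frac{1}{n!}. $$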

Alternatively, one can show (also by induction) that calling the random number generator to produce an element of the Cartesian product $[n]\times[n-1]\times\cdots\times[1]$ (one call per factor, the final factor $[1]$ being trivial) gives a bijection between the $n!$ sequences in that product and the $n!$ permutations of the array; so if the Cartesian product is sampled uniformly at random, then so is the permutation produced.
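For small $n$ the bijection can be verified exhaustively. A Matlab sketch of my own for $n = 3$, enumerating all $3 \times 2$ choice sequences and applying the two swaps:

perms3 = zeros(6,3); i = 0;
for r3 = 1:3
    for r2 = 1:2
        p = [1 2 3];
        t = p(3); p(3) = p(r3); p(r3) = t;   % k = 3 swap
        t = p(2); p(2) = p(r2); p(r2) = t;   % k = 2 swap
        i = i + 1;
        perms3(i,:) = p;
    end
end
disp(sortrows(perms3))   % six distinct rows: every permutation of [1 2 3] appears exactly once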

The difficulty is not the probabilistic argument, but the requirements imposed on the random number generator. When producing a random permutation of a large number of items (say $n=10000$), it is very likely that the Cartesian product to be sampled is simply larger than the state space of the generator, which means that a given generator cannot possibly produce all $n!$ permutations, let alone produce them uniformly. This has nothing to do with the algorithm used for producing the permutation.

Whether this is a problem depends on what the permutation is going to be used for. If it is just to assure some mixing in a statistical procedure, I think it does not matter that certain permutations will never show up; the important point is that the subset of permutations that can actually be produced passes any easily executed statistical test (such as counting the number of fixed points) in the same way as the set of all permutations would. If, however, the procedure is used to produce a drawing for a lottery where large amounts of money are involved, it would be quite embarrassing if a player could prove (perhaps by exhaustive search of the state space) that the generator used for the drawing cannot possibly produce the combination he happens to hold; for him it is then no longer a game of chance, but simply a fraud. This raises the question of what one really means by a "random permutation".
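To put a rough number on "sufficiently large" for the Mersenne Twister (period $2^{19937}-1$), here is a small calculation of my own for the smallest $n$ with $\log_2 n!$ exceeding the period exponent:

log2fact = @(n) gammaln(n+1)/log(2);    % log2(n!)
n = 2;
while log2fact(n) <= 19937              % 19937 is (essentially) log2 of the MT period
    n = n + 1;
end
fprintf('n! first exceeds the MT period at about n = %d\n', n)   % roughly 2081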

So, to answer the added question, it is clear that:

Any permutation-generation algorithm based on a pseudo-random number generator will fail to sample uniformly from the space of all permutations of a sufficiently large fixed size; for instance, any size for which the number of permutations exceeds the size of the state space of the pseudo-random number generator.