What is the probability that the array will remain the same?

Well, I had a little fun with this one. The first thing I thought of when I read the problem was group theory (the symmetric group S_N, in particular). The for loop simply builds a permutation σ in S_N by composing transpositions (i.e. swaps), one per iteration. My math is not all that spectacular and I'm a little rusty, so if my notation is off, bear with me.


Overview

Let A be the event that our array is unchanged after permutation. We are ultimately asked to find the probability of event A, Pr(A).

My solution follows this procedure:

  1. Consider all possible permutations (i.e. reorderings of our array)
  2. Partition these permutations into disjoint sets based on the number of so-called identity transpositions they contain. This helps reduce the problem to even permutations only.
  3. Determine the probability of obtaining the identity permutation given that the permutation is even (and of a particular length).
  4. Sum these probabilities to obtain the overall probability the array is unchanged.

1) Possible Outcomes

Notice that each iteration of the for loop creates a swap (or transposition) that results in one of two things (but never both):

  1. Two elements are swapped.
  2. An element is swapped with itself. For all intents and purposes, the array is unchanged.

The second case is worth naming. Let's define an identity transposition as follows:

An identity transposition occurs when a number is swapped with itself. That is, when n == m in the above for loop.

For any given run of the listed code, we compose N transpositions. Anywhere from 0 to N of them can be identity transpositions appearing in this "chain".


For example, consider an N = 3 case:

Given our input [0, 1, 2].
Swap (0 1) and get [1, 0, 2].
Swap (1 1) and get [1, 0, 2]. ** Here is an identity **
Swap (2 2) and get [1, 0, 2]. ** And another **

Note that there is an odd number of non-identity transpositions (1) and the array is changed.


2) Partitioning Based On the Number of Identity Transpositions

Let K_i be the event that i identity transpositions appear in a given permutation. Note this forms an exhaustive partition of all possible outcomes:

  • No permutation can have two different quantities of identity transpositions simultaneously, and
  • All possible permutations must have between 0 and N identity transpositions.

Thus we can apply the Law of Total Probability:

          Pr(A) = Σ (i = 0 to N) Pr(A | K_i) · Pr(K_i)

Now we can finally take advantage of the partition. Note that when the number of non-identity transpositions, N - i, is odd, there is no way the array can remain unchanged*. Thus:

          Pr(A) = Σ (i : N - i even) Pr(A | K_i) · Pr(K_i)

*From group theory, a permutation is even or odd but never both. Therefore an odd permutation cannot be the identity permutation (since the identity permutation is even).

3) Determining Probabilities

So we now must determine two probabilities for N-i even:

  1. Pr(K_i)
  2. Pr(A|K_i)

The First Term

The first term, Pr(K_i), represents the probability of obtaining a permutation with i identity transpositions. This turns out to be binomial since for each iteration of the for loop:

  • The outcome is independent of the results before it, and
  • The probability of creating an identity transposition is the same, namely 1/N.

Thus for N trials, the probability of obtaining i identity transpositions is:

          Pr(K_i) = C(N, i) · (1/N)^i · (1 - 1/N)^(N - i)

The Second Term

So if you've made it this far, we have reduced the problem to finding Pr(A|K_i) for N - i even. This represents the probability of obtaining an identity permutation given i of the transpositions are identities. I use a naive counting approach to determine the number of ways of achieving the identity permutation over the number of possible permutations.

First, consider the transpositions (n, m) and (m, n) equivalent. Then let M be the number of possible non-identity transpositions. We will use this quantity frequently.

          M = C(N, 2) = N(N - 1) / 2

The goal here is to determine the number of ways a collection of transpositions can be combined to form the identity permutation. I will try to construct the general solution alongside an example for N = 4.


Let's consider the N = 4 case with all identity transpositions (i.e. i = N = 4). Let X represent an identity transposition. For each X, there are N possibilities (they are: n = m = 0, 1, 2, ... , N - 1). Thus there are N^i = 4^4 possibilities for achieving the identity permutation. For completeness, we add the binomial coefficient, C(N, i), to consider ordering of the identity transpositions (here it just equals 1). I've tried to depict this below with the physical layout of elements above and the number of possibilities below:

I  =  _X_   _X_   _X_   _X_
       N  *  N  *  N  *  N  * C(4, 4) => N^N * C(N, N) possibilities

Now without explicitly substituting N = 4 and i = 4, we can look at the general case. Combining the above with the denominator found previously, we find:

          Pr(A | K_N) = [C(N, N) · N^N] / [C(N, N) · N^N] = 1

This is intuitive. In fact, any value other than 1 should probably alarm you. Think about it: we are given the situation in which all N transpositions are said to be identities. What's the probability that the array is unchanged in this situation? Clearly, 1.


Now, again for N = 4, let's consider 2 identity transpositions (i.e. i = N - 2 = 2). As a convention, we will place the two identities at the end (and account for ordering later). We now need to pick two transpositions which, when composed, will become the identity permutation. Let's place any element in the first location and call it t1. As stated above, there are M possibilities, since t1 cannot be an identity (we have already placed both of them).

I  =  _t1_   ___   _X_   _X_
       M   *  ?  *  N  *  N

The only element left that could possibly go in the second spot is the inverse of t1, which is in fact t1 itself (and it is the only one, by uniqueness of inverses). We again include the binomial coefficient: in this case we have 4 open locations and we are looking to place 2 identity transpositions. How many ways can we do that? 4 choose 2.

I  =  _t1_   _t1_   _X_   _X_ 
       M   *  1   *  N  *  N  * C(4, 2) => C(N, N-2) * M * N^(N-2) possibilities

Again looking at the general case, this all corresponds to:

          Pr(A | K_(N-2)) = [C(N, N-2) · M · N^(N-2)] / [C(N, N-2) · M² · N^(N-2)] = 1 / M

Finally we do the N = 4 case with no identity transpositions (i.e. i = N - 4 = 0). Since there are a lot of possibilities, it starts to get tricky and we must be careful not to double count. We start similarly by placing a single element in the first spot and working out possible combinations. Take the easiest first: the same transposition 4 times.

I  =  _t1_   _t1_   _t1_   _t1_ 
       M   *  1   *  1   *  1   => M possibilities

Let's now consider two unique elements t1 and t2. There are M possibilities for t1 and only M-1 possibilities for t2 (since t2 cannot be equal to t1). If we exhaust all arrangements, we are left with the following patterns:

I  =  _t1_   _t1_   _t2_   _t2_ 
       M   *  1   *  M-1 *  1   => M * (M - 1) possibilities   (1)st

   =  _t1_   _t2_   _t1_   _t2_
       M   *  M-1 *  1   *  1   => M * (M - 1) possibilities   (2)nd

   =  _t1_   _t2_   _t2_   _t1_
       M   *  M-1 *  1   *  1   => M * (M - 1) possibilities   (3)rd

Now let's consider three unique elements, t1, t2, t3. Let's place t1 first and then t2. As usual, we have:

I  =  _t1_   _t2_   ___   ___ 
       M   *  ?   *  ?  *  ?  

We can't yet say how many possibilities there are for t2, and we will see why in a minute.

We now place t1 in the third spot. Notice that t1 must go there: if it were to go in the last spot, we would just be recreating the (3)rd arrangement above. Double counting is bad! This leaves the third unique element, t3, for the final position.

I  =  _t1_   _t2_   _t1_   _t3_ 
       M   *  ?   *  1  *   ?  

So why did we have to take a minute to consider the number of t2s more closely? The transpositions t1 and t2 cannot be disjoint (i.e. they must share one, and only one since they also cannot be equal, of their indices n or m). If they were disjoint, we could swap the order of the two transpositions, which means we would be double counting the (1)st arrangement.

Say t1 = (n, m). Then t2 must be of the form (n, x) or (y, m) for some x and y in order to be non-disjoint. Note that x may not be n or m, and y may not be n or m. Thus, the number of possible transpositions that t2 could be is actually 2 * (N - 2).

So, coming back to our layout:

I  =  _t1_    _t2_    _t1_   _t3_ 
       M   * 2(N-2) *  1   *  ?  

Now t3 must be the inverse of the composition t1 t2 t1. Let's work it out manually:

(n, m)(n, x)(n, m) = (m, x) 

Thus t3 must be (m, x). Note this is not disjoint to t1 and not equal to either t1 or t2 so there is no double counting for this case.

I  =  _t1_    _t2_    _t1_   _t3_ 
       M   * 2(N-2) *  1  *   1    => M * 2(N - 2) possibilities   

Finally, putting all of these together:

          Pr(A | K_(N-4)) = [M + 3M(M - 1) + 2M(N - 2)] / M^4

4) Putting it all together

So that's it. Work backwards, substituting what we found into the original summation given in step 2. I computed the answer to the N = 4 case below. It matches the empirical number found in another answer very closely!

         N  =  4
         M  =  6   _________ _____________ _________
                  | Pr(K_i) | Pr(A | K_i) | Product | 
         _________|_________|_____________|_________|
        |         |         |             |         |
        |  i = 0  |  0.316  |  120 / 1296 |  0.029  |
        |_________|_________|_____________|_________|
        |         |         |             |         |
        |  i = 2  |  0.211  |    6 / 36   |  0.035  |
        |_________|_________|_____________|_________|
        |         |         |             |         |
        |  i = 4  |  0.004  |    1 / 1    |  0.004  |
        |_________|_________|_____________|_________|
                            |             |         |
                            |     Sum:    |  0.068  |
                            |_____________|_________|

Correctness

It would be cool if there were a result in group theory to apply here -- and maybe there is! It would certainly help make all this tedious counting go away completely (and shorten the problem to something much more elegant). I stopped working at N = 4. For N > 5, the above gives only an approximation (how good, I'm not sure). It is pretty clear why if you think about it: given, for example, N = 8 transpositions, there are clearly ways of creating the identity with four unique elements which are not accounted for above. The number of ways seemingly becomes more difficult to count as the chain of transpositions gets longer (as far as I can tell...).

Anyway, I definitely couldn't do something like this within the scope of an interview. I would get as far as the denominator step if I was lucky. Beyond that, it seems pretty nasty.


Very much curious to know why people ask such strange questions on probability?

Questions like this are asked because they allow the interviewer to gain insight into the interviewee's

  • ability to read code (very simple code, but at least something)
  • ability to analyse an algorithm and identify its execution paths
  • skill at applying logic to find possible outcomes and edge cases
  • reasoning and problem-solving skills as they work through the problem
  • communication and work skills - do they ask questions, or work in isolation based on the information at hand

... and so on. The key to having a question that exposes these attributes of the interviewee is to have a piece of code that is deceptively simple. This shakes out the imposters: the non-coder is stuck; the arrogant jump to the wrong conclusion; the lazy or sub-par computer scientist finds a simple solution and stops looking. Often, as they say, it's not whether you get the right answer but whether you impress with your thought process.


I'll attempt to answer the question, too. In an interview I'd explain myself rather than provide a one-line written answer - this is because even if my 'answer' is wrong, I am able to demonstrate logical thinking.

A will remain the same - i.e. elements in the same positions - when

  • m == n in every iteration (so that every element only swaps with itself); or
  • any element that is swapped is swapped back to its original position

The first case is the 'simple' case that duedl0r gives, the case that the array isn't altered. This might be the answer, because

what is the probability that array A remains the same?

if the array changes at i = 1 and then reverts back at i = 2, it's in the original state but it didn't 'remain the same' - it was changed, and then changed back. That might be a smartass technicality.

Then considering the chance of elements being swapped and swapped back - I think that calculation is above my head in an interview. The obvious consideration is that it does not need to be a swap followed by the reverse swap: there could just as easily be a 'circular' swap among three elements, swapping positions 1 and 2, then 2 and 3, then 1 and 3, and finally 2 and 3. And continuing, there could be swaps among 4, 5 or more items that are 'circular' like this.

In fact, rather than considering the cases where the array is unchanged, it may be simpler to consider the cases where it is changed. Consider whether this problem can be mapped onto a known structure like Pascal's triangle.


This is a hard problem. I agree that it's too hard to solve in an interview, but that doesn't mean it is too hard to ask in an interview. The poor candidate won't have an answer, the average candidate will guess the obvious answer, and the good candidate will explain why the problem is too hard to answer.

I consider this an 'open-ended' question that gives the interviewer insight into the candidate. For this reason, even though it's too hard to solve during an interview, it is a good question to ask during an interview. There's more to asking a question than just checking whether the answer is right or wrong.


Below is C code to count the number of values of the 2N-tuple of indices that rand can produce and to calculate the probability. Starting with N = 0, it shows counts of 1, 1, 8, 135, 4480, 189125, and 12450816, with probabilities of 1, 1, .5, .185185, .0683594, .0193664, and .00571983. The counts do not appear in the Encyclopedia of Integer Sequences, so either my program has a bug or this is a very obscure problem. If the latter, the problem is not intended to be solved by a job applicant but to expose some of their thought processes and how they deal with frustration. I would not regard it as a good interview problem.

#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>


#define swap(a, b)  do { int t = (a); (a) = (b); (b) = t; } while (0)


static uint64_t count(int n)
{
    // Initialize count of how many times the original order is the result.
    uint64_t c = 0;

    // Allocate space for selectors and initialize them to zero.  Allocate at
    // least one element so the n = 0 case cannot receive a null pointer and
    // falsely trigger the out-of-memory exit below.
    int *r = calloc(2*n + 1, sizeof *r);

    // Allocate space for array to be swapped.
    int *A = malloc((n + 1) * sizeof *A);

    if (!A || !r)
    {
        fprintf(stderr, "Out of memory.\n");
        exit(EXIT_FAILURE);
    }

    // Iterate through all values of selectors.
    while (1)
    {
        // Initialize A to show original order.
        for (int i = 0; i < n; ++i)
            A[i] = i;

        // Test current selector values by executing the swap sequence.
        for (int i = 0; i < 2*n; i += 2)
        {
            int a = r[i+0];
            int b = r[i+1];   // b, not n: avoid shadowing the parameter n
            swap(A[a], A[b]);
        }

        // If array is in original order, increment counter.
        ++c;    // Assume all elements are in place.
        for (int i = 0; i < n; ++i)
            if (A[i] != i)
            {
                // If any element is out of place, cancel assumption and exit.
                --c;
                break;
            }

        // Increment the selectors, odometer style.
        int i;
        for (i = 0; i < 2*n; ++i)
            // Stop when a selector increases without wrapping.
            if (++r[i] < n)
                break;
            else
                // Wrap this selector to zero and continue.
                r[i] = 0;

        // Exit the routine when the last selector wraps.
        if (2*n <= i)
        {
            free(A);
            free(r);
            return c;
        }
    }
}


int main(void)
{
    for (int n = 0; n < 7; ++n)
    {
        uint64_t c = count(n);
        printf("N = %d:  %" PRIu64 " times, %g probability.\n",
            n, c, c/pow(n, 2*n));
    }

    return 0;
}

The behaviour of the algorithm can be modelled as a Markov chain over the symmetric group S_N.

Basics

The N elements of the array A can be arranged in N! different permutations. Let us number these permutations from 1 to N!, e.g. by lexicographic ordering. So the state of the array A at any time in the algorithm can be fully characterized by the permutation number.

For example, for N = 3, one possible numbering of all 3! = 6 permutations might be:

  1. a b c
  2. a c b
  3. b a c
  4. b c a
  5. c a b
  6. c b a

State transition probabilities

In each step of the algorithm, the state of A either stays the same or transitions from one of these permutations to another. We are now interested in the probabilities of these state changes. Let us call Pr(i→j) the probability that the state changes from permutation i to permutation j in a single loop iteration.

As we pick m and n uniformly and independently from the range [0, N-1], there are N² possible outcomes, each of which is equally likely.

Identity

For N of these outcomes m = n holds, so there is no change in the permutation. Therefore,

          Pr(i→i) = N / N² = 1 / N.

Transpositions

The remaining N² - N cases are transpositions, i.e. two elements exchange their positions and therefore the permutation changes. Suppose one of these transpositions exchanges the elements at positions x and y. There are two cases how this transposition can be generated by the algorithm: either m = x, n = y or m = y, n = x. Thus, the probability for each transposition is 2 / N².

How does this relate to our permutations? Let us call two permutations i and j neighbors if and only if there is a transposition which transforms i into j (and vice versa). We then can conclude:

          Pr(i→j) = 2 / N²   if i and j are neighbors,
          Pr(i→j) = 0        otherwise (for i ≠ j).

Transition matrix

We can arrange the probabilities Pr(i→j) in a transition matrix P ∈ [0,1]^(N! × N!). We define

          p_ij = Pr(i→j),

where p_ij is the entry in the i-th row and j-th column of P. Note that

          Pr(i→j) = Pr(j→i),

which means P is symmetric.

The key point now is the observation of what happens when we multiply P by itself. Take any element p(2)_ij of P²:

          p(2)_ij = Σ (k = 1 to N!) Pr(i→k) · Pr(k→j)

The product Pr(i→k) · Pr(k→j) is the probability that starting at permutation i we transition into permutation k in one step, and transition into permutation j after another subsequent step. Summing over all in-between permutations k therefore gives us the total probability of transitioning from i to j in 2 steps.

This argument can be extended to higher powers of P. A special consequence is the following:

p(N)_ii is the probability of returning back to permutation i after N steps, assuming we started at permutation i.

Example

Let's play this through with N = 3. We already have a numbering for the permutations. Each permutation stays put with probability 1/3 and moves to each of its three neighbors with probability 2/9, so the corresponding transition matrix is:

          P  =  1/9 · | 3 2 2 0 0 2 |
                      | 2 3 0 2 2 0 |
                      | 2 0 3 2 2 0 |
                      | 0 2 2 3 0 2 |
                      | 0 2 2 0 3 2 |
                      | 2 0 0 2 2 3 |

Multiplying P with itself gives:

          P²  =  1/81 · | 21 12 12 12 12 12 |
                        | 12 21 12 12 12 12 |
                        | 12 12 21 12 12 12 |
                        | 12 12 12 21 12 12 |
                        | 12 12 12 12 21 12 |
                        | 12 12 12 12 12 21 |

Another multiplication yields:

          P³  =  1/729 · | 135 126 126 108 108 126 |
                         | 126 135 108 126 126 108 |
                         | 126 108 135 126 126 108 |
                         | 108 126 126 135 108 126 |
                         | 108 126 126 108 135 126 |
                         | 126 108 108 126 126 135 |

Any element of the main diagonal gives us the wanted probability: 135/729 = 15/81, or 5/27.

Discussion

While this approach is mathematically sound and can be applied to any value of N, it is not very practical in this form. The transition matrix P has N!² entries, which becomes huge very fast. Even for N = 10 the size of the matrix already exceeds 13 trillion entries. A naive implementation of this algorithm therefore appears to be infeasible.

However, in comparison to other proposals, this approach is very structured and doesn't require complex derivations beyond figuring out which permutations are neighbors. My hope is that this structure can be exploited to find a much simpler computation.

For example, one could exploit the fact that all diagonal elements of any power of P are equal. Assuming we can easily calculate the trace of P^N, the solution is then simply tr(P^N) / N!. The trace of P^N is equal to the sum of the N-th powers of the eigenvalues of P. So if we had an efficient algorithm to compute the eigenvalues of P, we would be set. I haven't explored this further than calculating the eigenvalues up to N = 5, however.