Two seemingly unrelated puzzles have very similar solutions; what's the connection?
I think it's an interesting coincidence that the locker puzzle and this puzzle about duplicate array entries (see problem 6b) have such similar solutions. Spoiler alert! Don't read on if you want to solve these puzzles yourself first (they're two of the best puzzles I've ever seen).
In both solutions we consider a collection of labeled boxes, each with a number inside, and then "traverse through boxes" by starting at a given box and using the number inside to decide where to go next. For example, we might start at box $1$, find the number $5$ inside, proceed to box $5$, find the number $2$ inside, proceed to box $2$, and so forth.
Furthermore, in both solutions we "traverse through boxes" for the same reason: we're interested in finding cycles. More specifically, for the locker puzzle, we're interested in the question, "If we start at box $n$, how many steps does it take to get back to box $n$?", and for the duplicate array puzzle, we're interested in the question, "Does there exist $n$ such that (1) if we start at box $1$, we will eventually get to box $n$, and (2) if we start at box $n$, we will eventually get back to box $n$?"
Since the two puzzles seem quite unrelated at first glance, I'll pose the following question:
Is there a deep reason why "traversing through boxes" (described in the 2nd paragraph above) shows up in the solution for both of these puzzles?
In addition,
Are there other interesting problems for which "traversing through boxes", whether to find cycles or for any other reason, shows up in the solution?
[Edit] I mistakenly said that for the locker puzzle, we're interested in the question, "If we start at box $n$, will we eventually get back to box $n$?" Instead, the question should be, "If we start at box $n$, how many steps does it take to get back to box $n$?"
[Edit] Thanks for all the great answers so far! However, I'm still not completely convinced there's nothing more going on here...
Solution 1:
I'm not sure of a deep connection; from what I gather, it is merely the case that the locker puzzle and problem 6 have a common subproblem, which is: prove that "traversing through boxes" produces a cycle, i.e., leads us to visit a node we have already visited. In the locker puzzle, there is an extra restriction that this cycle include the starting node. This restriction holds because the locker puzzle involves a permutation, while problem 6 does not necessarily involve a permutation.
Unless I'm missing something, problems 6a and 6b can be solved simultaneously using the "traverse through boxes" technique starting at array index 1, despite the comment that "Part (b) seems to be a lot harder than part (a), and it apparently requires entirely different techniques". Since the hedge words "seems" and "apparently" were used, we can't say that the comment is false, but nonetheless, I believe it would be false without those hedge words. The key idea is that, as explained in the PDF on the locker puzzle, "player $ B_i $ never opens a locker (other than locker i) without first finding its number, so each time he opens a new locker he must find either his own number or the number of another unopened locker". The same basic reasoning applies to problem 6, with just slight modifications.
Problem 6 can be made to look a little more like the locker puzzle by taking a[1] out of the picture. We can consider b = {a[2], ... , a[n]} as the "real" array with m = n - 1 elements, and the value a[1] as the index of the start node. We can further relabel {2, ... , n} as {1, ... , m} if desired.
All we care about is finding a cycle, that is, coming across a value that is an index we have already visited. This is equivalent to finding a duplicate in array a, since the indices we visited are precisely array values in a. We can fulfill the time and memory requirements using a boolean array of size m. The boolean array allows us to see if we have visited an index yet, and of course has constant memory and is fast. (Correction: the boolean array would have linear memory, not constant. Evidently there is a better way to solve this.) We are guaranteed not to visit any array index twice, so the algorithm has linear time. If we didn't think of a boolean array, we might be tempted to use some sort of a list structure with fast insert and find, but a boolean array is faster.
To tie the problems together further, we can interpret the data as a directed graph where each vertex has outdegree 1. In the locker puzzle, the indegree for each vertex is also 1, but problem 6 has no such restriction.
The outdegree being 1 guarantees that we can traverse from one vertex to another without any ambiguity. Beyond that, we need to guarantee a cycle will occur regardless of where we start.
A permutation is partitioned into cycles, so this allows us to conclude for the locker puzzle that we will find our cycle precisely when we get back to the starting node. For problem 6, we are guaranteed to find some cycle, because otherwise we would have to go on endlessly, but the array is finite, so that is absurd.
Maybe my answer should be made more rigorous, but I hope it is at least satisfactory and accurate. If not, let me know, and I'll try to fix it. I admit that after looking at this problem for an extended period of time, my mind is getting a bit foggy..
EDIT: Oh I think I figured out how to solve 6b with constant memory. You can follow n - 1 links, thus covering n (not necessarily distinct) array indices of a, and by this time you are guaranteed to be in the middle of a cycle. Note your current array index. Then go around until you see that index again, which will give you the period of the cycle. Then using this you can find the "entry node" of the cycle, which will be your duplicate. (Use modular arithmetic.) I haven't worked it out in detail but I believe that's an accurate sketch.
Solution 2:
You might also enjoy this one, which is, I think, easier than the two you have linked to: you are starting with a pack of $n$ cards, numbered from 1 to $n$, but in a random order. You now play the following game: you look at the top card, numbered $i$, say, then you put the card into the $i$-th position from top. Repeat the whole procedure. Clearly, if at some point the top card it numbered 1, nothing interesting happens any more and the position of the cards never changes after that point. Prove that, independently of the starting configuration, after finitely many steps, the 1 will turn up on the top.
Solution 3:
Since you requested for more puzzles, here is one which is an algorithmic puzzle.
You are given an array of $\displaystyle 2n$ elements: $$a_1, a_2, \dots, a_n , b_1, b_2, \dots, b_n$$
You need to rearrange the elements of the array to look like: $$b_1, a_1, b_2, a_2, \dots, b_n , a_n$$
It is trivial by using an extra array, but the hard part of the puzzle is that you need to do it in $\displaystyle \mathcal{O}(1)$ (or to be pedantic, $\displaystyle \mathcal{O}(\log n))$ space and $\displaystyle \mathcal{O}(n)$ time.
(Assume each element of the array takes up constant space)