$52$ cards reciprocal sum probability

Imagine a deck of $52$ cards but instead of having suits and ranks, they only have sequential (unique) integer ranks from $1$ to $52$. You could also imagine a standard deck of $52$ cards but convert the ranks and suits to an integer number from $1$ to $52$ so that all the cards have different numbers assigned to them.

So the question is, what is the probability, if you randomly choose exactly $10$ cards from that deck without replacement, that the sum of the reciprocals of the cards ranks (from $1$ to $52$) totals exactly $1$? For example, if you chose cards $5,10,15,20,25,30,35,40,45$ and $50$, the sum of the reciprocals would only be $0.58579$... so that is too small.

As a hint, I believe if you sort the cards in ascending order, the first card (the lowest rank card) MUST be between a $2$ and $6$ (inclusive) to be a candidate solution. This is because $1/1$ is already a sum of $1$ so any additional cards will make it too large a sum. Also $7$ as a lowest card will not work cuz $1/7$ + $1/8$... + $1/16$ = $0.93$... so the lowest card MUST be a $2,3,4,5$, or $6$. I am re-running my modified simulation now with that added information to prune the state space it must check.

Also I am seeing multiple solutions so there is not just $1$ solution but the probability is likely very low, only a small fraction of $1$% I would guesstimate.

Also note than many solutions are very close to $1$ but not exactly $1$, thus making computer simulation of this type of problem more difficult. An example of a "close solution" is $2,4,13,25,35,41,47,50,51,52$ which evaluates to $0.99999995750965$. The closest sum not equal to $1$ happens with $3,4,8,14,17,22,26,29,46,47$ which evaluates to $1.00000000288991$. That is $8$ zeros after the $1$.

An interesting note... That number very close to $1$ is gotten by summing only $10$ terms. This is impressive since even summing negative powers of $2$ which converges to $1$, ($1/2$ + $1/4$ + $1/8$...), it takes $28$ terms to almost match that closeness to $1$ and $29$ terms to beat it.


The easiest thing is enumerate all the possibilities (few seconds of CPU).

There are 1431 ways to extract distinct cards such that the sum of their reciprocals is 1.

So the probability is $$ 1431 / {52\choose 10} \approx 9.0455\cdot 10^{-8} $$

Among those 1431, the first and last lexicographically are $$ \{2, 4, 16, 26, 30, 36, 39, 45, 48, 52\} $$ and $$ \{5, 6, 8, 9, 10, 12, 15, 18, 20, 24\} $$

My algorithm is very naive. The running time of the resulting program is less than 2 seconds.

Any good set (sum of reciprocals equal to $1$), can be divided in two subsets of $5$ elements, one with sum $s\le\frac{1}{2}$ and the other with sum $\frac{1}{2}\le s<1$.

There are only ${52\choose 5}=2598960$ quintets, so generating them is not a problem. In particular, I stored the $422730$ whose sum is between $\frac{1}{2}$ and $1$.

Then I generate again the quintets, looking for those with sum $s\le\frac{1}{2}$ and I checked if among those stored there were some with sum $1-s$ and with a non-overlapping set of numbers.

I stored on file the matching quintets ($180959$) and then I used another software to eliminate the duplicates (duplicates were generated because there were many cases of $\frac{1}{2}+\frac{1}{2}$ and I was too lazy to implement a filter inside the program, since I could eliminate the duplicates later with a couple of Mathematica instructions.

For what concernes the implementation, I used C++. I'm not a C++ programmer, I'm too old for that OOP, and normally I use C, but the STL contains so many data structures already implemented...)

So I used a multimap (a map which can contain multiple elements with the same key and can be searched efficiently for a specific key), where the key was a representation of the sum of 5 reciprocals and the associated data was a representation of the 5 cards.

In practice I used a 64 bit integer for the key and a 64 bit integer for the data.

Given five numbers $a$, $b$, $c$, $d$, $e$, below 52, it is easy to see that their product fits in a signed 32 bit integer, because $52\cdot51\cdot50\cdot49\cdot48=311875200 < 2^{31}$. So, if I reduce the fraction $$ \frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}+\frac{1}{e}=\frac{p}{q} $$ where $(p,q)=1$, then I know that the value $p\cdot2^{32}+q$ will fit nicely in a 64 bit integer. This will be my key. When in the second part of the program I'm looking for matching quintes I will set my key to $\frac{q-p}{p}$, to search among those already stored one that added to the current one gives sum $1$.

The representation of the five numbers $a$, $b$, $c$, $d$, $e$ can be managed in various ways, but I decided to use an unsigned 64 bit number and set to $1$ the $a$-th, $b$-th, ..., and $e$-th bit. In this way it is easy to check if two quintuples have numbers in common since it is sufficient to check if the bit-wise AND is zero or not.

I think it's all.

Addendum. The algorithm above is due to my bad sight. When I computed the value ${51\choose 10}$ (the card '$1$' is useless) I did not realize it was just such a little number. I misread the exponent. Indeed it is only about $1.28\cdot 10^{10}$. With modern computers it is just a breeze. Indeed I hastily wrote a dumb C program that in about 36 seconds printed out the 1431 solutions.

Here is a short version (not to be taken as an example of good programming!)

#include<stdio.h>
int main()
{
  long double Inv[53],s0,s1,s2,s3,s4,s5,s6,s7,s8,s9,Err=6.0E-11;
  long double Min=1-Err, Max=1+Err;
  for (int i=1; i<53; i++) Inv[i]=1/(long double)i;
#define FOR(p,c) for(int i##c=52; i##c>i##p && (s##c=(s##p+Inv[i##c]))<Max;i##c--)
  int i0, cnt=0;
  for (i0=2, s0=0.5; i0<=52; s0=Inv[++i0])
  FOR(0,1) FOR(1,2) FOR(2,3) FOR(3,4)
  FOR(4,5) FOR(5,6) FOR(6,7) FOR(7,8) FOR(8,9)
  if (s9>Min) printf("%4d %.15Lf %d %d %d %d %d %d %d " 
  "%d %d %d\n",++cnt,s9, i0,i1,i2,i3,i4,i5,i6,i7,i8,i9);
  return 0; 
}

The a=6 case can be eliminated by pencil and paper.
Consider that 1/1 = 1, 1/2 = 7, 1/3 = 9, and 1/4 = 10, all mod 13. The only multiple of 13 we can get out of sums of these numbers is 7+9+10=26, which shows that 1/26+1/39+1/52=1/12 is feasible, but 1/13 must not appear in the sum.
Also 1/1 = 1, 1/2 = 6, 1/3 = 4, 1/4 = 3, all mod 11. The only multiple of 11 in this case is 1+6+4=11, so 1/44 is excluded, and if 1/11 is in the sum, then 1/22 and 1/33 must also be.
These two considerations alone mean that a=6 is impossible:

sum(1.0/[6,7,8,9,10,11,12,13,14,15]) =    1.034896
sum(1.0/[6,7,8,9,10,11,12,14,22,33]) =   0.9670634
sum(1.0/[6,7,8,9,10,12,14,15,16,17]) =   0.9883869

Similar considerations exclude a total of 20 cards. With that, along with the exclusion of card 1 by its magnitude, the deck is down to 31 cards and there are only 4 possibilities for the lowest card. Throw in an early-out test at about the seventh card and compiled code gets down to about 30 ms. The interpreted code should go through in tolerable time. Obviously not an improvement on brute force because that is quick enough and less error-prone.


My simulation is much simpler (requires much less thought). I have $10$ nested loops (a,b,c,d,e,f,g,h,i,j). a goes from $2$ to $6$ only since the lowest numbered card can only be one of those choices. Each inner loop starts at $1$ more than the loop immediately outside of it. The upper limit of the loops is $43$ for a, $44$ for b, ... $51$ for i, $52$ for j. It is running now but is slow and will have to run overnight. I am storing the winners to a database so I can study them later. I will compare results when the program finishes and I will check all the answers in the database to make sure they actually equal $1$.

My program makes a check that the sum of the reciprocals equal $1.00000000000000000$ ($17$ zeros). I think that is the limit of the precision. I doubt any near miss would be truncated at the $18$th or higher decimal digit and not be an actual $1$.

I am actually surprised there are no reported solutions with the lowest card being 6.

So this seems like one of those problems that can only practically be solved via computers, not using a paper and pencil method. At least I haven't seen that type of solution presented here yet. I will give people more time cuz it seems like a hard problem to solve that way.

$Update$: For some mysterious reason, my program missed a few solutions such as $5,6,8,9,10,12,15,18,20,24$. It is confirmed as roundoff error so I need to use a different method where I multiply all of a,b,c,d,e,f,g,h,i,j (all the card ranks), set that product as p, then check if (p/a+p/b+p/c+p/d+p/e+p/f+p/g+p/h+p/i+p/j) = p. That way there are no decimal fractions cuz I am guaranteeing that there will be no remainder in the division since all the ranks are factored in, thus can be factored out. I have to rerun this so it will take many more hours. A simple example of how this algorithm works is if we take just $4$ cards ($2$, $4$, $5$, and $20$). We know that $1/2 + 1/4 + 1/5 + 1/20$ = $1$. To check this, we multiply $2*4*5*20$ and get $800$. Now we basically convert all the reciprocals to $800$ths in the denominator so $1/2$ = $400/800 $... The just sum them up and (in this case), if they tally $800/800$, then they are a solution.

$Update 2$. I also got $1431$ now with my new algorithm. By simulating all possibilities of $10$ randomly drawn cards from the deck of $52$ but with the first card only being a $2,3,4$, or $5$, my program is showing about $7.6$ billion card combinations scanned. So now my next step (since I got the right answer I think), is to speed up the algorithm since it is not necessary to check all 7.6 billion hands since some can be "short circuited" (stopped early) cuz the sum is already too high.

$Update 3$. I put some tweaks in my program so it doesn't have to check all $7.6$ billion hands. It basically keeps a running total of the accumulated reciprocal for however many cards have been seen so far in the hand. If it is already at $1$ (or over), that hand is abandoned early and then next hand is tried. Also, with the $10$ nested loops, I check at each loop to see if the accumulated sum will exceed $1$ if the minimum value of the inner loops is added in. This would mean that no matter what the values are of the inner loops, the sum of the reciprocals with any combination of the remaining cards added in will exceed $1$ so that hand can be abandoned too. I can do this cuz my nested loops are sorted such that each inner loop starts at $1$ higher then the loop immediately outside of it.

Some examples of abandoning a hand early (for illustrative purposes).

Remember my nested loops are named a,b,c,d,e,f,g,h,i,j staring from the outermost one.

First an easy one. $2,3,4$,d,e,f,g,h,i,j. We can abandon this hand since $1/2 + 1/3 + 1/4$ is already more than $1$ so we don't care what d,e,f,g,h,i and j are so we skip all those and try $2,3,5$,d,e,f,g,h,i,j but that is also more than $1$ ($1/2+1/3+1/5$) so we abandon all the $2,3,5$ combinations...

The 2nd check I do for abandoning a hand is when the accumulated sum so far is close to $1$ (such as $0.97619$) as is the case when a,b,c (the outermost $3$ loops) are $2,3,7$. That gives $1/2$ + $1/3$ + $1/7$ = $0.97619$.... There is no way $2,3,7$ can be part of a solution since the remaining $7$ cards (at minimum) would contribute $1/46$, $1/47$, $1/48$...$1/52$. So my program checks for this and doesn't even finish the $10$ card hand in cases like this.

$Update - 4$. I put another check in to speed up the program. I check the case where at each nested loop, the sum of the reciprocals of the remaining cards cannot possibly add up to 1 because even with the maximum value of the reciprocal of the remaining card ranks, the sum will be too low (below $1$). This tweak GREATLY improved the execution speed of the interpreted program. Just looping thru all $7.9$ billion card combinations took a very long $10$ hours or so (overnight). With the "abandon hand" checks for either too high (above $1$) or too low (below $1$) before checking for equal to $1$, I was able to get the program down to only $4.5$ minutes of runtime (more than $100$ times quicker running). Also, only about $73$ million hands were actually checked for equalness to $1$. The output on my screen is fun to watch cuz what used to crawl as all $7.9$ billion hands were checked now "rips" as about only $1$ in $100$ hands (compared to before) are now checked.

$Update-5$ With the help of the exclusion list, my almost $200$ lines of interpreted code runs in $6.7$ seconds. Special thanks to all involved with putting this exclusion list together. The original unoptimized program used to take about $10$ hours which is $36,000$ seconds. The highly optimized program now takes only $6.7$ seconds which is more than $5000$ times quicker to get the same correct answer of $1431$. There still may be a few more optimizations possible. Perhaps the exclusion list is not complete?