Black and white shirts puzzle
Solution 1:
For specific numbers of shirts and durabilities, the problem can be modeled as a finite-state absorbing Markov chain, but the number of states quickly gets out of hand. Say that a shirt has initial durability of $d$. Its durability can be anything from $0$ to $d$ and it can be clean or dirty, giving $2(d+1)$ states. If we have $s$ shirts to begin with, this gives $2^s(d+1)^s$ states. This is an overstatement since we only care how many clean white shirts of durability $a$ there are, not which whites shirts of durability $a$ are clean, but it gives a rough idea. So if we initially have $6$ white shirts of durability $9$ and $4$ black shirts of durability $4$, this gives $20^6\cdot10^4=6.4\times10^{11}$ states.
Say that we've wildly overestimated the number of states and that there are really only a million. We would still need a one million by one million matrix to solve the problem with a Markov chain. This is not practicable.
I've thought about trying to simplify it by just saying that we get a total of $m(d+1)$ white-shirt days and $n(u+1)$ black shirt days, but that doesn't really capture all the intricacies of the problem, because if we happen to pick the same white shirt repeatedly, we wear it out, and the probability of picking a white shirt in the future goes down.
Unless you have only a very few cheap shirts that are don't stand up to washing, I don't think it is practical to try to solve the problem exactly. As far as a general solution in terms of $m,n,d,u$ goes, I'd be really surprised to see one.
I think the best way to do this is by simulation. You might try different values of the variables to see if you notice any patterns.
I wrote a python script to do the simulation. I changed the formulation slightly; instead of "durability" I have "uses", the number of times the shirt can be worn. This is one more than your definition of durability. The uses are shown as blackWear
and whiteWear
in the script.
from random import choice
from collections import namedtuple
from math import sqrt
Shirt = namedtuple('Shirt', ['color', 'uses'])
colors = ('white', 'black')
whites = 6
blacks = 4
whiteWear = 4
blackWear = 6
trials = 10000
exhaust = {'white':0, 'black':0}
for _ in range(trials):
cleanShirts = whites*[Shirt('white', whiteWear)]
cleanShirts.extend( blacks*[Shirt('black', blackWear)])
shirts = {'white':whites, 'black': blacks}
clean = {'white':whites, 'black': blacks}
dirtyShirts = []
while shirts['white'] and shirts['black']:
while clean['white'] and clean['black']:
wear = choice(cleanShirts)
cleanShirts.remove(wear)
clean[wear.color] -= 1
dirtyShirts.append(wear)
for shirt in dirtyShirts:
washed = Shirt(shirt.color, shirt.uses-1)
if washed.uses > 0:
cleanShirts.append(washed)
clean[washed.color] += 1
else:
shirts[washed.color] -= 1
dirtyShirts = []
for color in colors:
exhaust[color] += shirts[color] == 0
print('trials:', trials)
for color in colors:
print(color, 'exhausted', exhaust[color])
w = exhaust['white']/trials
variance = w -w*w
sigma = sqrt(variance/trials)
delta = 1.96*sigma
print('95%% confidence: (%.6f, %.6f)' %(w-delta, w+delta))
This produced the output
trials: 10000
white exhausted 8401
black exhausted 1599
95% confidence: (0.832916, 0.847284)
The last line means that we can have $95\%$ confidence that the probability of exhausting the white shirts first lies between the two values shown.
I found this very surprising. You have a total of $24$ wearings of each color possible, and you have more white shirts than black shirts to start, but you run out of white shirts $84\%$ of the time. I've repeated the experiment a few times, with the same result, so I'm confident in its correctness.
It would be interesting to see how the probability changes with different numbers of shirts and uses. You can just change the values for whites
, blacks
, whiteWear
and blackWear
at the top of the script.
EDIT
I finally got around to running some more tests. First, I modified the python script, to make it easier to run multiple tests, and to consolidate the output. There are no substantive changes, but I'll post the new script anyway, in case you want to run some more tests of your own.
from random import choice
from collections import namedtuple
from math import sqrt
Shirt = namedtuple('Shirt', ['color', 'uses'])
Result = namedtuple('Result', ['trials', 'whites', 'blacks', 'whiteWear',
'blackWear', 'prob', 'delta'])
colors = ('white', 'black')
def tests(trials, whites, blacks, whiteWear, blackWear):
exhaust = 0
for _ in range(trials):
exhaust += test(whites, blacks, whiteWear, blackWear)
w = exhaust/trials # sample probability of running out of white shirts
variance = w -w*w
sigma = sqrt(variance/trials)
delta = 1.96*sigma
return Result(trials, whites, blacks, whiteWear, blackWear, w, delta)
def test(whites, blacks, whiteWear, blackWear):
cleanShirts = whites*[Shirt('white', whiteWear)]
cleanShirts.extend( blacks*[Shirt('black', blackWear)])
shirts = {'white':whites, 'black': blacks}
clean = {'white':whites, 'black': blacks}
dirtyShirts = []
while shirts['white'] and shirts['black']:
while clean['white'] and clean['black']:
wear = choice(cleanShirts)
cleanShirts.remove(wear)
clean[wear.color] -= 1
dirtyShirts.append(wear)
for shirt in dirtyShirts:
washed = Shirt(shirt.color, shirt.uses-1)
if washed.uses > 0:
cleanShirts.append(washed)
clean[washed.color] += 1
else:
shirts[washed.color] -= 1
dirtyShirts = []
return shirts['white'] == 0
def report(results):
for result in results:
print('%6d %6d %6d %5d %5d %11.5f %.6f'%result)
print()
trials = 10000
whiteWear = 4
blackWear = 6
results = [ ]
print('Trials Whites Blacks W_Use B_Use Probability Delta\n')
for n in range (1,5):
whites = 3*n
blacks = 2*n
results.append(tests(trials, whites, blacks, whiteWear, blackWear))
report(results)
results = []
for n in range(2,6):
whites = 2*n
blacks = n
results.append(tests(trials, whites, blacks, whiteWear, blackWear))
report(results)
results = []
for n in range (2,7):
whites = n
blacks = 2
results.append(tests(trials, whites, blacks, whiteWear, blackWear))
report(results)
In all the tests that I ran, I used $10,000$ trials, and I kept the durability at $6$ wearings for a black shirt, and $4$ for a white. Here are the results:
Trials Whites Blacks W_Use B_Use Probability Delta
10000 3 2 4 6 0.71320 0.008864
10000 6 4 4 6 0.83970 0.007191
10000 9 6 4 6 0.89370 0.006041
10000 12 8 4 6 0.92770 0.005076
10000 4 2 4 6 0.63210 0.009452
10000 6 3 4 6 0.71430 0.008854
10000 8 4 4 6 0.77090 0.008237
10000 10 5 4 6 0.81880 0.007550
10000 2 2 4 6 0.82720 0.007410
10000 3 2 4 6 0.72210 0.008780
10000 4 2 4 6 0.63030 0.009461
10000 5 2 4 6 0.55230 0.009746
10000 6 2 4 6 0.49690 0.009800
The "Probability" is the sample probability of running out of white shirts first. "Delta" gives the "sampling error", so that in the first case, where we have three white shirts and two blacks shirts, the probability of running out of white shirts first is $0.71320\pm 0.008864$ at the $95\%$ confidence level.
In the first set of trials, I tested what happens if we keep the ratio of white shirts to black shirts the same, but increase the number of shirts. We see that the dominance of quality over quantity becomes more marked, as the number of shirts increases. (I recognize that durability is not the only aspect of quality; this is just verbal shorthand.)
In the next set of trials, I increased the ratio of white shirts to black from $3:2$ to $2:1$. As I expected, the general pattern was the same, though it was more likely that we'd run out of black shirts first.
In the last set of trials, I kept the number of black shirts constant at $2$, but increased the number of white shirts. We had to get to $6$ white shirts until we were about equally likely to run out of black shirts first. It really is too close to call, especially when you take $\delta$ into account. (That The last case is so close to fifty-fifty is purely fortuitous. I just decided to run five cases for no particular reason.)
Solution 2:
This is a partial answer. Update built on top of this: as a self contained answer.
Part 1/3: Solving for any $(m,n)$ and $u,v=1,1$
Let's restate the problem for convenience
Each day a shirt is worn (uniformly randomly among all shirts not placed aside), and then placed aside. After one entire color is worn (placed aside), we do the "wash" process (possibly throwing out some shirts) and then continue wearing shirts; ("wash" returns the sided shirts, so we are starting again, just with lower durabilities on shirts.). We terminate the shirt wearing after all shirts of one color have been thrown out. That is,
Let's define that we have:
$m$ black shirts with durabilities $u_1,\dots,u_m$
$n$ white shirts with durabilities $v_1,\dots,v_n$
At the start and after every wash let $G=(u_1,\dots,u_m,v_1,\dots,v_n)$ denote our state of durabilities.
At the start ($0\text{th}$ wash), as given in the question, we have $G_0=(u_1=u,\dots,u_m=u,v_1=v,\dots,v_n=v)$. That is, all black shirts start with durability $u\gt 0$, and all white shirts with durability $v\gt 0$.
After $n\text{th}$ wash, we will have some change of durabilities $G_{n-1}\to G_{n}$. That is, all worn shirts (placed aside) will decrease in durability by $1$.
After every wash, we throw out all shirts that have durability $=0$ (and return the rest to be worn).
The questions we want to answer:
What is the probability of throwing one entire color out before the other (black or white)?
What is the probability of throwing one entire color out on day $d$ (either)?
What is the expected number of days that it'll take to throw out an entire color (either)?
Lets start solving the first probabilities
$[1]$ First wash: ($G_0 \to G_1$)
Before the first wash, we will either wear $m+n_0$ shirts or $n+m_0$ shirts, where $0\le n_0\lt n$ are white and $0\le m_0\lt m$ are black shirts worn, before the wash.
The probability to wear all black shirts and also wear $0\le n_0\lt n$ white shirts in between, is: $$ P_{n_0} \left( \begin{array}{} G_0 & \to (u_1,&\dots,&u_m,&v_1,&\dots,&v_{n_0},&v_{n_0+1},&\dots,&v_n&) \\ & \to (u-1,&\dots,&u-1,&v-1,&\dots,&v-1,&v,&\dots,&v&) \end{array} \right) =\frac{\binom{m-1+n_0}{n_0}}{\binom{m+n}{n}} $$
The probability to wear all white shirts and also wear $0\le m_0\lt m$ black shirts in between, is: $$ P_{m_0} \left( \begin{array}{} G_0 & \to (u_1,&\dots,&u_{m_0},&u_{m_0+1},&\dots,&u_m&v_1,&\dots,&v_n) \\ & \to (u-1,&\dots,&u-1,&u,&\dots,&u,&v-1,&\dots,&v-1) \end{array} \right) =\frac{\binom{n-1+m_0}{m_0}}{\binom{m+n}{m}} $$
Proof for $P(G_0\to G_1)$ in the first wash: (Skip if you understand above probabilities)
The above probabilities can be found using basic combinatorics (counting sequences):
We have $X=\text{"wearing all black shirts and }n_0\text{ white shirts"}$. We want to find:
$$P(X)=\frac{\text{No. events of type }X}{\text{No. all events}}$$
1) All possible sequences of shirt wearing
You'll be picking shirts from a uniform sequence of black and white shirts with $n+m$ total shirts, in order:
$$a_1,\dots,a_{n+m},a_i\in\{\text{"black" = 0}\text{,"white" } = 1\}$$
How many such sequences are there? (how many combinations of $m$ of $0$'s and $n$ of $1$'s ?)
The answer is: $$\binom{n+m}{m}=\binom{n+m}{n}$$
Where I assume you are familiar with combinations and binomial coefficients.
Otherwise, you can interpret the above as: "Out of $n+m$ choose $m$" $=$ "Out of $n+m$ choose $n$".
It is sufficient to choose spots for all $0$s (or $1$s) in the sequence, to determine the spots for all elements.
2) All possible events of shirt-wearing, that satisfy $X$:
The event of picking all black and $0\le n_0\lt n$ white shirts occurs if and only if when we have a sequence:
$$a_1,\dots,a_{n+m}=s_1,\dots,s_{m-1+n_0},b_1,w_1,\dots,w_{n-n_0}$$
Where there are $m-1$ black and $n_0$ white shirts among shirts $s_1\dots,s_{m-1+n_0}$, where $b_1$ is the last black shirt (that'll trigger the wash), and $w_1,\dots,w_{n-n_0}$ are the rest of the $n-n_0$ white shirts.
How many sequences like this one are there? Easy combinatorics of combinations:
$$ \binom{m-1+n_0}{n_0}=\binom{m-1+n_0}{m-1} $$
Again, the same as previous counting, we only need to permute either type and only among $s_i$ elements because $b_1$ and all $w_i$ have their spots already determined.
Now, basic probability, we put these two results together to get:
$$P(X=\text{washing all black shirts and }n_0\text{ white shirts})=\frac{\binom{m-1+n_0}{n_0}}{\binom{n+m}{m}}$$
The case for white and some black, instead of black and some white, is done exactly the same.
End of proof.
Lets answer all three questions for base case, we consider $(u,v)=(1,1)$
Lets solve the questions for simplest case: $u=v=1$ are initial durabilities.
This implies we've ran out of one entire color at wash $G_1$.
Lets answer (question 1.) : $X_m=\text{Running out of all black shirts}$, $Y_n=\text{Running out of all white shirts}$.
We want to know $P_{(u,v)}(X_m)$ and $P_{(u,v)}(Y_n)$ for $u=v=1$. That is:
$$P_{(1,1)}(X_m)=\sum_{k=0}^{n-1}P_{n_0=k}(G_0\to G_1)=\frac{\binom{m -1 + n}{n - 1}}{\binom{m + n}{ m}}=\frac{n}{n+m}$$
$$P_{(1,1)}(Y_n)=\sum_{k=0}^{n-1}P_{m_0=k}(G_0\to G_1)=\frac{\binom{m -1 + n}{m - 1}}{\binom{m + n}{ m}}=\frac{m}{n+m}$$
The sum can be computed using combinatorial identities, or simply using a wolfram query.
We are summing all probabilities in $[1]$, since:
$$X_m = \text{m black, 0 white worn OR m black, 1 white worn OR ... OR m black, }{n_0-1}\text{ white worn.}$$
And we know from basic probability, that for mutually exclusive events $x_1,x_2,x_3,\dots$:
$$P(x_1 \text{ or } x_2 \text{ or } x_3 \text{ or } \dots)=P(x_1)+P(x_2)+P(x_3)+\dots$$
Lets answer (question 2.) for base case: Probability $P_{(1,1)}(d)$ of throwing out one entire color on day $d$?
Since we wear a shirt each day, and since base case terminates at $G_1$, first wash, we can use $[1]$ directly.
So we have $m\le d \le m+n_0-1$, and $n\le d \le n+m_0-1$, for such $d$:
$$P_{(1,1)}(d_m=d)=P_{n_0=d-m}(G_0\to G_1)=\frac{\binom{m-1+n_0}{n_0}}{\binom{m+n}{n}}=\frac{\binom{d-1}{d-m}}{\binom{m+n}{n}}$$
$$P_{(1,1)}(d_n=d)=P_{m_0=d-n}(G_0\to G_1)=\frac{\binom{n-1+m_0}{m_0}}{\binom{m+n}{m}}=\frac{\binom{d-1}{d-n}}{\binom{m+n}{m}}$$
Finally, we have (by probability of mutually exclusive events):
$$P_{(1,1)}(d)=P_{(1,1)}(d_m=d)+P_{(1,1)}(d_n=d)=\frac{\binom{d-1}{d-m}}{\binom{m+n}{n}}+\frac{\binom{d-1}{d-n}}{\binom{m+n}{m}}$$
Where note that $P_{(1,1)}(d)=P_{(1,1)}(d_m=d)$ if $d\lt n$, and $P_{(1,1)}(d)=P_{(1,1)}(d_n=d)$ if $d\lt m$, only.
And obviously, $P_{(1,1)}(d)=0$ if $d<n$ and $d<m$.
We're left to answer (question 3.) for the base case, The expected number of days to throw out a color?
We are calculating the expectation of a random variable:
$$E_d = E_{d_m} + E_{d_n} = \sum_{d=m}^{m+n-1} d P_{(1,1)}(d_m=d)+\sum_{d=n}^{n+m-1} d P_{(1,1)}(d_n=d) =2\frac{n (m + n) \binom{-1 + m + n}{ n}}{(1 + m) \binom{m + n}{ n}} =\frac{2mn}{m+n}$$
So if you have $m,n$ black, white shirts with durabilities $u,v=1$, (note that in the re-stated problem, we throw a shirt away at the moment when its durability reaches $0$), you are expecting to lose a full color in $d=2mn/(m+n)$ days.
Part 2/3: Solving for any $(m,n)$ shirt count, and any $u,v \le k$ durabilities?
"What is the probability that I throw out all my white shirts before my black shirts?"
WLOG I've swapped white and black shirts.
Let $P(X;u,v)$ be probability of throwing out all $m$ black shirts first, before $n$ white shirts.
Let $u,v$ be the initial durabilities of black and white shirts respectively.
For $k=1$, I have already shown below (in initial "Partial answer" section) that:
$$P(X;1,1)=\frac{n}{m+n}$$
For $k=2$, we can show that:
$$P(X;2,1)=\sum_{n_1=0}^{n-1}\frac{\binom{m-1+n_1}{n_1}}{\binom{m+n}{m}}\frac{n-n_1}{n-n_1+m}$$
$$P(X;1,2)=\sum_{m_1=0}^{m-1}\frac{\binom{n-1+m_1}{m_1}}{\binom{m+n}{m}}\frac{n}{n-m-m_1}+\frac{n}{m+n}$$
$$ \begin{align} P(X;2,2) & = \sum_{n_1=0}^{n-1}\frac{\binom{m-1+n_1}{n_1}}{\binom{m+n}{m}}\sum_{m_2=0}^{m-1}\frac{\binom{n-1+m_2}{m_2}}{\binom{m+n}{m}}\frac{n-n_1}{n-n_1+m-m_2} \\ & + \sum_{m_1=0}^{m-1}\frac{\binom{n-1+m_1}{m_1}}{\binom{m+n}{m}}\sum_{n_2=0}^{n-1}\frac{\binom{m-1+n_2}{n_2}}{\binom{m+n}{m}}\frac{n-n_2}{n-n_2+m-m_1} \\ & + \left(\frac{n}{m+n}\right)^2 \end{align}$$
I'm not sure if we can simplify these expressions. For $k\ge3$, things get more "messier".
Also note that for some special cases, like $m=n,u=v$ we have $P(X)=P(Y)=\frac12$.
I have acquired these solutions by reducing the problem to only needing to observe $2$ different possibilities when moving $G_{n-1}\to G_n$ (See re-statement of the problem below.).
We are left with a binary tree covering all washes, then for finding a formula for some $u,v$ in terms of $m,n$, we just walk those paths of the tree that satisfy "running out of black shirts and still having at least one white shirt" at their endpoints, to get $P(m,n,u,v)$.
During the walk, we use the two probabilities found in $[1]$ down below, to find probabilities of moving from some step to another.
How to find some $P(m,n,u,v)$ for all $m,n$, given $u,v$?
The tree walks can be represented as walks on a square grid, from coordinates $(0,0)$ to $(u_0=u,v_0\le v-1)$, which represent maximums of spent durabilities, and those walk endpoints represent scenario $X$, running out of black shirts and having some white shirts.
The only permitted walks are right ($x$ walk) and up ($y$ walk), since we are always decreasing the durabilities (per wash).
For example, If you look at the $P(X;2,2)$ formula, we have walks $xyx+yxx+xx$ to reach $(1,2),(1,2),(0,2)$ endpoints. Because, when you walk $x$, you can then walk either $x$ or $y$ again. Then $xx$ ends the walk, and $xy$ permits only step $x$ now; Giving $xx+xyx$. And when you walk $y$, the only moves left are $xx$ giving $yxx$ walk.
Then sum these three walks to get the formula for $P(X;2,2)$.
For example, for $xx$ walk we have (See $[1]$ and how we calculated the base case, below):
$$ \text{"xx"} = \sum_{n_1=0}^{n-1}P_{n_1}\sum_{n_2=0}^{n-1}P_{n_2}=\sum_{n_1=0}^{n-1}\frac{\binom{m-1+n_1}{n_1}}{\binom{m+n}{n}} \sum_{n_2=0}^{n-1}\frac{\binom{m-1+n_2}{n_2}}{\binom{m+n}{n}}=\left(\frac{n}{m+n}\right)^2$$
For $yxx$ walk, we have, ( where $P'$ indicates durabilities depend on previous steps, since we have a possibility of throwing out some shirts, before running out of all black shirts):
$$\begin{align}\text{"yxx"} &= \sum_{m_1=0}^{m-1}P_{m_1}\sum_{n_2=0}^{n-1}P_{n_2}\sum_{n_3=0}^{n-n_2-1}P'_{n_3} \\ &= \sum_{m_1=0}^{m-1}\frac{\binom{n-1+m_1}{m_1}}{\binom{m+n}{m}}\sum_{n_2=0}^{n-1}\frac{\binom{m-1+n_2}{n_2}}{\binom{m+n}{m}}\sum_{n_3=0}^{n-n_2-1}\frac{\binom{m-m_1-1+n_3}{n_3}}{\binom{m-m_1+n-n_2}{m-m_1}} \\ &= \sum_{m_1=0}^{m-1}\frac{\binom{n-1+m_1}{m_1}}{\binom{m+n}{m}}\sum_{n_2=0}^{n-1}\frac{\binom{m-1+n_2}{n_2}}{\binom{m+n}{m}}\frac{n-n_2}{n-n_2+m-m_1} \end{align}$$
Do similarly for $xyx$ walk; and we now have $P(X;2,2)$.
I'm not sure how to simplify these sums.
Going for $k\ge 2$ in the similar way?
For $k\ge3$ you can do the same method to try to find formulas for any $u,v$ in terms of $m,n$.
But already at $k=3$ it gets one layer trickier since we have scenarios of a more intricate dependance $P''$: For example, having to split $n_2$ into $n_2^{(1)}+n_2^{(2)}$ as we can have now three kinds of durabilities $v,v-1,v-2$ we need to keep track of for white shirts.
Combining this approach into a way to compute $P(m,n,u,v)$ exactly, for all values?
Perhaps an algorithm can be made to walk the tree/grid, and then compute the walks for given $u,v$ case, and then use it to compute the probability for given $m,n$.
I'll need to look into the more intricate dependances that keep getting needed to be split into more scenarios every time $k$ (the bound on maximal $u,v$) grows.
But, if these sums can't be simplified for large $k$, then for every increase of $u,v$, the computations for exact probabilities will get heavier, and I'm not sure how practical the algorithm would be, to go for exact probabilities for large $u,v$.
Note that all of this was done with just elementary combinatorics and probability, so far. - Maybe some stronger tool can help with this further?
Part 3/3: Generalizing the previous part:
I've been playing with this a bit and generalized my method and shown how we can reach exact probabilities with "elementary combinatorics": (But at what cost?):
Generalizing the idea of walking the binary tree of scenarios.