Probability that a stick randomly broken in five places can form a tetrahedron

Edit (June. 2015) This question has been moved to MathOverflow, where a recent write-up finds a similar approximation as leonbloy's post below; see here.


Randomly break a stick in five places.

Question: What is the probability that the resulting six pieces can form a tetrahedron?

Clearly satisfying the triangle inequality on each face is a necessary but not sufficient condition; an example is provided below.

Furthermore, another commenter kindly points to a reference that may be of help in resolving this problem. In particular, it relates the question of when six numbers can be edges of a tetrahedron to a certain $5 \times 5$ determinant.

Finally, a third commenter points out that since one such construction is possible, there is an admissible neighborhood around this arrangement, so that the probability is in fact positive.

In any event, this problem is far harder than the classic $2D$ "form a triangle" one.

Several numerical attacks can be found below; I will be grateful if anyone can provide an exact solution.


Not an answer, but it might help to progress further. [The derivation that follows provides a strict -but practically useless- bound. The second part of the answer has some results that might be of interest, but they are merely empirical]


Let's consider the (much more restricted) event that the six lengths form a tetrahedron in whatever order. In the linked pdf this set of lengths is called "completely tetrahedral", and a necessary-sufficient condition is given (Theorem 4.2) which is equivalent to the following: $ u \le \sqrt{2}v $, where $u,v$ are the maximum and minimum lengths. This, informally, would correspond to "almost regular" tetrahedra. Let's then compute the probability that the lengths are completely tetrahedral. Because the points are chosen at random, uniformly, the lengths are probabilistically equivalent to a set of iid exponential variables with arbitrary parameter, conditioned to a constant sum. Because we are only interested in ratios, we can even ignore this conditioning. Now, the joint probability of the maximum and minimum of a set of $n$ iid variables is given by

$$ f_{u,v}= n(n-1) f(u) f(v) [F(u) -F(v)]^{n-2}, \hskip{1cm} u\ge v$$

In our case: $n=6$, $f(u)=e^{-u}$, and the probability that $u<a \, v$ is a straightforward integral, which gives:

$$P(u<a \, v)= 5 (a -1)\left( \frac{1}{1+5\,a}-\frac{4}{2+4\,a}+\frac{6}{3+3\,a}-\frac{4}{4+2\,a}+\frac{1}{5+1\,a} \right)$$

And $P(u<\sqrt{2} v) \approx 7.46 \times 10^{-5}$ This should be a strict bound on the desired probability, but, surely, far from tight.

[Update: indeed, the bound is indeed practically useless, it corresponds to an irrelevant tail. The probability, as per my simulations, is around $p=0.06528$ ($N=10^9$ tries, $3 \, \sigma \approx 2.3 \times 10^{-5}$), which agrees with other results.]


The only empirical result that might be of interest: It's easy to see that, from the $6!$ possible permutations of the lenghts, we can restrict ourselves to $30$, from symmetry considerations; now, from my simulations, I've found that it's sufficient to consider 7 permutations, the first two being already enough for more than $90\%$ of the successes; and the (need to consider the) seventh one is extremely small. These permutations are:

$$p_1 = [0 , 1 , 4 , 5 , 3 , 2] \hskip{1 cm} (0.75837)\\ p_2 = [0 , 1 , 4 , 3 , 5 , 2] \hskip{1 cm} (0.15231)\\ p_3 = [0 , 2 , 4 , 1 , 5 , 3] \hskip{1 cm} (0.08165)\\ p_4 = [0 , 1 , 4 , 5 , 2 , 3] \hskip{1 cm} (0.00404)\\ p_5 = [0 , 1 , 4 , 2 , 5 , 3] \hskip{1 cm} (0.00245)\\ p_6 = [0 , 1 , 3 , 5 , 4 , 2] \hskip{1 cm} (0.00116)\\ p_7 = [0 , 1 , 3 , 4 , 5 , 2] \hskip{1 cm} (0.00002)\\ $$

The length indexes correspond to a sorted array (say, ascending), and following the convention of the linked paper: the first three sides have a common vertex, the following three are the corresponding opposite sides (so, for example, in the first permutation, and by far the most favorable one, the longest and shortest sides are opposite). The numbers on the right are the probability that this permutation (when one tries in the above order) is the successful one (given that they form a tetrahedron). I cannot be totally sure if there is some rare case that requires other permutation (very improbable, I'd say), but I'm quite sure (unless I've made some mistake) that the set cannot be further reduced.


This is not an answer but a long comment to other person's request of details.

Following is some details of my simulation. I hope this will be useful for those who are interested in numerics.

On a tetrahedron, I will use 6 variables $a,b,c,A,B,C$ to represent the lengths of 6 edges. $a,b,c$ corresponds to the edges connected to an arbitrary vertex and $A,B,C$ are lengths of corresponding opposite edges.

Step 1 Pre-generate the set of 120 permutations of 5 symbols and filter away equivalent ones down to 30. $$\begin{align} &S\; = \operatorname{Perm}(\{\,1,2,3,4,5\,\})\\ \xrightarrow{\text{filter}} & S' = \{\, \pi \in S : \pi(1) = \min(\pi(1),\pi(2),\pi(4),\pi(5))\, \} \end{align}$$

The filtering condition corresponds to the fact once a pair of opposite edge $(a,A)$ is chosen, there is a 4-fold symmetry in assigning the remaining 2 pairs of opposite edges. Following 4 assignments of lengths leads to equivalent tetrahedra.

$$(a,b,c,A,B,C) \equiv (a,c,b,A,C,B) \equiv (a,B,C,A,b,c) \equiv (a,C,B,A,c,b)$$

Step 2 Draw 5 uniform random numbers from $[0,1]$, sort them and turn them into 6 lengths:
$$\begin{align}& X_i = \operatorname{Rand}(0,1), i = 1,\ldots 5\\ \xrightarrow{\text{sort} X_i} & 0 \le X_1 \le \ldots \le X_5 \le 1\\ \xrightarrow{Y_i = X_{i+1}-X_i} &Y_0 = X_1,\,Y_1 = X_2-X_1,\, \ldots,\, Y_5 = 1-X_5 \end{align}$$

Step 3 Loop through the list of permuation in $S'$, for each permutation $\pi$, assign the 6 lengths to the 6 edges:

$$(Y_0, Y_{\pi(1)}, Y_{\pi(2)}, \ldots, Y_{\pi(5)}) \longrightarrow (a, b, c, A, B, C )$$

Verify whether this assignment generate a valid teterhedron by checking:

  • All faces satisfies the triangular inequality. This can be compactly represented as: $$\min(a+b+c,a+B+C,A+b+C,A+B+c) > \max(a+A,b+B,c+C)$$
  • The corresponding Cayler-Menger determinant is positive. Up to a scaling factor, this is:

$$\left|\begin{matrix}0 & 1 & 1 & 1 & 1\cr 1 & 0 & {a}^{2} & {b}^{2} & {c}^{2}\cr 1 & {a}^{2} & 0 & {C}^{2} & {B}^{2}\cr 1 & {b}^{2} & {C}^{2} & 0 & {A}^{2}\cr 1 & {c}^{2} & {B}^{2} & {A}^{2} & 0\end{matrix}\right| > 0$$

If this configuration is admissible, record it and goes to Step 2. If not, try other permutations from $S'$.

Some comment about whether this is useful for exact answer.

A $N = 10^9$ simulation is definitely not enough. The probability of forming a tetrahedron is about $p = 0.065$. Such a simulation will give us a number accurate to about $\sqrt{\frac{p(1-p)}{N}} \sim 1.5 \times 10^{-5}$. i.e. a 5 digit accuracy.

Up to what I heard, we need about 7 digit of accuracy before we have a chance to feed this into existing Pluoffe's Inverter and find whether this number look like a combination of simple mathematical constants.

Until one can speed up the algorithm to have a $N > 10^{13}$ simulation or have a better control of the error terms. Simulation remains only useful for cross checking purposes.


Disclaimer: This is another long comment reporting on the attempt to attack the problem numerically.


Instead of parametrizing the 6 segments of the unit stick as $$(u_1,u_2-u_1,u_3-u_2,u_4-u_3,u_5-u_4,1-u_5)$$ where $(u_1,u_2,u_3,u_4,u_5)$ are order statistics of the uniform distribution with pdf $$ f_U(u_1,u_2,u_3,u_4,u_5) = 5! \cdot \left[0<u_1<u_2<u_3<u_4<u_5<1 \right] $$ I am using a different parametrization: $$ (1-w_1, w_1 (1-w_2), w_1 w_2 (1-w_3), w_1 w_2 w_3 (1-w_4), w_1 w_2 w_3 w_4 (1-w_5), w_1 w_2 w_3 w_4 w_5 ) $$ It is easy to solve for $\{w_k\}$ in terms of $\{u_k\}$: $$ w_1 = 1-u_1, w_2 = \frac{1-u_2}{1-u_1}, w_3 = \frac{1-u_3}{1-u_2}, w_4 = \frac{1-u_4}{1-u_3}, w_5 = \frac{1-u_5}{1-u_4} $$ which allows to find their induced measure: $$ f_W(w_1,w_2,w_3,w_4,w_5) = (5 w_1^4 [0<w_1<1]) \cdot (4 w_2^3 [0<w_2<1]) \cdot (3 w_3^2 [0<w_3<1] ) \cdot (2 w_4 [0<w_4<1]) \cdot ( [0<w_5<1] ) $$ meaning that $w_k$ are independent random variable with different power distributions on the unit interval.

This parametrization is friendlier to numerical integration routines.

We then proceed much like @achille-hui . Here is Mathematica code I ran:

TriangleInequalities[{a_, b_, c_}] := 
 a < b + c && b < a + c && c < a + b

FacialTetrahedron[{x_, y_, z_, xb_, yb_, zb_}] := 
 TriangleInequalities[{x, y, zb}] && 
  TriangleInequalities[{x, yb, z}] && 
  TriangleInequalities[{xb, y, z}] && 
  TriangleInequalities[{xb, yb, zb}]

TetrahedraSextupleQ[{x_, y_, z_, xb_, yb_, zb_}] := 
 FacialTetrahedron[{x, y, z, xb, yb, zb}] && 
  Det[{{0, x^2, y^2, z^2, 1}, {x^2, 0, zb^2, yb^2, 1}, {y^2, zb^2, 0, 
      xb^2, 1}, {z^2, yb^2, xb^2, 0, 1}, {1, 1, 1, 1, 0}}] > 0

We now build the event that one can form a tetrahedron out of 6 pieces the unit stick is divided into. The following takes a while to compute.

event2 = Assuming[
   0 < w1 < 1 && 0 < w2 < 1 && 0 < w3 < 1 && 0 < w4 < 1 && 0 < w5 < 1,
    Simplify[
    Apply[Or, 
     Simplify[TetrahedraSextupleQ[#]] & /@ 
      Permutations[{(1 - w1), 
        w1 (1 - w2), w1 w2 (1 - w3), w1 w2 w3 (1 - w4), 
           w1 w2 w3 w4 (1 - w5), w1 w2 w3 w4 w5 }]]]];

Therefore I saved the resulting predicate in paste-bin. Here is how to import it:

event2 = ToExpression[
   Import["http://pastebin.com/raw.php?i=399MDkGQ", "Text"], 
   InputForm];

We now define a compiled filter function to decide if a random vector fires the event.

cfFunc2 = With[{ee = event2},                                                   
           Compile[{{arg, _Real, 1}}, Block[{w1, w2, w3, w4, w5},               
             {w1, w2, w3, w4, w5} = arg;                                        
             If[ee, 1, 0]], RuntimeAttributes -> Listable]];

The function above allows to efficiently run the Monte-Carlo simulation. Here is the simulation that takes some 4.7hours:

In[3]:= Block[{sample, tot = 0, suc = 0},                                       
          While[suc <= 10^9,                                                    
           sample =                                                             
            RandomVariate[                                                      
             ProductDistribution[PowerDistribution[1, 5],                       
              PowerDistribution[1, 4], PowerDistribution[1, 3],                 
              PowerDistribution[1, 2], UniformDistribution[]], 3*10^8];         
           tot += Length[sample];                                               
           suc += Total[cfFunc2[sample]];                                       
           ];                                                                   
          {suc, tot}                                                            
          ] // AbsoluteTiming                                                   

Out[3]= {16994.098510, {1018403735, 15600000000}}

Entailing the following $(1-10^{-8})/2$-level confidence interval:

In[38]:= Block[{suc = 1018403735, tot = 15600000000}, 
 PlusMinus[N[suc/tot], 
  Sqrt[2.] InverseErfc[10^-8] Sqrt[
    With[{p = suc/tot}, p (1 - p)/tot]]]]

Out[38]= 0.0652823 \[PlusMinus] 0.0000113341

The advantage of the $w$-parametrization is that Cartesian quadrature rules can be applied. Using the fact that $w_k = 2^{-1/k}$ for $1 \leqslant k \leqslant 5$ furnishes a tetrahedral splitting:

In[160]:= event2 /. {w1 -> (1/2)^(1/5), w2 -> (1/2)^(1/4), 
  w3 -> (1/2)^(1/3), w4 -> (1/2)^(1/2), w5 -> 1/2`}

Out[160]= True

we can help numerical quadrature algorithm to find a sample point inside the region of interest. So with enough time at hand it should be possible to get a more precise quadrature answer:

AbsoluteTiming[
 prob = NIntegrate[
   f[w1, w2, w3, w4, w5], {w1, 0, (1/2)^(1/5), 1}, {w2, 
    0, (1/2)^(1/4), 1}, {w3, 0, (1/2)^(1/3), 1}, {w4, 0, (1/2)^(1/2), 
    1}, {w5, 0, 1/2, 1}, 
   Method -> {"CartesianRule", "SymbolicProcessing" -> 0}, 
   MaxRecursion -> 14]]

However, after some 20 hours, this integration command is still running... I will post the answer once the evaluation is complete.