Is there a distribution over all distributions in [0,1]?

Solution 1:

Here is a more structured version of my comment above: we sequentially build a CDF $F(x)$ by choosing its values on the rationals.

We seek CDFs $F(x)$ such that $F(x)=0$ for $x<0$ and $F(x)=1$ for $x\geq 1$. We shall specify $F(q)$ for all rational numbers $q \in [0,1]$. Let $\{q_0=1, q_1=0, q_2, q_3, q_4, ...\}$ be a listing of all rationals in $[0,1]$ (starting the list with $1$ and then $0$).

  1. Choose $X_0=1$ and define $F(1)=X_0=1$.

  2. Choose $X_1\sim U[0,1]$ and define $F(0)=X_1$.

  3. To enforce monotonicity, for each $i \in \{2, 3, 4, ...\}$ define the finite list $$A_i = \{q_0=1, q_1=0, q_2, ..., q_{i-1}\}$$ Since $0\in A_i$, at least one rational in $A_i$ is smaller than $q_i$. Define $l_i$ as the index of the largest rational in $A_i$ that is smaller than $q_i$. Similarly, since $1 \in A_i$, we can define $r_i$ as the index of the smallest rational in $A_i$ that is larger than $q_i$. Thus: $$ q_{l_i} < q_i < q_{r_i} \quad \forall i \in \{2, 3, 4, ...\}$$ Define the lower and upper bounds: $$ L_i = X_{l_i}, U_i = X_{r_i}$$ If $L_i=U_i$ define $X_i=L_i$ and $F(q_i)=X_i$. If $L_i<U_i$ then independently choose $X_i \sim U[L_i, U_i]$ and define $F(q_i)=X_i$.
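
The three steps above can be sketched in code for the first few rationals. This is only a rough illustration, not a definitive implementation: the function names `rationals_in_unit_interval` and `sample_cdf_on_rationals` are made up here, and listing the rationals by increasing denominator is an assumption (the construction only needs some listing that starts with $1$ and then $0$). The code necessarily stops after finitely many rationals, so it samples a finite piece of $F$ only.

```python
import random
from fractions import Fraction


def rationals_in_unit_interval(count):
    """Return the first `count` rationals in [0,1], listed as 1, 0, 1/2, 1/3, 2/3, ...

    The ordering by increasing denominator is an assumption of this sketch.
    """
    if count < 2:
        raise ValueError("need at least q_0 = 1 and q_1 = 0")
    listing = [Fraction(1), Fraction(0)]
    seen = set(listing)
    denom = 2
    while len(listing) < count:
        for num in range(1, denom):
            q = Fraction(num, denom)  # Fraction reduces, so 2/4 equals 1/2
            if q not in seen:
                seen.add(q)
                listing.append(q)
                if len(listing) == count:
                    break
        denom += 1
    return listing


def sample_cdf_on_rationals(count, rng):
    """Sample X_i = F(q_i) for i < count, following steps 1 to 3."""
    qs = rationals_in_unit_interval(count)
    xs = [1.0, rng.uniform(0.0, 1.0)]  # X_0 = F(1) = 1, X_1 = F(0) ~ U[0,1]
    for i in range(2, count):
        # l_i: index of the largest earlier rational below q_i
        l_i = max((qs[j], j) for j in range(i) if qs[j] < qs[i])[1]
        # r_i: index of the smallest earlier rational above q_i
        r_i = min((qs[j], j) for j in range(i) if qs[j] > qs[i])[1]
        lower, upper = xs[l_i], xs[r_i]
        if lower == upper:
            xs.append(lower)
        else:
            # Clamp to guard against floating-point rounding at the endpoints.
            xs.append(min(upper, max(lower, rng.uniform(lower, upper))))
    return qs, xs


qs, xs = sample_cdf_on_rationals(50, random.Random(0))
for q, x in sorted(zip(qs, xs))[:5]:
    print(q, round(x, 4))
```

Sorting the pairs by $q_i$ shows the sampled values are nondecreasing, which is the monotonicity that step 3 enforces.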


We have now built random values $F(q_i)$ for all $i \in \{0, 1, 2, ...\}$. By construction the values are monotone and satisfy $0\leq F(0)\leq F(1)=1$. For each fixed $i$, right-continuity holds at $q_i$ with probability 1, that is, $F(q_j)\rightarrow F(q_i)$ along any sequence of rationals $\{q_j\}_{j \in J}$ that converges to $q_i$ from the right. Since there are only countably many rationals, this property holds simultaneously for all $q_i$ with probability 1. Throw away any sample path for which right-continuity fails and try again (this happens with probability 0). Now define $F(x)$ for irrational $x \in [0,1]$ by taking right-limits over rationals (the right-limits exist by monotonicity). The result is a valid CDF.

You can define the distance between two CDFs $F(x)$ and $G(x)$ by $$d(F,G)= \sum_{i=0}^{\infty}2^{-(i+1)}|F(q_i)-G(q_i)|$$ Fix $\epsilon>0$. Fix any CDF $G(x)$ that satisfies $G(x)=0$ if $x<0$ and $G(x)=1$ if $x\geq 1$. If we randomly choose $F(x)$ according to the above procedure, then $$P[d(F,G)<\epsilon]>0$$ To see this, fix a positive integer $n\geq 3$ such that $\sum_{i=n+1}^{\infty} 2^{-(i+1)} < \epsilon/2$. The $i=0$ term of $d(F,G)$ vanishes because $F(1)=G(1)=1$, and $|F(q_i)-G(q_i)|\leq 1$ for every $i$, so the terms with $i>n$ contribute less than $\epsilon/2$ in total. Then $$ \{d(F,G)<\epsilon\} \supseteq \bigcap_{i=1}^n \{|X_i-G(q_i)|<\epsilon/2\} $$
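
As a small numerical illustration of the truncation used here, the sketch below computes the partial sum of $d(F,G)$ over the first $n$ listed rationals, together with the bound $\sum_{i=n}^{\infty}2^{-(i+1)} = 2^{-n}$ on the terms left out (valid because each $|F(q_i)-G(q_i)|\leq 1$). The helper name `truncated_distance` and the example values are hypothetical; the inputs are assumed to be the CDF values at $q_0, q_1, \ldots, q_{n-1}$ in the order of the listing.

```python
def truncated_distance(f_values, g_values):
    """Partial sum of d(F, G) over indices 0..n-1, plus a bound on the omitted tail.

    f_values[i] and g_values[i] are assumed to be F(q_i) and G(q_i).
    Returns (partial_sum, tail_bound) with partial_sum <= d(F, G) <= partial_sum + tail_bound.
    """
    if len(f_values) != len(g_values):
        raise ValueError("both CDFs must be evaluated at the same rationals")
    n = len(f_values)
    partial_sum = sum(
        2.0 ** -(i + 1) * abs(f - g)
        for i, (f, g) in enumerate(zip(f_values, g_values))
    )
    tail_bound = 2.0 ** -n  # sum of 2^{-(i+1)} over i >= n
    return partial_sum, tail_bound


# Example values at q_0 = 1, q_1 = 0, q_2 = 1/2 (illustrative numbers only).
g_values = [1.0, 0.0, 0.5]   # G(q) = q, the uniform CDF on [0,1]
f_values = [1.0, 0.5, 0.75]  # a hypothetical sampled F
partial_sum, tail_bound = truncated_distance(f_values, g_values)
print(partial_sum, tail_bound)
```

The weight $2^{-(i+1)}$ shrinks geometrically, so matching $G$ closely on finitely many rationals already controls the whole distance, which is the idea behind the inclusion above.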