A sequence of order statistics from an iid sequence

Given a sequence of iid random variables $X_i$ (without loss of generality from $U(0,1)$), an integer $k \ge 1$ and some $p \in (0,1)$, construct the sequence of random vectors $Z^{(j)}$, $j=0,1,...$ in the following way. Let

$$Z^{(0)}=(X_{(1)},...,X_{(k)}),$$

where $X_{(l)}$ is the $l$-th order statistic of the sample $\{X_1,...,X_k\}$. Introduce the notation

\begin{align} Z^{(j)}&=(Z_{j,1},...,Z_{j,k}),\\ m_j&=\min(Z_{j-1,1},...,Z_{j-1,k},X_{k+j}),\\ M_j&=\max(Z_{j-1,1},...,Z_{j-1,k},X_{k+j}) \end{align}

Then

$$Z^{(j)}=(Y_{(1)},...,Y_{(k)})$$

where $Y_{(l)}$ is the $l$-th order statistic of

  1. the set $\{Z_{j-1,1},...,Z_{j-1,k},X_{k+j}\}\setminus\{m_j\}$ with probability $p$, or
  2. the set $\{Z_{j-1,1},...,Z_{j-1,k},X_{k+j}\}\setminus\{M_j\}$ with probability $1-p$.

The decision between cases 1. and 2. is made independently of the $X_i$ (and hence of the $Z^{(i)}$).

The $Z^{(j)}$ are supported on the $k$-dimensional simplex $S_k = \{(x_1, \dots, x_k) \in \mathbb{R}^k \, | \, 0 \le x_1 \le x_2 \le \dots \le x_k \le 1 \}$.
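The construction is straightforward to simulate; here is a minimal Python sketch (the function name is my own), assuming $U(0,1)$ inputs:

```python
import random

def simulate_Z(k, p, steps, seed=0):
    """Simulate Z^(steps): keep k sorted values; at each step a new
    U(0,1) variable X_{k+j} arrives, and then either the minimum
    (with probability p) or the maximum (with probability 1 - p)
    of the k + 1 values is removed."""
    rng = random.Random(seed)
    z = sorted(rng.random() for _ in range(k))  # Z^(0)
    for _ in range(steps):
        z.append(rng.random())                  # X_{k+j} arrives
        z.sort()
        if rng.random() < p:
            z.pop(0)                            # remove the minimum m_j
        else:
            z.pop()                             # remove the maximum M_j
    return z
```

Every vector produced this way indeed lies in $S_k$.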

It appears that the $Z^{(j)}$ converge in distribution. Is this known? Is anything known about the limiting distribution?

For the case $k=1$, the answer is the following. Denote the cdf of $Z^{(j)}$ by $F_j$.

With probability $p$ the minimum of $\{Z^{(n)}, X_{n+1}\}$ is removed, so $Z^{(n+1)}=\max(X_{n+1},Z^{(n)})$; with probability $1-p$ the maximum is removed, so $Z^{(n+1)}=\min(X_{n+1},Z^{(n)})$. For the $U(0,1)$ case the cdf of $\min(X_{n+1},Z^{(n)})$ is

$$x+F_n(x)-xF_n(x)$$

and the cdf of $\max(X_{n+1},Z^{(n)})$ is

$$xF_n(x).$$

Hence

\begin{align} F_{n+1}(x)&=p\,xF_n(x)+(1-p)\big(x+F_n(x)-xF_n(x)\big)\\ &=(1-p)x+\big(px+(1-p)(1-x)\big)F_n(x) \end{align}

Since $px+(1-p)(1-x)\in(0,1)$ for $x\in(0,1)$, iterating gives

$$\lim_{n\to\infty} F_{n}(x)=\frac{(1-p)x}{1-px-(1-p)(1-x)}$$
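As a quick numerical sanity check, one can iterate the recursion on a fixed $x$ and also simulate the $k=1$ process directly (a rough Monte Carlo sketch; sample sizes and function names are my own choices):

```python
import random

def iterate_F(p, x, steps=100):
    """Iterate F_{n+1}(x) = p*x*F_n(x) + (1-p)*(x + F_n(x) - x*F_n(x)),
    starting from F_0(x) = x, the U(0,1) cdf."""
    f = x
    for _ in range(steps):
        f = p * x * f + (1 - p) * (x + f - x * f)
    return f

def simulate_F(p, x, steps=100, paths=20000, seed=1):
    """Estimate P(Z^(n) <= x) for the k = 1 process by Monte Carlo."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(paths):
        z = rng.random()                      # Z^(0) = X_1
        for _ in range(steps):
            xi = rng.random()                 # new arrival X_{n+1}
            # with probability p the minimum is removed, so the max is kept
            z = max(z, xi) if rng.random() < p else min(z, xi)
        hits += z <= x
    return hits / paths
```

For example, with $p=0.3$ and $x=0.5$ the two estimates agree to within Monte Carlo error.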

I am looking for general results (case $k>1$) either for the limiting distribution of the whole vector $Z^{(j)}$ or of some of its components (marginal distributions).


Solution 1:

EDIT: I originally had a different answer to this question based on Markov chains, but then I saw a different and better way of approaching the problem, so I changed my answer; I can post the old one back if people want it.

Now $Z^{(m)}$ will be a function of $X_1,\dots,X_{k+m-1}$. In particular it will be a function of the order statistics of this set.

For uniformly distributed variables $X_i$, if we take a collection of $k+m-1$ of them, the $r$-th order statistic has a beta distribution:

$$X_{(r)}\sim Beta(r,k+m-r)$$
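This Beta claim is easy to spot-check by simulation: the mean of $X_{(r)}$ from $n$ uniforms should be $r/(n+1)$, i.e. $r/(k+m)$ here (a rough sketch; the sample size is arbitrary):

```python
import random

def order_stat_mean(n, r, reps=20000, seed=2):
    """Monte Carlo estimate of E[X_(r)] for n iid U(0,1) draws;
    Beta(r, n - r + 1) predicts a mean of r / (n + 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += sorted(rng.random() for _ in range(n))[r - 1]
    return total / reps
```

With $n = k+m-1$ the second Beta parameter is $n-r+1 = k+m-r$, matching the display above.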

Now what will the process have done by the $m$-th step? It will have removed the top $n_t$ order statistics and the bottom $n_b$, with the constraint that $n_t+n_b=m-1$. Stated this way, the probability of any such split occurring is given by the binomial distribution:

$$Pr(n_t)={m-1 \choose n_t}p^{n_t}(1-p)^{m-1-n_t}$$

Given that $n_t$ is the number removed from the top, we have $Z^{(m)}=(X_{(m-n_t)},\dots,X_{(k+m-1-n_t)})$, and the joint distribution of $Z^{(m)}$ looks like:

$$P(Z^{(m)}|n_t)=\frac{(k+m-1)!X_{(m-n_t)}^{m-1-n_t}(1-X_{(k+m-1-n_t)})^{n_t}}{(m-1-n_t)!(n_t)!}dX_{(m-n_t)}\dots dX_{(k+m-1-n_t)}$$

This is the density whose spacings $(Z_{m,1},\,Z_{m,2}-Z_{m,1},\,\dots,\,1-Z_{m,k})$ follow a Dirichlet distribution with parameters $(m-n_t,1,1,\dots,1,n_t+1)$, with $k-1$ 1's in the middle (a general result for the joint distribution of order statistics).

Now we need to sum out $n_t$ with respect to its probability:

$$P(Z^{(m)})=\sum_{n_t=0}^{m-1}P(n_t)P(Z^{(m)}|n_t)$$ $$=\frac{(k+m-1)!}{(m-1)!}\sum_{n_t=0}^{m-1}{m-1 \choose n_t}^{2}\left[(1-X_{(k+m-1-n_t)})p\right]^{n_t}\left[X_{(m-n_t)}(1-p)\right]^{m-1-n_t}$$

This looks tantalizingly close to a binomial distribution, were it not for that squared binomial coefficient in the summation. I don't know a closed form for this sum; this is as far as I could get. You would likely require numerical work here.
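The sum is at least cheap to evaluate numerically. As a sketch (function names are mine), the following evaluates the mixture density above at a point, and checks by importance sampling with sorted uniforms, whose density on the simplex is $k!$, that it integrates to 1 over $S_k$:

```python
import math
import random

def mixture_density(z, k, m, p):
    """Evaluate the mixture-of-Dirichlets density at a sorted point
    z = (z_1, ..., z_k); only z_1 and z_k enter the formula."""
    z1, zk = z[0], z[-1]
    coef = math.factorial(k + m - 1) / math.factorial(m - 1)
    total = sum(
        math.comb(m - 1, nt) ** 2
        * (p * (1 - zk)) ** nt
        * ((1 - p) * z1) ** (m - 1 - nt)
        for nt in range(m)
    )
    return coef * total

def simplex_integral(k, m, p, reps=20000, seed=3):
    """Monte Carlo integral of the density over S_k, using sorted
    U(0,1) vectors (density k! on S_k) as the proposal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        z = sorted(rng.random() for _ in range(k))
        total += mixture_density(z, k, m, p)
    return total / (reps * math.factorial(k))
```

The integral comes out as 1 to Monte Carlo accuracy, as it must, since each conditional Dirichlet density is proper and the binomial weights sum to 1.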

Solution 2:

I've asked this question on MathOverflow. It has one answer, which I have not found time to check, so I am giving the link as an answer here.