Given IID normal random variables $X_1, \ldots, X_n$, show that $(X_1 - \bar{X})/S$ is ancillary.

This has been bugging me for the past day, and I just can't seem to figure it out.

Suppose $X_i \sim N(\mu, \sigma^2)$ for $i=1, \ldots, n$ are IID, where $\mu \in \mathbb{R}$ and $\sigma^2>0$ are unknown.

I want to show that $Z = \frac{X_1 - \bar{X}}{S}$ is an ancillary statistic (its distribution does not depend on $\mu$ or $\sigma$). Basically, I need to calculate the distribution of $Z$.

I know that the formula for the density of $Z$ is given by (Shao, Mathematical Statistics, pg. 165 eq (3.1)):

$$ f(z) = \frac{\sqrt{n}\Gamma(\frac{n-1}{2})}{\sqrt{\pi}(n-1)\Gamma(\frac{n-2}{2})} \bigg[ 1 - \frac{nz^2}{(n-1)^2}\bigg]^{(n/2) - 2} I_{(0, (n-1)/\sqrt{n})}(|z|) $$

but I have no idea how to derive this result. Presumably you use some transformation, but I just can't seem to find the right one.

I don't need a full solution, mostly just some help getting started. Thanks! Oh, and $\bar{X}$ and $S^2$ are the sample mean and sample variance, respectively.


I would like to point out the connection of $Z$ with a $t$-distribution, which is apparent from the relationship of $Z$ with a correlation coefficient discussed here.

I assume $S^2$ is defined as $$S^2=\frac1{n-1}\sum_{i=1}^n (X_i-\overline X)^2$$

Then for $n>2$, the following has a $t$-distribution with $n-2$ degrees of freedom:

$$T=\frac{\sqrt{\frac{n}{n-1}}(X_1-\overline X)}{\sqrt{\left\{(n-1)S^2-\frac{n}{n-1}(X_1-\overline X)^2\right\}/(n-2)}} \sim t_{n-2} \tag{1}$$

In terms of $Z$, we have

$$T=\frac{\sqrt{\frac{n(n-2)}{n-1}}Z}{\sqrt{n-1-\frac{nZ^2}{n-1}}}$$

Solving for $Z$, and noting that $Z$ and $T$ always have the same sign,

$$Z = \frac{(n-1)T}{\sqrt{n(T^2+n-2)}}$$

Hence $Z$ has the following distribution for $n>2$:

$$Z \sim \frac{(n-1)t_{n-2}}{\sqrt{n(t^2_{n-2}+n-2)}} \tag{2}$$
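For completeness, here is a sketch of how $(2)$ leads to the density quoted in the question. The symbols $\nu=n-2$, $c=(n-1)/\sqrt n$ and $R=Z/c$ are shorthand introduced here, not notation from the original post. Since $R=T/\sqrt{T^2+\nu}$ is strictly increasing in $T$ and maps $\mathbb R$ onto $(-1,1)$, we have $T=\sqrt\nu\,R/\sqrt{1-R^2}$ and $1+T^2/\nu=(1-R^2)^{-1}$, so a change of variables with the $t_\nu$ density $f_T$ gives

$$\begin{aligned} f_R(r) &= f_T\!\left(\frac{\sqrt\nu\, r}{\sqrt{1-r^2}}\right)\frac{dt}{dr} = \frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}\,(1-r^2)^{(\nu+1)/2}\cdot\sqrt\nu\,(1-r^2)^{-3/2} \\ &= \frac{\Gamma(\frac{n-1}{2})}{\sqrt\pi\,\Gamma(\frac{n-2}{2})}\,(1-r^2)^{(n/2)-2},\qquad |r|<1. \end{aligned}$$

Rescaling by $Z=cR$ then gives $f_Z(z)=f_R(z/c)/c$, which is the formula from the question with support $|z|<(n-1)/\sqrt n$.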


To prove $(1)$, we can transform $\boldsymbol X \mapsto \boldsymbol Y$ such that $\boldsymbol Y=P \boldsymbol X$, where $P$ is an orthogonal matrix with its first two rows fixed as:

$$P= \begin{bmatrix} \frac1{\sqrt n} &\frac1{\sqrt n} &\cdots &\frac1{\sqrt n}\\ \frac{n-1}{\sqrt{n(n-1)}} &\frac{-1}{\sqrt{n(n-1)}} &\cdots &\frac{-1}{\sqrt{n(n-1)}}\\ \vdots & \vdots &\ddots & \vdots \end{bmatrix}$$

This implies $Y_1=\sqrt n\overline X\,,\,Y_2=\sqrt{\frac{n}{n-1}}(X_1-\overline X)$ and

$$(n-1)S^2-\frac{n}{n-1}(X_1-\overline X)^2=\sum_{i=3}^n Y_i^2\,,$$

so that

$$T=\frac{Y_2}{\sqrt{\sum_{i=3}^n Y_i^2/(n-2)}}$$

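Since $P$ is orthogonal, the $Y_i$ are independent normal variables with variance $\sigma^2$, and rows $2,\ldots,n$ of $P$ are orthogonal to the first row (a multiple of the all-ones vector), so $Y_2,\ldots,Y_n$ have mean zero. Thus $Y_2,Y_3,\ldots,Y_n$ are i.i.d. $N(0,\sigma^2)$, so $Y_2/\sigma \sim N(0,1)$ is independent of $\sum_{i=3}^n Y_i^2/\sigma^2 \sim \chi^2_{n-2}$, which completes the proof.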
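The identities used in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the sample size, seed, and parameter values are arbitrary choices, and the remaining rows of $P$ are obtained from a QR decomposition, which is just one of many valid completions.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 6                            # illustrative sample size (any n > 2 works)
x <- rnorm(n, mean = 3, sd = 2)   # arbitrary mu and sigma

# The two prescribed rows of P
r1 <- rep(1 / sqrt(n), n)
r2 <- c(n - 1, rep(-1, n - 1)) / sqrt(n * (n - 1))

# Complete {r1, r2} to an orthonormal basis via QR (one possible completion).
# Unit vectors e_3, ..., e_n are used because e_1 already lies in span{r1, r2}.
A <- unname(cbind(r1, r2, diag(n)[, 3:n]))
P <- t(qr.Q(qr(A)))
P[1, ] <- r1                      # QR may flip signs, so restore the exact rows
P[2, ] <- r2

y    <- as.vector(P %*% x)
xbar <- mean(x)
s2   <- var(x)                    # var() uses the n - 1 divisor
z    <- (x[1] - xbar) / sqrt(s2)

t_from_y <- y[2] / sqrt(sum(y[3:n]^2) / (n - 2))
t_from_z <- sqrt(n * (n - 2) / (n - 1)) * z / sqrt(n - 1 - n * z^2 / (n - 1))
z_back   <- (n - 1) * t_from_y / sqrt(n * (t_from_y^2 + n - 2))

# Absolute errors of each identity; all should be at rounding-error level
err <- c(
  orthogonality = max(abs(P %*% t(P) - diag(n))),
  y1            = abs(y[1] - sqrt(n) * xbar),
  y2            = abs(y[2] - sqrt(n / (n - 1)) * (x[1] - xbar)),
  sum_sq        = abs(sum(y[3:n]^2) - ((n - 1) * s2 - n / (n - 1) * (x[1] - xbar)^2)),
  t_match       = abs(t_from_y - t_from_z),
  z_inverse     = abs(z_back - z)
)
print(signif(err, 3))
```

Changing `n`, the seed, or the mean and standard deviation should leave every entry of `err` at the level of floating-point rounding.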


The following simulation for $n=3,4,5,10$ compares $(2)$ with the pdf of $Z$ in the original post:

[Figure: histograms of $Z$ simulated from $(2)$ for $n=3,4,5,10$, each with the pdf from the original post overlaid]

R code for the individual plots above:

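    n <- 5                                  # sample size (must exceed 2); e.g. 3, 4, 5, 10
    tt <- rt(1e5, n - 2)                    # draws from t with n - 2 degrees of freedom
    z <- (n - 1) * tt / sqrt(n * (tt^2 + n - 2))   # transform as in (2)
    hist(z, prob = TRUE, nclass = 145, col = "wheat")
    # density of Z from the original post (zero outside the support)
    k <- sqrt(n) * gamma((n - 1) / 2) / (sqrt(pi) * (n - 1) * gamma((n - 2) / 2))
    dens <- function(x) {
      ifelse(abs(x) < (n - 1) / sqrt(n), k * (1 - n * x^2 / (n - 1)^2)^(n / 2 - 2), 0)
    }
    curve(dens, add = TRUE, col = "sienna", lwd = 3)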
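The code above samples $Z$ through the $t$ representation. As a complementary check of ancillarity, here is a rough sketch that computes $Z$ directly from normal samples under two different parameter settings and compares empirical quantiles with draws from $(2)$. The helper name `z_stat`, the sample size, the number of replications, and the $(\mu,\sigma)$ values are all arbitrary choices made for illustration.

```r
set.seed(42)     # arbitrary seed
n    <- 5        # illustrative sample size (n > 2)
reps <- 2e4      # number of simulated samples per setting

# Z = (X_1 - mean(X)) / S for a single sample; sd() uses the n - 1 divisor
z_stat <- function(x) (x[1] - mean(x)) / sd(x)

# Exact invariance: shifting and rescaling a sample leaves Z unchanged
x <- rnorm(n)
print(c(original = z_stat(x), shifted_scaled = z_stat(10 + 3 * x)))

# Z simulated under two different (mu, sigma) settings
z_a <- replicate(reps, z_stat(rnorm(n, mean = 0,  sd = 1)))
z_b <- replicate(reps, z_stat(rnorm(n, mean = 50, sd = 7)))

# Draws from the representation (2)
tt  <- rt(reps, n - 2)
z_t <- (n - 1) * tt / sqrt(n * (tt^2 + n - 2))

# Empirical quantiles should agree up to Monte Carlo error
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
q <- rbind(mu0_sd1  = quantile(z_a, probs),
           mu50_sd7 = quantile(z_b, probs),
           from_t   = quantile(z_t, probs))
print(round(q, 3))
```

The invariance check is the heart of the matter: because $Z$ is unchanged when every $X_i$ is replaced by $a + bX_i$ with $b>0$, its distribution cannot depend on $\mu$ or $\sigma$, and the quantile table is only a sanity check of that fact against $(2)$.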