Derivation of the density function of Student's t-distribution from this big integral.

My lecturer posed a question where we derive the density function of the Student's t-distribution from the chi-square and standard normal distributions.

I worked on this question for days, and I am pretty sure the integral below is correct (verified by others):

$$f_T(t)=\int_{-\infty}^\infty|x|2nx\times \frac{\left(\frac{1}{2}\right)^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2}\right)}(x^2n)^{\frac{n}{2}-1}e^{-\frac{1}{2}x^2n}\frac{1}{\sqrt{2\pi}}e^{-\frac{(xt)^2}{2}}dx$$

where $n$ is the number of degrees of freedom of the t-distribution and $\Gamma$ is the gamma function that appears in the Gamma distribution.

My goal is $$f_T(t)=\frac{\Gamma[(n+1)/2]}{\sqrt{n\pi}\Gamma(n/2)}\left(1+\frac{t^2}{n}\right)^{-(n+1)/2}$$

I was given 2 hints. To proceed, I need to do integration by parts first, then I should use the fact that the Gamma d.f integrates to 1.

From this point on, I am unsure, but I shall show you my steps.

We know that the Gamma density with parameters $\alpha=\frac{n+1}{2}$, $\lambda=\frac{1}{2}$ integrates to $1$, that is $\int_{0}^{\infty}g(t)dt= \int_{0}^{\infty}\frac{\left(\frac{1}{2}\right)^{\frac{n+1}{2}}}{\Gamma\left(\frac{n+1}{2}\right)}t^{\frac{n+1}{2}-1}e^{-\frac{1}{2}t}dt=1$

Let $t=x^2n$. Therefore, $dt=2xn\,dx$

We have $\int_{0}^{\infty}g(t)dt=\int_{0}^{\infty}g(x^2n)2xn\,dx=\int_{0}^{\infty}\frac{\left(\frac{1}{2}\right)^{\frac{n+1}{2}}}{\Gamma\left(\frac{n+1}{2}\right)}(x^2n)^{\frac{n+1}{2}-1}e^{-\frac{1}{2}(x^2n)}2xn\,dx=1$

This should be useful because I noticed the $\Gamma[(n+1)/2]$ in the end result.

Working on the big integral now...

$f_T(t)=\int_{-\infty}^\infty|x|2nx\times \frac{\left(\frac{1}{2}\right)^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2}\right)}(x^2n)^{\frac{n}{2}-1}e^{-\frac{1}{2}x^2n}\frac{1}{\sqrt{2\pi}}e^{-\frac{(xt)^2}{2}}dx$

$=\frac{\Gamma[(n+1)/2]}{\sqrt{n\pi}\Gamma(n/2)}\int_{-\infty}^\infty \frac{\sqrt{n\pi}\Gamma(n/2)}{\Gamma[(n+1)/2]}|x|2nx\times \frac{\left(\frac{1}{2}\right)^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2}\right)}(x^2n)^{\frac{n}{2}-1}e^{-\frac{1}{2}x^2n}\frac{1}{\sqrt{2\pi}}e^{-\frac{(xt)^2}{2}}dx$

After a load of manipulation,

$=\frac{\Gamma[(n+1)/2]}{\sqrt{n\pi}\Gamma(n/2)}\int_{-\infty}^\infty 2nx \frac{\left(\frac{1}{2}\right)^{\frac{n+1}{2}}}{\Gamma[(n+1)/2]}(x^2n)^{\frac{n-1}{2}}e^{-\frac{1}{2}(x^2n)}\times2n|x|e^{-\frac{(xt)^2}{2}}dx$

Note that the first half integrates to 1. Hence I try integration by parts.

But I still cannot get to my goal.

I have attempted this question at least 6 times now. Can I get some help? I tried to type these out as neatly as I knew how.


Solution 1:

Let $Y$ be a chi-square random variable with $n$ degrees of freedom. Then the square-root of $Y$, $\sqrt Y\equiv \hat Y$ is distributed as a chi-distribution with $n$ degrees of freedom, which has density $$ f_{\hat Y}(\hat y) = \frac {2^{1-\frac n2}}{\Gamma\left(\frac {n}{2}\right)} \hat y^{n-1} \exp\Big \{{-\frac {\hat y^2}{2}} \Big\} \qquad [1]$$

Define $X \equiv \frac {1}{\sqrt n}\hat Y$. Then $ \frac {\partial \hat Y}{\partial X} = \sqrt n$, and by the change-of-variable formula we have that

$$ f_{X}(x) = f_{\hat Y}(\sqrt nx)\Big |\frac {\partial \hat Y}{\partial X} \Big| = \frac {2^{1-\frac n2}}{\Gamma\left(\frac {n}{2}\right)} (\sqrt nx)^{n-1} \exp\Big \{{-\frac {(\sqrt nx)^2}{2}} \Big\}\sqrt n $$

$$=\frac {2^{1-\frac n2}}{\Gamma\left(\frac {n}{2}\right)} n^{\frac n2}x^{n-1} \exp\Big \{{-\frac {n}{2}x^2} \Big\} \qquad [2]$$
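As a quick sanity check on $[2]$, here is a minimal numerical sketch (not part of the derivation) confirming that the density of $X$ integrates to one for a few values of $n$. The helper names `chi_scaled_pdf` and `midpoint_integral`, the cut-off at $x=10$ and the number of steps are all my own illustrative choices.

```python
import math

def chi_scaled_pdf(x, n):
    """Density [2] of X = sqrt(Y/n) with Y ~ chi-square(n); zero for x < 0."""
    if x < 0:
        return 0.0
    const = 2.0 ** (1 - n / 2) / math.gamma(n / 2) * n ** (n / 2)
    return const * x ** (n - 1) * math.exp(-n * x * x / 2)

def midpoint_integral(f, a, b, steps=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# The tail beyond x = 10 is negligible for these n (an assumption of this sketch).
for n in (1, 2, 5, 10):
    total = midpoint_integral(lambda x: chi_scaled_pdf(x, n), 0.0, 10.0)
    print(n, round(total, 6))
```

Each printed total should be very close to one, which is what we expect if the change of variable was carried out correctly.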

Let $Z$ be a standard normal random variable, independent of $Y$, and define the Student's t r.v. by

$$T = \frac{Z}{\sqrt{\frac Yn}}= \frac ZX. $$

By the standard formula for the density function of the ratio of two independent random variables, $$f_T(t) = \int_{-\infty}^{\infty} |x|f_Z(xt)f_X(x)dx $$

But $f_X(x) = 0$ on the interval $(-\infty, 0)$ because $X$ is a non-negative r.v. So we can eliminate the absolute value, and reduce the integral to

$$f_T(t) = \int_{0}^{\infty} xf_Z(xt)f_X(x)dx $$

$$ = \int_{0}^{\infty} x \frac{1}{\sqrt{2\pi}}\exp \Big \{{-\frac{(xt)^2}{2}}\Big\}\frac {2^{1-\frac n2}}{\Gamma\left(\frac {n}{2}\right)} n^{\frac n2}x^{n-1} \exp\Big \{{-\frac {n}{2}x^2} \Big\}dx $$

$$ = \frac{1}{\sqrt{2\pi}}\frac {2^{1-\frac n2}}{\Gamma\left(\frac {n}{2}\right)} n^{\frac n2}\int_{0}^{\infty} x^n \exp \Big \{-\frac 12 (n+t^2) x^2\Big\} dx \qquad [3]$$
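Before evaluating the integral in closed form, one can check numerically that $[3]$ already agrees with the target density. The sketch below is only an illustration: the function names, the integration cut-off `upper=12.0` and the midpoint rule are my own assumptions, not part of the argument.

```python
import math

def t_pdf(t, n):
    """Target Student t density with n degrees of freedom."""
    const = math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return const * (1 + t * t / n) ** (-(n + 1) / 2)

def t_pdf_from_eq3(t, n, upper=12.0, steps=20000):
    """Right-hand side of [3], with the x-integral done by the midpoint rule."""
    const = (1 / math.sqrt(2 * math.pi)) * 2.0 ** (1 - n / 2) / math.gamma(n / 2) * n ** (n / 2)
    h = upper / steps
    integral = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        integral += x ** n * math.exp(-0.5 * (n + t * t) * x * x)
    return const * h * integral

for n in (1, 3, 8):
    for t in (-2.0, 0.0, 0.7, 3.0):
        print(n, t, t_pdf_from_eq3(t, n), t_pdf(t, n))
```

The two columns should agree to several decimal places, so whatever remains is purely a matter of evaluating the integral analytically.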

The integral is a Mellin transform of $\exp \Big \{-\frac 12 (n+t^2) x^2\Big\}$, evaluated at $s=n+1$. Its solution is $$\int_{0}^{\infty} x^n \exp \Big \{-\frac 12 (n+t^2) x^2\Big\}dx = (n+t^2)^{-\frac 12 (n+1)}\Gamma(n+1)D_{-(n+1)}(0) $$

$$ = \left(1+\frac {t^2}{n}\right)^{-\frac 12 (n+1)} n^{-\frac 12 (n+1)}\Gamma(n+1)D_{-(n+1)}(0) \qquad [4] $$

We have obtained the kernel of the t-distribution, and this is the most critical step. Here $D_{-(n+1)}(0)$ is (Whittaker's) parabolic cylinder function evaluated at zero. The following relations hold (see Abramowitz & Stegun 19.3.7 & 19.3.5):

$$D_{-(n+1)}(0) = U(n+\frac 12,0) = \frac {\sqrt\pi}{2^{\frac 12(n+\frac 12)+\frac 14} \Gamma\Big (\frac 34 + \frac 12(n+\frac 12)\Big)} = \frac {\sqrt\pi}{2^{\frac n2+\frac 12} \Gamma\Big (\frac n2 + 1\Big)} \qquad [5]$$

Inserting $[5]$ into $[4]$ and everything back into $[3]$ we obtain

$$f_T(t) = \frac{1}{\sqrt{2\pi}}\frac {2^{1-\frac n2}}{\Gamma\left(\frac {n}{2}\right)} n^{\frac n2} n^{-\frac 12 (n+1)}\Gamma(n+1)\frac {\sqrt\pi}{2^{\frac n2+\frac 12} \Gamma\Big (\frac n2 + 1\Big)}\left(1+\frac {t^2}{n}\right)^{-\frac 12 (n+1)} \; [6] $$

Using the duplication formula for the gamma function, and simplifying, the parade of constants in eq. $[6]$ becomes $\frac{\Gamma[(n+1)/2]}{\sqrt{n\pi}\Gamma(n/2)}$ and we have obtained the density of the t-distribution.
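For anyone who prefers to check the "parade of constants" numerically rather than by hand, here is a small sketch. It takes the closed form of $D_{-(n+1)}(0)$ from $[5]$ as given, multiplies out the $t$-free factors of $[6]$, and compares with the target constant; it also evaluates both sides of the duplication formula $\Gamma(z)\Gamma(z+\frac12)=2^{1-2z}\sqrt\pi\,\Gamma(2z)$ at $z=\frac{n+1}{2}$. All function names are mine.

```python
import math

def d_at_zero(n):
    """D_{-(n+1)}(0), taken from the closed form in [5]."""
    return math.sqrt(math.pi) / (2.0 ** (n / 2 + 0.5) * math.gamma(n / 2 + 1))

def constant_in_eq6(n):
    """Product of every t-free factor in [6]."""
    return (
        1 / math.sqrt(2 * math.pi)
        * 2.0 ** (1 - n / 2) / math.gamma(n / 2)
        * n ** (n / 2) * n ** (-(n + 1) / 2)
        * math.gamma(n + 1) * d_at_zero(n)
    )

def target_constant(n):
    """Normalising constant of the t density with n degrees of freedom."""
    return math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))

def duplication_ratio(z):
    """Gamma(z) * Gamma(z + 1/2) divided by 2^(1-2z) * sqrt(pi) * Gamma(2z)."""
    lhs = math.gamma(z) * math.gamma(z + 0.5)
    rhs = 2.0 ** (1 - 2 * z) * math.sqrt(math.pi) * math.gamma(2 * z)
    return lhs / rhs

for n in (1, 2, 5, 12):
    print(n, constant_in_eq6(n), target_constant(n), duplication_ratio((n + 1) / 2))
```

The first two printed columns should coincide up to rounding, and the last column should be one up to rounding.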

ADDENDUM : Derivation without the use of the Mellin transform.

We have arrived at eq. $[3]$. If we want to avoid using the Mellin transform, and instead use the hint that we were given, we have to turn the integrand into a Gamma density, whose integral over its support is one. The limits of integration are already correct, so we need to manipulate the integrand into a Gamma density function without changing the limits. Define the variable $$m \equiv x^2 \Rightarrow dm = 2xdx \Rightarrow dx = \frac {dm}{2x}, \; x = m^{\frac 12}$$ Making the substitution in the integrand we have

$$I_3=\int_{0}^{\infty} x^n \exp \Big \{-\frac 12 (n+t^2) m\Big\} \frac {dm}{2x} = \frac 12\int_{0}^{\infty} m^{\frac {n-1}{2}} \exp \Big \{-\frac 12 (n+t^2) m\Big \} dm \qquad [7]$$

The Gamma density can be written

$$ Gamma(m;k,\theta) = \frac {m^{k-1} \exp\Big\{-\frac{m}{\theta}\Big \}}{\theta^k\Gamma(k)}$$

Matching coefficients, we must have

$$k-1 = \frac {n-1}{2} \Rightarrow k^* = \frac {n+1}{2}, \qquad \frac 1\theta =\frac 12 (n+t^2) \Rightarrow \theta^* = \frac 2 {(n+t^2)} $$

For these values of $k^*$ and $\theta^*$ the terms in the integrand involving the variable are the kernel of a gamma density. So if we divide the integrand by $(\theta^*)^{k^*}\Gamma(k^*)$ and multiply outside the integral by the same magnitude, the integral becomes that of a gamma density over its whole support and equals unity. Therefore we have arrived at

$$I_3 = \frac12(\theta^*)^{k^*}\Gamma(k^*) = \frac12 \Big (\frac 2 {n+t^2}\Big ) ^{\frac {n+1}{2}}\Gamma\left(\frac {n+1}{2}\right) = 2^ {\frac {n-1}{2}}n^{-\frac {n+1}{2}}\Gamma\left(\frac {n+1}{2}\right)\left(1+\frac {t^2}{n}\right)^{-\frac 12 (n+1)} $$
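The matching step can be checked numerically. The sketch below (my own illustration, with hypothetical helper names) builds the gamma density with the matched $k^*$ and $\theta^*$, confirms by a midpoint rule that it has total mass one, and confirms that the two closed forms given for $I_3$ agree. I restrict the mass check to odd $n$ so that the integrand is smooth at $m=0$ and the crude quadrature stays accurate; that restriction is a convenience of the sketch, not of the argument.

```python
import math

def matched_parameters(t, n):
    """Shape k* and scale theta* obtained by matching coefficients in [7]."""
    return (n + 1) / 2, 2 / (n + t * t)

def gamma_pdf(m, k, theta):
    """Gamma density with shape k and scale theta."""
    return m ** (k - 1) * math.exp(-m / theta) / (theta ** k * math.gamma(k))

def total_mass(t, n, steps=20000):
    """Midpoint-rule integral of the matched gamma density over (0, 40*theta)."""
    k, theta = matched_parameters(t, n)
    h = 40 * theta / steps  # grid scaled with theta; the tail beyond is negligible
    return h * sum(gamma_pdf((i + 0.5) * h, k, theta) for i in range(steps))

def i3_first_form(t, n):
    """I_3 = (1/2) * theta*^k* * Gamma(k*)."""
    k, theta = matched_parameters(t, n)
    return 0.5 * theta ** k * math.gamma(k)

def i3_second_form(t, n):
    """I_3 written with the t-kernel (1 + t^2/n)^(-(n+1)/2) factored out."""
    return (2.0 ** ((n - 1) / 2) * n ** (-(n + 1) / 2)
            * math.gamma((n + 1) / 2) * (1 + t * t / n) ** (-(n + 1) / 2))

for n in (1, 3, 5, 9):
    for t in (0.0, 1.5):
        print(n, t, total_mass(t, n), i3_first_form(t, n), i3_second_form(t, n))
```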

We have again obtained the kernel of the t-distribution. Inserting the above into eq. $[3]$ we get $$f_T(t) = \frac{1}{\sqrt{2\pi}}\frac {2^{1-\frac n2}}{\Gamma\left(\frac {n}{2}\right)} n^{\frac n2}2^ {\frac {n-1}{2}}n^{-\frac {n+1}{2}}\Gamma\left(\frac {n+1}{2}\right)\left(1+\frac {t^2}{n}\right)^{-\frac 12 (n+1)}$$

$$=\frac{\Gamma[(n+1)/2]}{\sqrt{n\pi}\Gamma(n/2)}\left(1+\frac {t^2}{n}\right)^{-\frac 12 (n+1)}$$

...Sometimes, more "primitive" methods are more straightforward.

Solution 2:

Following the derivation in Casella and Berger, Statistical Inference:

If the standard deviation of the population, $\sigma$, is unknown, we can replace it with its sample estimate $S$; the resulting one-sample t-test statistic then follows a $t$-distribution:

$$ T_{\text{test}}=\frac{\bar{X}\,-\,\mu}{S/\sqrt{n}}\sim t_{n-1}$$

with $$S=\sqrt{\frac{\sum(X_i-\bar X)^2}{n-1}}.$$

Minimal manipulation of this expression for $T_{\text{test}}$ gives

$$\begin{align} \frac{\bar{X}\,-\,\mu}{S/\sqrt{n}} &= \frac{\bar{X}\,-\,\mu}{\frac{\sigma}{\sqrt{n}}} \frac{1}{\frac{S}{\sigma}}\\[2ex] &= Z\,\frac{1}{\frac{S}{\sigma}}\\[2ex] &= \frac{Z}{\sqrt{\frac{\sum(X_i-\bar X)^2}{(n-1)\,\sigma^2}}} \sim\frac{Z}{\sqrt{\frac{\chi_{n-1}^2}{n-1}}} \sim t_{n-1} \end{align}$$

In the above expression,

$Z=\frac{(\bar{X}-\mu)}{\sigma/\sqrt{n}}\,\,\sim \,\,\small N(0,1)$

and, by the standard result for samples from a normal population,

$\sqrt{\frac{S^2}{\sigma^2}} =\large\sqrt{\frac{\frac{\sum_{i=1}^n(X_i - \bar{X})^2}{n-1}}{\sigma^2}} \,\,\sim \,\,\sqrt{\frac{\chi_{(n-1)}^2}{n-1}}$.


The derivation of the pdf of the Student's t distribution with $n$ degrees of freedom (not $n-1$ as above) proceeds by solving a simplified problem: finding the distribution of $U/{\sqrt{V/n}}$ with $U\sim N(0,1)$ and $V\sim\chi^2_n$ mutually independent. The expression for $T_{\text{test}}$ above is then replaced with the following expression ($k$ is used from this point on instead of $n$ for degrees of freedom) with the claim:

$$T=\frac{U}{\sqrt{\frac{V}{k}}}\sim t_k$$
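If you want some reassurance about this claim before working through the algebra, a rough Monte Carlo check is easy to run. The sketch below simulates $U$ and $V$ (with $V$ built as a sum of $k$ squared standard normals) and compares the empirical frequency of $|T|\le 1$ with the probability obtained by integrating the $t_k$ density numerically. The seed, sample size, threshold and function names are arbitrary choices of mine.

```python
import math
import random

def sample_t(k, rng):
    """Draw T = U / sqrt(V/k) with U ~ N(0,1), V ~ chi-square(k), independent."""
    u = rng.gauss(0.0, 1.0)
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))  # chi-square(k) as a sum of squares
    return u / math.sqrt(v / k)

def t_pdf(t, k):
    """Student t density with k degrees of freedom."""
    const = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return const * (1 + t * t / k) ** (-(k + 1) / 2)

def prob_abs_below(c, k, steps=4000):
    """P(|T| <= c) by midpoint-rule integration of the t_k density over [-c, c]."""
    h = 2 * c / steps
    return h * sum(t_pdf(-c + (i + 0.5) * h, k) for i in range(steps))

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
k, draws, c = 5, 20000, 1.0
empirical = sum(abs(sample_t(k, rng)) <= c for _ in range(draws)) / draws
exact = prob_abs_below(c, k)
print(empirical, exact)
```

The two numbers should agree to within ordinary sampling noise, roughly a couple of decimal places at this sample size.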

With the premise of independence, the joint density of $U$ and $V$ is:

$f_{U,V}(u,v) = \underbrace{\frac{1}{(2\pi)^{1/2}} e^{-u^2/2}}_{\text{pdf } N(0,1)}\quad \underbrace{\frac{1}{\Gamma(\frac{k}{2})\,2^{k/2}}\,v^{(k/2)-1}\, e^{-v/2}}_{\text{pdf }\chi^2_k}$ with $-\infty<u<\infty$ and $0<v<\infty$.

Making the transformation $t=\frac{u}{\sqrt{v/k}}$ and $w=v$, hence, $u=t\,\left(\frac{w}{k}\right)^{1/2},$ and with $(w/k)^{1/2}$ as the Jacobian, the marginal pdf will be:

$$\begin{align} f_T(t) &= \displaystyle\int_0^\infty \,f_{U,V}\bigg(t\,(\frac{w}{k})^{1/2},w\bigg)(w/k)^{1/2}\,\mathrm{d} w\\[2ex] &= \frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma(\frac{k}{2})2^{k/2}}\, \int_0^\infty\, e^{-\frac{\left(t(\frac{w}{k})^{1/2}\right)^2}{2}} w^{(k/2)-1} e^{-(\frac{w}{2})} \frac{w^{1/2}}{k^{1/2}}\,\mathrm{d}w\\[2ex] &= \frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma(\frac{k}{2})2^{k/2}k^{1/2}}\, \displaystyle\int_0^\infty\, w^{((k+1)/2)-1}\,e^{-(1/2)(1 + t^2/k)w}\,\mathrm{d}w \end{align}$$

The next step entails identifying in the previous equation the kernel of a gamma distribution pdf:

$x^{\alpha-1}\,e^{-\lambda\,x}$

with parameters $\left(\alpha=(k+1)/2,\,\lambda=(1/2)(1+t^2/k)\right).$

The generic pdf for the gamma distribution is,

$\large \frac{\lambda^\alpha}{\Gamma(\alpha)}\,x^{\alpha-1}\,e^{-\lambda\,x}$.

The strategy is then to synthesize the entire gamma pdf within the improper integral in our $f_T(t)$ pdf in progress, so that we can simplify it as just $1$, as we know to be true of all pdf's. To get away with it we need to multiply numerator and denominator by the same coefficient:

$\frac{\Gamma(\alpha)\,\lambda^\alpha}{\Gamma(\alpha)\,\lambda^\alpha}$. And since neither $\alpha$ nor $\lambda$ involves the integration variable $w$, we can place them inside the integral or leave them outside. Naturally, we want to keep $\frac{\lambda^\alpha}{\Gamma(\alpha)}$ within the integral, and $\frac{\Gamma(\alpha)}{\lambda^\alpha}$ outside the integral. Now $f_T(t)$ will look hideous for just one second:

$f_T(t)=\frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma(\frac{k}{2})2^{k/2}k^{1/2}}\, \int_0^\infty\frac{\left((1/2)(1+t^2/k)\right)^{(k+1)/2}}{\Gamma((k+1)/2)} w^{((k+1)/2)-1} e^{-(1/2)(1 + t^2/k)w} \mathrm{d}w\; \frac{\Gamma((k+1)/2)}{((1/2)(1+t^2/k))^{(k+1)/2}}$

... because everything between $\int$ and $\mathrm{d}w$ is just the gamma $\text{pdf}$ integrated over its entire support, so it becomes $1$, and we are left with:

$$\begin{align} f_T(t)&= \frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma\left(\frac{k}{2}\right)2^{k/2}k^{1/2}}\, \frac{\Gamma\left((k+1)/2\right)}{\left((1/2)(1+t^2/k)\right)^{(k+1)/2}}\\[2ex] &=\frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma\left(\frac{k}{2}\right)\,2^{k/2}k^{1/2}}\,\Gamma\left((k+1)/2\right)\, \Big[\frac{2}{(1+t^2/k)}\Big]^{(k+1)/2}\\[2ex] &= \frac{\Gamma\left(\frac{k+1}{2}\right)}{\Gamma\left(\frac{k}{2}\right)}\, \frac{1}{(2\pi)^{1/2}2^{k/2}k^{1/2}}\, \Big[\frac{2}{(1+t^2/k)}\Big]^{(k+1)/2}\\[2ex] &=\frac{\Gamma\left(\frac{k+1}{2}\right)}{\Gamma\left(\frac{k}{2}\right)}\, \frac{1}{(2\pi)^{1/2}2^{k/2}k^{1/2}}\, \frac{2^{(k+1)/2}}{(1+t^2/k)^{(k+1)/2}}\\[2ex] &= \frac{\Gamma\left(\frac{k+1}{2}\right)}{\Gamma\left(\frac{k}{2}\right)}\, \frac{1}{(\pi)^{1/2}k^{1/2}}\, \frac{1}{(1+t^2/k)^{(k+1)/2}}\\[2ex] &=\frac{\Gamma(\frac{k+1}{2})}{\Gamma(\frac{k}{2})}\, \frac{1}{\sqrt{k\,\pi}}\, \left(1+\frac{t^2}{k}\right)^{-\frac{k+1}{2}} \end{align}$$

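which is the $\text{pdf}$ of the Student's $t$ (or Gosset) distribution with $k$ degrees of freedom (or $n$ degrees of freedom). Here is the Wikipedia expression:

$$f(t)=\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1+\frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$$

where $\nu$ is the number of degrees of freedom.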
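Two well-known special cases make a handy sanity check of the final formula: with one degree of freedom the density should reduce to the standard Cauchy density $\frac{1}{\pi(1+t^2)}$, and for very large degrees of freedom it should be close to the standard normal density. Here is a minimal sketch; the function names and the choice of $10^6$ as "very large" are mine, and `lgamma` is used only to avoid overflow for large $k$.

```python
import math

def t_pdf(t, k):
    """Student t density with k degrees of freedom, computed on the log scale."""
    log_const = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                 - 0.5 * math.log(k * math.pi))
    return math.exp(log_const - (k + 1) / 2 * math.log1p(t * t / k))

def cauchy_pdf(t):
    """Standard Cauchy density."""
    return 1 / (math.pi * (1 + t * t))

def normal_pdf(t):
    """Standard normal density."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

for t in (-3.0, -0.5, 0.0, 1.2, 4.0):
    print(t, t_pdf(t, 1), cauchy_pdf(t), t_pdf(t, 10**6), normal_pdf(t))
```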

Solution 3:

Given your current integral state, the first thing I would have done is to do a substitution to get rid of the dependence of $t$ in the exponent, so $u = \frac{1}{2} x^2 (n+t^2)$. The result will be a Gamma integral and the $t$ dependent parts match with your desired solution.

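Oh, and before the substitution, also restrict the integral to $0$ to $\infty$ (pesky $|x|$): the factor coming from the chi-square density only applies for $x>0$, so $|x|$ becomes $x$ and the value is not doubled. Let me know where this takes you.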


PS your integral needs a slight adjustment.

Let $Z$ be standard normal and $V$ be chi-squared distributed with $n$ degrees of freedom.

Then in the joint distribution, to figure out the density at $t$, when you integrate over possible values of $Z=x$, note that as the chi-squared distribution is positive, this is only possible when $x \cdot t > 0$ (same sign), in which case you would plug in the density of $V$ at $n(x/t)^2$, as your desired student t distribution $Y$ is $Y = Z / \sqrt{(V/n)}$.

So for $t > 0$ (it's symmetric, so let's ignore the other case for now)

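$f_T(t) = \int_0^\infty \frac{ (nx^2/t^2)^{n/2-1} e^{-\frac{nx^2}{2t^2}} }{ 2^{n/2} \Gamma(n/2) } \frac{2nx^2}{t^3} \frac{e^{-x^2/2}}{\sqrt{2\pi}} dx$, where the factor $\frac{2nx^2}{t^3} = \left|\frac{\partial}{\partial t}\, n(x/t)^2\right|$ is the Jacobian of the change of variable from $V$ to $T$ at fixed $Z=x$.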

If $t>0$, the only contribution comes from when $Z=x>0$ and $V = n(x/t)^2$.

The first thing from here to do is to isolate the exponent, use the substitution $u = x^2 \frac{1+n/t^2}{2}$ and you should get a gamma function integral.
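To see that this route lands on the same density, here is a rough numerical sketch for $t>0$. It assumes the integrand is the chi-squared density at $n(x/t)^2$, times the Jacobian factor $2nx^2/t^3$ from changing variables from $V$ to $T$ at fixed $Z=x$, times the standard normal density at $x$. The function names, the cut-off `upper=12.0` and the midpoint rule are illustrative choices of mine.

```python
import math

def chi2_pdf(v, n):
    """Chi-squared density with n degrees of freedom, for v > 0."""
    return v ** (n / 2 - 1) * math.exp(-v / 2) / (2.0 ** (n / 2) * math.gamma(n / 2))

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(t, n):
    """Target Student t density with n degrees of freedom."""
    const = math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return const * (1 + t * t / n) ** (-(n + 1) / 2)

def t_pdf_via_z(t, n, upper=12.0, steps=20000):
    """For t > 0: integrate over Z = x > 0 with V = n x^2 / t^2 (midpoint rule)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        v = n * x * x / (t * t)
        jacobian = 2 * n * x * x / t ** 3  # |dV/dt| at fixed Z = x (assumed factor)
        total += chi2_pdf(v, n) * jacobian * normal_pdf(x)
    return h * total

for n in (1, 4, 7):
    for t in (0.5, 1.0, 2.5):
        print(n, t, t_pdf_via_z(t, n), t_pdf(t, n))
```

The two columns should match closely; the case $t<0$ follows by symmetry.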