Calculating percentile value from mean and standard deviation of a normal distribution

To answer the question from your title "Calculating percentile value from mean and standard deviation of a normal distribution":

In practice one can do that (i.e. compute the normal cumulative distribution function $\Phi$) by converting the raw value to a Z-score (subtract the mean, then divide by the std-dev) and then using a lookup table (sometimes called a Z-table) to convert the Z-score to a probability; multiply that by 100 to get the percentile. Wikipedia has both the table(s) and examples of how to use them.
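If you'd rather compute it than look it up, here is a minimal Python sketch of the same two steps (z-score, then $\Phi$), using the standard identity $\Phi(z) = \tfrac12\left[1 + \operatorname{erf}(z/\sqrt2)\right]$. The function name and the example numbers (mean 100, std-dev 15) are just illustrative assumptions.

```python
import math

def normal_percentile(x, mean, sd):
    """Percentile (0-100) of x under a normal distribution with the given mean and sd."""
    z = (x - mean) / sd                                # z-score
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF at z
    return 100.0 * phi

# Hypothetical example: a value 2 std-devs above the mean
print(round(normal_percentile(130, 100, 15), 2))  # 97.72
```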

If one needs more precision than a lookup table would provide there are some numerical algorithms that can compute that. The one in R's pnorm is based on

Cody, W. D. (1993) Algorithm 715: SPECFUN – A portable FORTRAN package of special function routines and test drivers. ACM Transactions on Mathematical Software 19, 22–32.

There are numerous others that rely on the simple transformation from $\Phi$ to the error function (erf), namely $\Phi(x) = \tfrac12\left[1 + \operatorname{erf}(x/\sqrt2)\right]$, for which one can find numerous approximations. The paper by Soranzo and Epure (2014, mentioned again further down) also gives an approximation formula directly as $$ \Phi(x) \approx 2^{-22^{1-41^{x/10}}} $$

Or, more legibly, in code form: 2**(-22**(1-41**(x/10))). Note that the formula is meant for $x \ge 0$; it relies on the symmetry $\Phi(-x) = 1-\Phi(x)$ to extend it over negative arguments while preserving low error.
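As a rough sketch, the formula above plus the symmetry trick could look like this in Python. The function names are mine, and `phi_reference` (built on `math.erf`) is only there as a yardstick to compare against.

```python
import math

def phi_soranzo_epure(x):
    """Approximate standard normal CDF using the closed-form formula above."""
    if x < 0:
        # Formula is intended for x >= 0; use Phi(-x) = 1 - Phi(x) otherwise
        return 1.0 - phi_soranzo_epure(-x)
    return 2.0 ** (-(22.0 ** (1.0 - 41.0 ** (x / 10.0))))

def phi_reference(x):
    """Standard normal CDF via the error function, for comparison."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (-2.0, 0.0, 1.0, 3.0):
    print(x, phi_soranzo_epure(x), phi_reference(x))
```

In spot checks like these the two columns agree to about three decimal places.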

In the body of your question you are asking the opposite problem: "is it possible to determine the x value of the 95th percentile?" That's possible too. In general that's called the inverse cumulative distribution function or, more succinctly, the quantile function, but for the normal distribution that function is just called probit, so that's the shortest word-like name for $\Phi^{-1}$. In R probit is implemented in qnorm. The numerical implementation of that in R is based on

Wichura, M. J. (1988) Algorithm AS 241: The percentage points of the normal distribution. Applied Statistics, 37, 477–484.

Besides that, the probit has a simple algebraic formula that relates it to the inverse error function. And there are some approximation formulas for the latter as well, e.g.

$$\operatorname{erf}^{-1}(x) \approx \operatorname{sgn}(x) \sqrt{ \sqrt{\left(\frac{2}{\pi a} + \frac{\ln(1 - x^2)}{2}\right)^2 - \frac{\ln(1 - x^2)}{a}} - \left(\frac{2}{\pi a} + \frac{\ln(1 - x^2)}{2}\right) } $$ where

$$ a = \frac{8(\pi - 3)}{3\pi(4 - \pi)} \approx 0.140012.$$

Then:

$$\operatorname{probit}(p) = \sqrt{2}\,\operatorname{erf}^{-1}(2p-1).$$

If it needs spelling out, probit will give you the z-score from the probability $p$ (percentile divided by 100). To convert the z-score to your "x" you need to then apply the opposite of the z-score transformation, i.e. multiply by std-dev and then add the mean.
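To make the whole chain concrete, here is a hedged Python sketch: the inverse-erf approximation above, then probit, then the reverse z-score transformation. All function names and the example numbers (mean 100, std-dev 15) are my own assumptions, and $p$ must be strictly between 0 and 1.

```python
import math

# Constant a from the approximation above (about 0.140012)
A = 8.0 * (math.pi - 3.0) / (3.0 * math.pi * (4.0 - math.pi))

def erfinv_approx(x):
    """Approximate inverse error function for -1 < x < 1."""
    ln_term = math.log(1.0 - x * x)
    t = 2.0 / (math.pi * A) + ln_term / 2.0
    inner = math.sqrt(t * t - ln_term / A) - t
    # max() only guards against a tiny negative value from rounding
    return math.copysign(math.sqrt(max(inner, 0.0)), x)

def probit_approx(p):
    """Approximate z-score for probability p, with 0 < p < 1."""
    return math.sqrt(2.0) * erfinv_approx(2.0 * p - 1.0)

def percentile_to_x(percentile, mean, sd):
    """Undo the z-score transformation: multiply by sd, then add the mean."""
    return mean + sd * probit_approx(percentile / 100.0)

# Hypothetical example: the 95th percentile of a normal with mean 100, sd 15
print(percentile_to_x(95, 100, 15))
```

For the 95th percentile this gives a z-score close to the textbook 1.645, so the x value lands near 124.7.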

If you don't care much about accuracy, you can go old school and approximate the probit by logit, e.g. compute it as

$$\operatorname{probit}(p) \approx \sqrt{\frac{\pi}{8}}\,\ln\left( \frac{p}{1-p} \right).$$

The latter approximation gets pretty bad as $p$ approaches 0 or 1 (i.e. it's best around 0.5).
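To see how the error grows away from 0.5, here is a small Python sketch comparing the logit-based approximation with the standard library's `statistics.NormalDist().inv_cdf` (available in Python 3.8+). The function name is just illustrative.

```python
import math
from statistics import NormalDist

def probit_logit_approx(p):
    """Logit-based approximation of probit(p), for 0 < p < 1."""
    return math.sqrt(math.pi / 8.0) * math.log(p / (1.0 - p))

for p in (0.5, 0.6, 0.9, 0.99):
    reference = NormalDist().inv_cdf(p)   # library quantile function
    approx = probit_logit_approx(p)
    print(p, round(approx, 3), round(reference, 3))
```

Near $p = 0.6$ the two values agree to within about 0.001, while at $p = 0.99$ the approximation is off by roughly half a z-score unit.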

Another good approximation for probit from a recent paper by Soranzo and Epure (2014) is

$$\operatorname{probit}(p) \approx \frac{10}{\ln 41}\, \ln \left(1- \frac{\ln \frac{-\ln p}{\ln 2}}{\ln 22} \right) $$

This has low error for $p \ge 0.5$, but one can use the symmetry $ \operatorname{probit}(1-p) = -\operatorname{probit}(p) $ for $p$ below 0.5.
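A minimal Python sketch of that formula, including the symmetry step for $p$ below 0.5, might look like the following. The function name is my own, and $p$ has to be strictly between 0 and 1.

```python
import math

def probit_soranzo_epure(p):
    """Approximate probit(p) for 0 < p < 1 using the closed-form formula above."""
    if p < 0.5:
        # probit(1 - p) = -probit(p), so mirror the lower half
        return -probit_soranzo_epure(1.0 - p)
    inner = math.log(-math.log(p) / math.log(2.0)) / math.log(22.0)
    return 10.0 / math.log(41.0) * math.log(1.0 - inner)

for p in (0.05, 0.5, 0.95, 0.975):
    print(p, round(probit_soranzo_epure(p), 3))
```

For 0.95 and 0.975 this lands very close to the familiar 1.645 and 1.96.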

You can find the percentile by taking the integral of the PDF (probability density function) from negative infinity up to your target value, i.e. the area under the curve to the left of that value.

Here is the PDF of a normal distribution with mean $\mu$ and standard deviation $\sigma$:

$$\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2 \sigma^2}}$$

So to find the percentile of 3 sigma on a standard normal distribution ($\sigma = 1$, $\mu=0$) you can solve the following integral: $$\int_{-\infty}^{3} \frac{1}{\sqrt{2\pi \cdot 1^2}}\, e^{-\frac{(x - 0)^2}{2 \cdot 1^2}}\,dx = 0.99865...$$ i.e. roughly the 99.87th percentile.
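The integral has no elementary closed form, so in practice you evaluate it numerically. Here is a rough Python sketch using composite Simpson's rule; as an assumption, it replaces the lower limit of negative infinity with 10 standard deviations below the mean, where the remaining tail is negligible. The function names and the step count are illustrative choices.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal probability density function."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def normal_cdf_numeric(upper, mu=0.0, sigma=1.0, n=2000):
    """Integrate the PDF up to `upper` with composite Simpson's rule (n must be even)."""
    lower = mu - 10.0 * sigma            # assumed stand-in for negative infinity
    h = (upper - lower) / n
    total = normal_pdf(lower, mu, sigma) + normal_pdf(upper, mu, sigma)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0   # Simpson weights: 4 for odd, 2 for even nodes
        total += weight * normal_pdf(lower + i * h, mu, sigma)
    return total * h / 3.0

print(round(normal_cdf_numeric(3.0), 5))  # 0.99865
```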

More information can be found on the Wikipedia page.

For a normal distribution, if you have the mean and the standard deviation, then you can use the following to find the percentiles:

mean = 50th percentile

mean + sd ≈ 84th percentile

mean + 2sd ≈ 97.7th percentile (often rounded to 97.5th, which strictly corresponds to mean + 1.96sd)
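If you want to check these rules of thumb against exact values, Python's standard library (3.8+) has `statistics.NormalDist`. In this small sketch the mean of 100 and sd of 15 are hypothetical, and the helper name is my own.

```python
from statistics import NormalDist

def percentile_at_k_sd(k, mean=100.0, sd=15.0):
    """Percentile of the value k standard deviations above the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return 100.0 * dist.cdf(mean + k * sd)

for k in (0, 1, 2):
    print(k, round(percentile_at_k_sd(k), 1))
# 0 50.0
# 1 84.1
# 2 97.7
```

The same object also goes the other way: `NormalDist(mu=100, sigma=15).inv_cdf(0.95)` returns the x value at the 95th percentile.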

Hope this helps!

P.S.: You could read about this more in detail here https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule
