Convert Z-score (Z-value, standard score) to p-value for normal distribution in Python

How does one convert a Z-score from the Z-distribution (standard normal distribution, Gaussian distribution) to a p-value? I have yet to find the magical function in Scipy's stats module to do this, but one must be there.


Solution 1:

I like the survival function (upper tail probability) of the normal distribution a bit better, because the function name is more informative:

import scipy.stats

p_values = scipy.stats.norm.sf(abs(z_scores))      # one-sided

p_values = scipy.stats.norm.sf(abs(z_scores)) * 2  # two-sided

The normal distribution, norm, is one of around 90 distributions in scipy.stats.

norm.sf also calls the corresponding function in scipy.special, as in gotgenes' example.

A small advantage of the survival function, sf: numerical precision should be better for large z-scores, where the cdf is close to 1, than computing 1 - cdf.
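To illustrate the precision point, here is a minimal sketch comparing sf with 1 - cdf far out in the upper tail. The z-score of 10 is just an illustrative value, not something from the question.

```python
from scipy.stats import norm

z = 10.0  # illustrative extreme z-score

# Upper-tail probability computed directly by the survival function
p_sf = norm.sf(z)

# Same quantity via the cdf: cdf(10) rounds to exactly 1.0 in double
# precision, so the subtraction loses the tail probability entirely
p_cdf = 1.0 - norm.cdf(z)

print(p_sf)   # a tiny positive number
print(p_cdf)  # 0.0

# For moderate z-scores both routes agree; two-sided p-value for z = 1.96
p_two_sided = 2 * norm.sf(abs(1.96))
print(p_two_sided)
```

For everyday z-scores the difference is negligible, so this only matters when p-values get very small.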

Solution 2:

I think the cumulative distribution function (cdf) is preferable to the survivor function. The survivor function is defined as 1 - cdf, and may miscommunicate the assumptions about which direction a percentile is measured in. Also, the percent point function (ppf) is the inverse of the cdf, which is very convenient.

>>> import scipy.stats as st
>>> st.norm.ppf(.95)
1.6448536269514722
>>> st.norm.cdf(1.64)
0.94949741652589625

Edit: A user requested an example for "vectors":

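import numpy as np
import scipy.stats as st

vector = np.array([.925, .95, .975, .99])
# ppf maps each probability to its quantile (a z-score, despite the name p_values);
# cdf maps each quantile back to the original probability
p_values = [st.norm.ppf(v) for v in vector]
f_values = [st.norm.cdf(p) for p in p_values]

for p, f in zip(p_values, f_values):
    print(f'p: {p}, \tf: {f}')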

Yields:

p: 1.4395314709384563,  f: 0.925
p: 1.6448536269514722,  f: 0.95
p: 1.959963984540054,   f: 0.975
p: 2.3263478740408408,  f: 0.99
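As a side note, ppf and cdf accept NumPy arrays directly, so the list comprehensions are optional. A minimal sketch of the same round trip; the names probs, quantiles, and recovered are my own, chosen for illustration:

```python
import numpy as np
import scipy.stats as st

probs = np.array([0.925, 0.95, 0.975, 0.99])

# ppf is applied element-wise: probability -> quantile (z-score)
quantiles = st.norm.ppf(probs)

# cdf is the inverse mapping: quantile -> probability
recovered = st.norm.cdf(quantiles)

for q, p in zip(quantiles, recovered):
    print(f"quantile: {q:.4f}\tprobability: {p:.4f}")
```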

Solution 3:

Aha! I found it: scipy.special.ndtr! It also appears under scipy.stats.stats.zprob (which is just a pointer to ndtr), though that alias may not be available in recent SciPy releases.

Specifically, given a one-dimensional numpy.array instance z_scores, one can obtain the p-values as

import scipy.special

p_values = 1 - scipy.special.ndtr(z_scores)

or alternatively, avoiding the subtraction (which loses precision for large z-scores)

p_values = scipy.special.ndtr(-z_scores)
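Note that these expressions give the upper-tail probability P(Z > z), so a negative z-score yields a value above 0.5. A minimal sketch, assuming signed z-scores in a NumPy array (the sample values are made up), that contrasts this with a two-sided p-value:

```python
import numpy as np
from scipy.special import ndtr
from scipy.stats import norm

z_scores = np.array([-2.5, -1.0, 0.0, 1.0, 2.5])  # illustrative values

# Upper-tail probability P(Z > z); equivalent to norm.sf(z_scores)
upper_tail = ndtr(-z_scores)

# Two-sided p-value P(|Z| > |z|); symmetric in the sign of z
two_sided = 2 * ndtr(-np.abs(z_scores))

print(upper_tail)
print(two_sided)
```

Which of the two is appropriate depends on the hypothesis being tested.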

Solution 4:

Starting with Python 3.8, the standard library provides the NormalDist object as part of the statistics module.

It can be used to apply the inverse cumulative distribution function (inv_cdf, also known as the quantile function or the percent-point function) and the cumulative distribution function (cdf):

from statistics import NormalDist

NormalDist().inv_cdf(0.95)
# 1.6448536269514715
NormalDist().cdf(1.64)
# 0.9494974165258963
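To get an actual p-value from a z-score with NormalDist, one option is to evaluate the cdf at -|z|. This is a minimal sketch; z_to_p is a hypothetical helper name of my own, not part of the statistics module.

```python
from statistics import NormalDist

def z_to_p(z, two_sided=True):
    """Hypothetical helper: p-value for a z-score under the standard normal."""
    std_normal = NormalDist()  # defaults to mu=0, sigma=1
    # By symmetry, cdf(-|z|) is the probability in the tail beyond |z|
    tail = std_normal.cdf(-abs(z))
    return 2 * tail if two_sided else tail

p_two = z_to_p(1.96)                   # close to 0.05
p_one = z_to_p(1.96, two_sided=False)  # close to 0.025
print(p_two, p_one)
```

NormalDist works on single floats, so for arrays of z-scores the SciPy solutions above are likely more convenient.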