SciPy normaltest: how is it used?

In [12]: import scipy.stats as stats

In [13]: x = stats.norm.rvs(size = 100)

In [14]: stats.normaltest(x)
Out[14]: (1.627533590094232, 0.44318552909231262)

normaltest returns a 2-tuple of the test statistic and the associated p-value. The statistic combines the sample's skewness and kurtosis (D'Agostino and Pearson's test) and is approximately chi-squared distributed when the data are normal. Given the null hypothesis that x came from a normal distribution, the p-value is the probability of seeing a statistic that large (or larger).

If the p-value is very small, data like this would be unlikely under a normal distribution, so the null hypothesis is rejected. For example:
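To make the returned pair less of a black box, here is a minimal sketch that rebuilds it by hand. It assumes the statistic is the sum of the squared z-scores from stats.skewtest and stats.kurtosistest, with the p-value taken from the upper tail of a chi-squared distribution with 2 degrees of freedom. The seed and sample size are arbitrary choices for illustration.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = rng.normal(size=100)

# What normaltest reports
stat, pval = stats.normaltest(x)

# Rebuild the same numbers from the skewness and kurtosis tests
s, _ = stats.skewtest(x)       # z-score of the skewness test
k, _ = stats.kurtosistest(x)   # z-score of the kurtosis test
k2 = s**2 + k**2               # combined statistic
p_from_chi2 = stats.chi2.sf(k2, df=2)  # upper-tail probability, 2 degrees of freedom

print(stat, k2)
print(pval, p_from_chi2)
```

Both printed pairs should agree up to floating-point error.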

In [15]: y = stats.uniform.rvs(size = 100)

In [16]: stats.normaltest(y)
Out[16]: (31.487039026711866, 1.4543748291516241e-07)

First, I found that scipy.stats.mstats.normaltest is almost the same as scipy.stats.normaltest. The mstats module is for masked arrays: arrays in which you can mark values as invalid so that they are left out of the calculation.

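import numpy as np
import numpy.ma as ma
from scipy.stats import mstats

# The test needs more than 20 valid values to be reliable, so use 30 here.
x = np.random.normal(loc=5.0, scale=2.0, size=30)
x[3] = -1  # pretend this entry is an invalid reading
mask = np.zeros(30, dtype=bool)
mask[3] = True  # masked entries are left out of the calculation
mx = ma.masked_array(x, mask=mask)
z, pval = mstats.normaltest(mx)

if pval < 0.05:
    print("Not normal distribution")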
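One way to check that the masked and plain versions behave alike is to run both on the same valid values. This is only a sketch: the seed, sample size, and masked positions are made up for illustration, and it assumes that mstats.normaltest simply ignores masked entries.

```python
import numpy as np
import numpy.ma as ma
import scipy.stats as stats
from scipy.stats import mstats

rng = np.random.default_rng(1)  # arbitrary seed
x = rng.normal(size=50)

# Mark two entries as invalid (positions chosen arbitrarily)
mask = np.zeros(x.size, dtype=bool)
mask[[3, 17]] = True
mx = ma.masked_array(x, mask=mask)

# Masked version on the masked array
masked_stat, masked_p = mstats.normaltest(mx)

# Plain version on only the unmasked values
plain_stat, plain_p = stats.normaltest(mx.compressed())

print(masked_stat, plain_stat)
print(masked_p, plain_p)
```

If the assumption holds, the two lines print matching values, because both calls see the same 48 valid numbers.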

"Traditionally, in statistics, you need a p-value of less than 0.05 to reject the null hypothesis." - http://mathforum.org/library/drmath/view/72065.html
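The 0.05 rule from the quote can be wrapped in a small helper. The function name rejects_normality and the alpha parameter are made up for this sketch and are not part of SciPy; the two samples are built deterministically so the outcome does not depend on a random draw.

```python
import numpy as np
import scipy.stats as stats

def rejects_normality(data, alpha=0.05):
    """Return True if normaltest rejects the null hypothesis at level alpha.

    Hypothetical helper for illustration; alpha=0.05 is the conventional
    threshold, not something SciPy enforces.
    """
    _, pval = stats.normaltest(data)
    return bool(pval < alpha)

# Evenly spaced points on [0, 1]: flat like a uniform sample
flat = np.linspace(0.0, 1.0, 1000)

# Evenly spaced normal quantiles: shaped like a normal sample
bell = stats.norm.ppf((np.arange(1, 201) - 0.5) / 200)

print(rejects_normality(flat))
print(rejects_normality(bell))
```

Note that failing to reject does not prove the data are normal; it only means the test found no strong evidence against it.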
