Shapiro-Wilk Test

tags: #statistics/inferential/assumption_check

H0: The data is normally distributed

HA: The data is not normally distributed

We can test the assumption of normality of the DV within each independent group using the scipy.stats.shapiro() function:

from scipy.stats import shapiro

To run:

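# note: before running shapiro, create a subset of the DV for each IV group

# select the DV values for one group of the categorical variable
# (`group` is one of the values in df['categorical_var'].unique())
subgroup = df[df['categorical_var'] == group]['dv']

# run the Shapiro-Wilk test on that group's DV values
shapiro(subgroup)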
This returns a tuple-like result of (W test statistic, p-value).

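If the p-value > 0.05, we fail to reject H0: the test finds no significant evidence that the data deviate from a normal distribution (this does not prove the data are normal, only that the assumption is not contradicted). If the p-value ≤ 0.05, we reject H0, which suggests that the data deviate from the normal distribution.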
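To check every IV group in one pass, the subset-and-test steps can be wrapped in a loop. The following is a minimal sketch, not a definitive implementation: the helper name shapiro_by_group, the column names, and the synthetic data are all assumptions made for illustration, and 0.05 is simply the threshold used in this note.

```python
import numpy as np
import pandas as pd
from scipy.stats import shapiro

ALPHA = 0.05  # significance threshold used in this note


def shapiro_by_group(df, group_col, dv_col, alpha=ALPHA):
    """Run the Shapiro-Wilk test on the DV within each group.

    Hypothetical helper: returns one row per group with the W statistic,
    the p-value, and whether H0 (normality) is rejected at `alpha`.
    """
    rows = []
    for group, values in df.groupby(group_col)[dv_col]:
        values = values.dropna()  # drop missing DV values before testing
        stat, p_value = shapiro(values)
        rows.append({
            "group": group,
            "n": len(values),
            "W": stat,
            "p_value": p_value,
            "reject_H0": bool(p_value < alpha),
        })
    return pd.DataFrame(rows)


# Synthetic data for illustration only: group "a" is drawn from a normal
# distribution, group "b" from a skewed (exponential) distribution.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "categorical_var": ["a"] * 50 + ["b"] * 50,
    "dv": np.concatenate([rng.normal(10, 2, 50), rng.exponential(2, 50)]),
})

results = shapiro_by_group(df, "categorical_var", "dv")
print(results)
```

Each row with reject_H0 equal to True flags a group whose DV deviates significantly from normality; a False value only means the test found no evidence against normality for that group.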

