Effect Size
tags: #statistics/inferential/ttest
Two common measures of effect size for t-tests are:
- Cohen's d
- Pearson's r
Cohen's d
Cohen's d is a standardized effect size that expresses the difference between two group means in standard-deviation units. It is used to judge the practical significance of a difference once a t-test has returned a significant p-value. The conventional benchmarks are:
- Small effect = 0.2
- Medium effect = 0.5
- Large effect = 0.8
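As a rough illustration, the benchmarks above can be turned into a small labelling helper. The function name `label_cohens_d`, the use of the absolute value, and the treatment of each benchmark as a lower cutoff (with anything under 0.2 labelled "negligible") are assumptions made for this sketch, not part of the conventions themselves.

```python
def label_cohens_d(d):
    """Map a Cohen's d value to a rough verbal label (hypothetical helper)."""
    size = abs(d)  # the sign only indicates the direction of the difference
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumption: below the smallest benchmark


for value in (0.1, 0.35, -0.6, 1.2):
    print(value, label_cohens_d(value))
```

Values that fall between benchmarks are a judgment call, so the labels are best read as a guide rather than a strict classification.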
There are two ways to compute Cohen's d, depending on the sample standard deviations (SDs):
- If the SDs of the two samples are different
- If the SDs of the two samples are the same
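One common way to write the two cases is sketched below. The symbols $M_1, M_2$ (group means), $SD_1, SD_2$ (group standard deviations), and $n_1, n_2$ (group sizes) are notation assumed here. When the SDs differ, the mean difference is divided by a pooled SD; when they are the same, the shared SD is used directly.

```latex
% SDs differ: divide the mean difference by the pooled SD
d = \frac{M_1 - M_2}{SD_{\text{pooled}}},
\qquad
SD_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)\,SD_1^2 + (n_2 - 1)\,SD_2^2}{n_1 + n_2 - 2}}

% SDs are the same (SD_1 = SD_2 = SD)
d = \frac{M_1 - M_2}{SD}
```

With equal group sizes, the pooled SD reduces to $\sqrt{(SD_1^2 + SD_2^2)/2}$.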
Computing Cohen's d in Python
Neither NumPy nor SciPy provides a dedicated function for Cohen's d; however, we can write one that divides the mean difference by the pooled standard deviation:
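import numpy as np

def cohens_d(group1, group2):
    # compute mean difference between two groups
    mean_diff = np.mean(group1) - np.mean(group2)
    # group sizes and sample variances (ddof=1 gives the n-1 denominator)
    n1, n2 = len(group1), len(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
    # compute pooled sd, weighting each variance by its degrees of freedom
    sd_pooled = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    d = mean_diff / sd_pooled
    return d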
Sample Code:
import numpy as np
# Example data
group1 = np.array([1, 2, 3, 4, 5])
group2 = np.array([3, 4, 5, 6, 7])
# Compute the mean difference between the groups
mean_diff = np.mean(group1) - np.mean(group2)
# Compute the pooled standard deviation
n1, n2 = len(group1), len(group2)
std1, std2 = np.std(group1, ddof=1), np.std(group2, ddof=1)
pooled_std = np.sqrt(((n1-1)*std1**2 + (n2-1)*std2**2) / (n1+n2-2))
# Compute Cohen's d
d = mean_diff / pooled_std
print("Cohen's d:", d)
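As a cross-check, Cohen's d can also be recovered from the independent-samples t statistic as d = t * sqrt(1/n1 + 1/n2), provided the t-test assumes equal variances (the default behaviour of `scipy.stats.ttest_ind`). The sketch below reuses the example data; the variable names `d_direct` and `d_from_t` are illustrative.

```python
import numpy as np
from scipy import stats

group1 = np.array([1, 2, 3, 4, 5])
group2 = np.array([3, 4, 5, 6, 7])
n1, n2 = len(group1), len(group2)

# Direct computation using the pooled standard deviation
pooled_std = np.sqrt(((n1 - 1) * np.var(group1, ddof=1)
                      + (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2))
d_direct = (np.mean(group1) - np.mean(group2)) / pooled_std

# Student's t-test (equal_var=True is the default), then convert t to d
t_stat, p_value = stats.ttest_ind(group1, group2)
d_from_t = t_stat * np.sqrt(1 / n1 + 1 / n2)

print("d (direct):", d_direct)
print("d (from t):", d_from_t)
```

The two values should agree up to floating-point error; the conversion does not hold for Welch's t-test (`equal_var=False`).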
Pearson's r
Pearson's r summarises the strength and direction of a linear relationship between two variables.
To compute Pearson's r for effect size in Python, you can use the pearsonr function from the scipy.stats module. This function takes two arrays of data as input and returns Pearson's correlation coefficient (r) and a p-value:
from scipy.stats import pearsonr
# Example data
x = [1, 2, 3, 4, 5] # variable 1
y = [2, 4, 6, 8, 10] # variable 2
# Compute Pearson's r and p-value
r, p_value = pearsonr(x, y)
# Print the result
print("Pearson's r =", r)
The resulting r ranges from -1 to 1: the sign gives the direction of the linear relationship between the two variables, and the magnitude gives its strength.
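When r is reported as an effect size for an independent-samples t-test, it is commonly derived from the t statistic as r = sqrt(t^2 / (t^2 + df)), with df = n1 + n2 - 2. The sketch below checks this against `pearsonr` applied to a 0/1 group indicator (the point-biserial correlation). The data and variable names are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

group1 = np.array([1, 2, 3, 4, 5])
group2 = np.array([3, 4, 5, 6, 7])
df = len(group1) + len(group2) - 2

# Effect size r derived from the (equal-variance) t statistic
t_stat, _ = stats.ttest_ind(group1, group2)
r_from_t = np.sqrt(t_stat**2 / (t_stat**2 + df))

# Point-biserial check: correlate the values with group membership (0 or 1)
values = np.concatenate([group1, group2])
labels = np.concatenate([np.zeros(len(group1)), np.ones(len(group2))])
r_pb, _ = stats.pearsonr(labels, values)

print("r from t:", r_from_t)
print("|r| point-biserial:", abs(r_pb))
```

The formula yields only the magnitude of r; the sign follows from which group has the larger mean.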
| Effect size | Pearson's r |
|---|---|
| Small | 0.1 to 0.3 or -0.1 to -0.3 |
| Medium | 0.3 to 0.5 or -0.3 to -0.5 |
| Large |