Effect Size

Two common measures of effect size for t-tests are:

Cohen's d
Pearson's r

Cohen's d

Cohen's d is a standardized effect size that expresses the difference between two group means in units of standard deviation. It is used to judge the practical significance of that difference, typically after a t-test has returned a significant p-value.

Rule of Thumb: Interpreting Cohen's d

Small effect = 0.2
Medium effect = 0.5
Large effect = 0.8
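
The rule of thumb above can be sketched as a small helper. This is a minimal illustration, not a standard library function: the name interpret_cohens_d is hypothetical, the three values are treated as the lower edge of each band, the sign of d is ignored, and the "negligible" label for values below 0.2 is an assumption that is not part of the rule of thumb.

```python
def interpret_cohens_d(d):
    """Map the magnitude of d to a rule-of-thumb label (band edges are an assumption)."""
    size = abs(d)  # direction of the difference does not affect the label
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumed label; the rule of thumb does not name this band


# Hypothetical d values, used only to show the labels
for d in (0.1, 0.35, -0.6, 1.2):
    print(d, interpret_cohens_d(d))
```

These cut-offs are conventions rather than strict boundaries, so a label is best read alongside the d value itself.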

There are two ways to compute Cohen's d:

If the SDs of the two samples are different

$\text{Cohen's } d = \frac{\text{mean}_{1} - \text{mean}_{2}}{SD_{\text{pooled}}}$, where $SD_{\text{pooled}} = \sqrt{\frac{(n_{1} - 1) SD_{1}^{2} + (n_{2} - 1) SD_{2}^{2}}{n_{1} + n_{2} - 2}}$

If the SDs of the two samples are the same

$\text{Cohen's } d = \frac{\text{mean}_{1} - \text{mean}_{2}}{SD}$, where $SD$ is the standard deviation shared by both samples

Computing Cohen's d in Python

Neither NumPy nor SciPy provides a dedicated function for Cohen's d; however, we can write one that applies the pooled-SD formula:

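import numpy as np

def cohens_d(group1, group2):
    # compute the mean difference between the two groups
    mean_diff = np.mean(group1) - np.mean(group2)

    # compute the pooled sd from the sample variances (ddof=1),
    # weighting each variance by its degrees of freedom
    n1, n2 = len(group1), len(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
    sd_pooled = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))

    # standardize the mean difference by the pooled sd
    d = mean_diff / sd_pooled

    return d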

Sample Code:

import numpy as np

# Example data
group1 = np.array([1, 2, 3, 4, 5])
group2 = np.array([3, 4, 5, 6, 7])

# Compute the mean difference between the groups
mean_diff = np.mean(group1) - np.mean(group2)

# Compute the pooled standard deviation
n1, n2 = len(group1), len(group2)
std1, std2 = np.std(group1, ddof=1), np.std(group2, ddof=1)
pooled_std = np.sqrt(((n1-1)*std1**2 + (n2-1)*std2**2) / (n1+n2-2))

# Compute Cohen's d (stored in d so the name cohens_d stays free for the function)
d = mean_diff / pooled_std

print("Cohen's d:", d)  # approximately -1.26
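
As a cross-check, Cohen's d for two independent groups can also be recovered from the pooled-variance t statistic, since t equals the mean difference divided by SD_pooled * sqrt(1/n1 + 1/n2). The sketch below reuses the example data and assumes Student's t-test with equal variances (equal_var=True in scipy.stats.ttest_ind); the names d_from_t and d_direct are introduced here for illustration only.

```python
import numpy as np
from scipy import stats

# Same example data as in the sample code
group1 = np.array([1, 2, 3, 4, 5])
group2 = np.array([3, 4, 5, 6, 7])
n1, n2 = len(group1), len(group2)

# Student's t-test; equal_var=True pools the two sample variances
t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=True)

# Recover d from t: d = t * sqrt(1/n1 + 1/n2)
d_from_t = t_stat * np.sqrt(1 / n1 + 1 / n2)

# Direct computation from the pooled standard deviation
pooled_var = ((n1 - 1) * np.var(group1, ddof=1)
              + (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2)
d_direct = (np.mean(group1) - np.mean(group2)) / np.sqrt(pooled_var)

print("d from t statistic:", d_from_t)
print("d computed directly:", d_direct)
```

The two values should agree up to floating-point error. The identity does not hold for Welch's t-test (equal_var=False), which does not pool the variances.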

Pearson's r

Pearson's r summarises the strength and direction of the linear relationship between two variables.

To compute Pearson's r for effect size in Python, you can use the pearsonr function from the scipy.stats module. This function takes two arrays of data as input and returns Pearson's correlation coefficient (r) and a p-value:

from scipy.stats import pearsonr

# Example data
x = [1, 2, 3, 4, 5] # variable 1
y = [2, 4, 6, 8, 10] # variable 2

# Compute Pearson's r and p-value
r, p_value = pearsonr(x, y)

# Print the result
print("Pearson's r =", r)

The resulting r value lies between -1 and 1: its magnitude represents the strength of the linear relationship between the two variables, and its sign gives the direction.
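
To see what pearsonr computes, r can be reproduced by hand as the sum of cross-products of the mean-centred variables divided by the square root of the product of their sums of squares. The following is a minimal sketch; the data are hypothetical (chosen so that r is not exactly 1) and the names dx, dy and r_manual are introduced here for illustration.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data, not taken from the example above
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

# Deviations of each observation from its variable's mean
dx = x - x.mean()
dy = y - y.mean()

# Pearson's r: sum of cross-products over the square root of the
# product of the two sums of squares
r_manual = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

# Reference value from SciPy
r_scipy, p_value = pearsonr(x, y)

print("manual r:", r_manual)
print("scipy r: ", r_scipy)
```

Both values should match up to floating-point error, which confirms that the magnitude of r depends only on how the two variables vary together relative to how much each varies on its own.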

Rule of Thumb: Interpreting Pearson's r

Effect size	Pearson's r
Small	0.1 to 0.3 or -0.1 to -0.3
Medium	0.3 to 0.5 or -0.3 to -0.5
Large	$\geq 0.5$ or $\leq -0.5$