Introduction to Hypothesis Testing

tags: #statistical_application #hypothesis_testing

What is a hypothesis?

A hypothesis is an assumption about a population parameter which we will either support or reject based on empirical evidence.

Hypothesis Testing

Hypothesis testing is a form of inferential statistics[1] in which we make generalizations (inferences) about a population parameter based on a sample statistic, by quantifying the evidence against the null hypothesis.

Standard Method: Null-Hypothesis Significance Testing (NHST)

Null-Hypothesis Significance Testing

NHST combines two ideas: Fisher's p-value, which quantifies the evidence against the null, and the Neyman-Pearson framework of competing hypotheses, which prescribes what to do with that measure when deciding whether to accept or reject the null.

Neyman-Pearson

Following the Neyman-Pearson approach to hypothesis testing, NHST considers two competing hypotheses: the null hypothesis (H0) and the alternative hypothesis (HA).

Significance is assessed under the assumption that the null is true; the test therefore quantifies evidence against the null.

Fisher's P-Value

Fisher's p-value is used to determine the statistical significance of a result.

The p-value is the probability of observing a sample statistic that is at least as extreme as the observed statistic, assuming the null is true.

If observations are sufficiently unlikely from the POV of the null hypothesis, this should be treated as evidence against the null.
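As a minimal illustration of this definition, assume a hypothetical experiment (not from the lecture) in which a coin is flipped 20 times and lands heads 16 times. Under the null that the coin is fair, the p-value for "at least this many heads" is the upper-tail probability of a Binomial(20, 0.5) distribution. The function name and the numbers are made up for the sketch.

```python
from math import comb


def binomial_upper_p_value(n_flips: int, observed_heads: int, p_null: float = 0.5) -> float:
    """P(X >= observed_heads) when X ~ Binomial(n_flips, p_null)."""
    return sum(
        comb(n_flips, k) * p_null**k * (1 - p_null) ** (n_flips - k)
        for k in range(observed_heads, n_flips + 1)
    )


# Hypothetical data: 16 heads in 20 flips, tested against a fair coin.
p_value = binomial_upper_p_value(n_flips=20, observed_heads=16)
print(f"p-value = {p_value:.4f}")
```

A small p-value here means that 16 or more heads would be rare if the coin were fair, which is what "evidence against the null" refers to.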


Overview: Conducting a Hypothesis Test

Pre-check: Assumptions and Conditions

Before conducting any statistical test, make sure all assumptions and conditions are satisfied. Otherwise, results will not be interpretable.

Step 1: Generate a hypothesis model

Guidelines for Generating a Hypothesis Model

  • Hypothesis is always expressed in terms of population parameters[2]
  • Null hypothesis is expressed as a statement of equality
  • Direction of the alternative hypothesis depends on the context of the question - i.e., whether you want to test whether the population parameter is different, greater, or smaller than your claim

Null Hypothesis (H0)

The Null Hypothesis is a point-value assumption about the true value of the population parameter.

Posits that the observed phenomenon is due to chance.

This is the assumption we want to "disprove", but is assumed correct unless there is evidence to oppose it.

Null hypothesis assumes no difference or change. Expressed as a statement of equality, such that:

H0: Parameter = Claim

Alternative Hypothesis (HA)

This is what you are testing for - that the true parameter does NOT equal the value claimed by the null.

Posits that the null is not true and observed phenomenon is NOT due to chance.

Represents what is logically implied when the null is False.

Expressed in one of the 3 statements about the population parameter depending on the directionality of the test:

HA: Parameter < Claim (One-tail Lower)
HA: Parameter > Claim (One-tail Upper)
HA: Parameter ≠ Claim (Two-tailed)

Step 2: Alpha Significance Level

The alpha significance level is the threshold for deciding whether the observed effect is statistically significant - i.e., too unlikely under the null to be attributed to chance alone. It does so by defining the Rejection (Critical) Regions.

This represents the maximum allowable probability of rejecting the null hypothesis when it is True, and specifies how strongly the sample evidence must contradict the null hypothesis before you can reject the null for the entire population.

The lower the significance level, the stronger the evidence required before you will reject the null.

The significance level should be set before the study - otherwise, it invites p-value hacking.
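To make the link between alpha and the Rejection (Critical) Regions concrete, here is a minimal sketch that assumes the test statistic follows a standard normal distribution under the null (a z-test). That distributional choice and the function name `critical_values` are assumptions for illustration, not part of the notes.

```python
from statistics import NormalDist


def critical_values(alpha: float, tail: str):
    """Return (lower_cut, upper_cut) of the rejection region for a z statistic.

    A cut that does not apply to the chosen tail is returned as None.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    std_normal = NormalDist()  # assumed null sampling distribution
    if tail == "lower":
        return (std_normal.inv_cdf(alpha), None)
    if tail == "upper":
        return (None, std_normal.inv_cdf(1 - alpha))
    if tail == "two":
        # Alpha is split evenly between the two tails.
        cut = std_normal.inv_cdf(1 - alpha / 2)
        return (-cut, cut)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


for level in (0.10, 0.05, 0.01):
    print(level, critical_values(level, "two"))
```

Lowering alpha pushes the cut-offs further into the tails, so the sample has to contradict the null more strongly before it lands in the rejection region.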

P-value Hacking

This is an exploitation of data analysis in order to discover patterns which would be presented as statistically significant, when in reality, there is no underlying effect.
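One common form of this is testing many outcomes and reporting only the one that "worked". The simulation below is a sketch under stated assumptions: every null is true by construction, each outcome is checked with a two-sided z-test with known standard deviation, and all names and sizes (20 outcomes, 20 observations, 1000 studies) are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p_value(sample, mu_null=0.0, sigma=1.0):
    """Two-sided z-test p-value, assuming a known standard deviation."""
    z = (fmean(sample) - mu_null) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def share_of_studies_with_a_hit(n_studies, outcomes_per_study, n=20, alpha=0.05, seed=0):
    """Fraction of studies in which at least one outcome has p < alpha.

    All data are drawn from the null (mean 0), so every hit is a false positive.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = [
            two_sided_p_value([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(outcomes_per_study)
        ]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


honest_rate = share_of_studies_with_a_hit(n_studies=1000, outcomes_per_study=1)
hacked_rate = share_of_studies_with_a_hit(n_studies=1000, outcomes_per_study=20)
print(f"one pre-specified outcome: {honest_rate:.3f}")
print(f"best of 20 outcomes:       {hacked_rate:.3f}")
```

With a single pre-specified outcome the false-positive rate should sit near alpha. With 20 independent outcomes, the chance that at least one crosses the threshold is about 1 - 0.95^20, roughly 64%, even though no effect exists.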


Step 3: Testing The Hypothesis

Compute the test statistic

Hypothesis tests are conducted by computing a test statistic from the sample.

What is a Test Statistic?

The test statistic is a numerical summary of the data used to assess evidence against the null hypothesis as a measure of how far the observed data deviate from what would be expected under the null hypothesis.

The appropriate test statistic and its corresponding Null Sampling Distribution depend on:

  1. The type of data being dealt with; and
  2. Scope of the research question (i.e., context in which the study is being conducted)

This will dictate the type of experimental design to be conducted to answer the research question.
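As one concrete instance of a test statistic, the sketch below computes a one-sample z statistic: the distance between the sample mean and the null value, measured in standard errors. It assumes a known population standard deviation; the sample values, the null value of 30, and sigma = 2 are all hypothetical.

```python
from math import sqrt
from statistics import fmean


def z_statistic(sample, mu_null: float, sigma: float) -> float:
    """(sample mean - null value) / standard error, with sigma assumed known."""
    standard_error = sigma / sqrt(len(sample))
    return (fmean(sample) - mu_null) / standard_error


# Hypothetical measurements, tested against H0: mu = 30.
sample = [31.2, 29.8, 33.1, 30.5, 32.4, 28.9, 34.0, 31.7, 30.9]
z = z_statistic(sample, mu_null=30.0, sigma=2.0)
print(f"z = {z:.3f}")
```

Other data types and research questions call for other statistics (for example t, chi-square, or F), each with its own null sampling distribution.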

Determine the p-value

The p-value is a measure of statistical significance (i.e., whether the test statistic is statistically significant).

This represents the probability (fraction of times you would see) of observing a test statistic at least as extreme as your statistic under the null sampling distribution.

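Computing the p-value of the test statistic means computing a tail probability of the null sampling distribution. For a lower-tailed test, this is the cumulative probability of observing a test statistic smaller than or equal to your observed test statistic; for an upper-tailed test, it is the probability of one greater than or equal to it; for a two-tailed test, it covers both tails.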

Note: The exact method of computing the p-value depends on the type of hypothesis test and the distribution of the test statistic under the null hypothesis.
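How the direction of the test changes the computation can be sketched as follows, again assuming a z statistic whose null sampling distribution is standard normal. The function name `p_value_from_z` is hypothetical.

```python
from statistics import NormalDist


def p_value_from_z(z: float, tail: str) -> float:
    """Tail probability of a standard normal null distribution."""
    cdf = NormalDist().cdf
    if tail == "lower":
        return cdf(z)  # P(Z <= z)
    if tail == "upper":
        return 1 - cdf(z)  # P(Z >= z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # both tails, by symmetry
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


for tail in ("lower", "upper", "two"):
    print(tail, round(p_value_from_z(2.0, tail), 4))
```

The same observed statistic yields three different p-values, which is why the direction of the alternative has to be fixed when the hypothesis model is written down.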

[Figure: null sampling distribution P(x) with the observed test statistic marked. P-value of test statistic = cumulative probability of observing the test statistic.]


Step 4: Interpreting P-values Against the Alpha

See also: Interpreting P-values Against Alpha

It is important to remember that, regardless of the p-value obtained in the hypothesis test, the result says nothing directly about the truth or falsity of the alternative hypothesis, because the p-value is computed under the assumption that the null is true.
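The comparison this step refers to can be sketched as a simple decision rule. It assumes the common convention of rejecting the null when the p-value is less than or equal to alpha; the function name and the default alpha of 0.05 are illustrative choices.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value against a pre-specified significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Assumed convention: reject when p <= alpha.
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.03))
print(decide(0.20))
```

Either outcome is a statement about how compatible the data are with the null, not a proof of the null or of the alternative.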



  1. INF 1344 Statistics - Lecture 1 ↩︎

  2. This is because we are interested in making inferences about the population parameter ↩︎
