Pitfalls in Statistical Analysis

P-Hacking

Also called "fishing for significance" or "data dredging".

P-hacking is the practice of manipulating data or statistical analyses until a statistically significant result appears, even though that result may not reflect a true effect in the data or answer the research question being investigated.

P-hacking can include practices such as selectively choosing data or statistical tests, stopping data collection as soon as a result reaches significance, and re-analyzing data until a significant result is obtained. Each of these pushes the false positive rate above the nominal significance level, which can ultimately undermine the integrity of scientific research.
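One of these practices, stopping data collection once significance appears, can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are pure noise (the null hypothesis is true), the test is a two-sided z-test with known standard deviation 1, and the sample sizes, peeking interval, and function names are all hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist


def z_test_p_value(sample):
    """Two-sided p-value for H0: mean = 0, assuming known standard deviation 1."""
    z = sum(sample) / math.sqrt(len(sample))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def run_study(rng, max_n=100, peek_every=None, alpha=0.05):
    """Return True if the study declares significance. No true effect exists."""
    sample = []
    for i in range(1, max_n + 1):
        sample.append(rng.gauss(0, 1))
        # Optional stopping: test at each peek and stop as soon as p < alpha.
        if peek_every and i % peek_every == 0 and z_test_p_value(sample) < alpha:
            return True
    # Without an early stop, test once on the full sample.
    return z_test_p_value(sample) < alpha


def false_positive_rate(n_studies, peek_every, seed=0):
    """Share of simulated null studies that end with a significant result."""
    rng = random.Random(seed)
    hits = sum(run_study(rng, peek_every=peek_every) for _ in range(n_studies))
    return hits / n_studies


fixed_rate = false_positive_rate(2000, peek_every=None)   # one test at n = 100
peeking_rate = false_positive_rate(2000, peek_every=10)   # test after every 10 observations
print(f"fixed sample size: {fixed_rate:.3f}")
print(f"peeking every 10:  {peeking_rate:.3f}")
```

With a fixed sample size the false positive rate should stay close to the nominal 5%, while repeated peeking should push it well above that level, even though every individual test looks legitimate.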

Data-Dependent Analysis

Data-dependent analysis refers to making decisions about the analysis based on the data being analyzed, rather than pre-specifying a set of analyses before the data are collected or obtained. It can happen without any deliberate attempt to chase significance. This can lead to the "garden of forking paths" problem: many analysis paths were possible, the data determined which one was taken, and the reported p-value does not account for the paths not taken, potentially leading to false positive results.

For example, a researcher who finds a significant result may continue to analyze the data to explore potential mechanisms or moderators of the effect. Each additional test is another comparison, and without correction the chance of at least one false positive grows with the number of tests.
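The effect of multiple comparisons can be made concrete. For independent tests at significance level alpha, the chance of at least one false positive across k tests is 1 - (1 - alpha)^k, which is about 64% for 20 tests at alpha = 0.05. The sketch below checks this by simulation and compares it with a Bonferroni correction (testing each comparison at alpha / k). It assumes independent tests on pure noise and a z-test with known standard deviation 1; the function names and sample sizes are hypothetical.

```python
import random
from statistics import NormalDist


def null_p_value(rng, n=25):
    """Two-sided z-test p-value for one comparison where no true effect exists."""
    z = sum(rng.gauss(0, 1) for _ in range(n)) / n ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def familywise_error_rate(n_tests, alpha, n_studies=1000, seed=1):
    """Share of simulated studies with at least one p-value below alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = [null_p_value(rng) for _ in range(n_tests)]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


n_tests = 20
alpha = 0.05
expected = 1 - (1 - alpha) ** n_tests                       # analytic rate, independent tests
uncorrected = familywise_error_rate(n_tests, alpha)         # each test at alpha
bonferroni = familywise_error_rate(n_tests, alpha / n_tests)  # each test at alpha / k

print(f"analytic, uncorrected:  {expected:.3f}")
print(f"simulated, uncorrected: {uncorrected:.3f}")
print(f"simulated, Bonferroni:  {bonferroni:.3f}")
```

Bonferroni is only one possible correction and is conservative when tests are correlated; it is used here because it is the simplest to state.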

Findings reached this way often fail to reproduce: the result may not hold up when the same analysis is applied to new data or in different contexts.
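A rough way to see the reproducibility problem is to simulate an exploratory "discovery" and then re-test it on fresh data. The sketch below assumes that every candidate effect is truly zero, that the researcher tries 20 candidates and reports the best one, and that the confirmation is a single pre-specified test; all names and numbers are hypothetical.

```python
import random
from statistics import NormalDist


def p_value(sample):
    """Two-sided z-test p-value for H0: mean = 0, assuming known standard deviation 1."""
    z = sum(sample) / len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def draw(rng, n=25):
    """One sample of pure noise: every candidate effect is truly zero."""
    return [rng.gauss(0, 1) for _ in range(n)]


rng = random.Random(2)
discoveries = 0
replications = 0
for _ in range(1000):
    # Exploration: try 20 candidate effects and keep the best-looking one.
    best_p = min(p_value(draw(rng)) for _ in range(20))
    if best_p < 0.05:
        discoveries += 1
        # Confirmation: test that single effect once on newly collected data.
        if p_value(draw(rng)) < 0.05:
            replications += 1

replication_rate = replications / discoveries
print(f"exploratory discoveries: {discoveries} of 1000 studies")
print(f"share that replicate:    {replication_rate:.3f}")
```

Because no effect exists, most exploratory studies still produce a "discovery", yet only around the nominal 5% of those should survive a single confirmatory test on new data.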