The Harmonic Mean P-Value for Combining Dependent Tests. Wilson, D. J. 116(4):1195–1200.
[Significance] The widespread use of Bonferroni correction encumbers the scientific process and wastes opportunities for discovery presented by big data, because it discourages exploratory analyses by overpenalizing the total number of statistical tests performed. In this paper, I introduce the harmonic mean p-value (HMP), a simple to use and widely applicable alternative to Bonferroni correction motivated by Bayesian model averaging that greatly improves statistical power while maintaining control of the gold standard false positive rate. The HMP has a range of desirable properties and offers a different way to think about large-scale exploratory data analysis in classical statistics.

[Abstract] Analysis of “big data” frequently involves statistical comparison of millions of competing hypotheses to discover hidden processes underlying observed patterns of data, for example, in the search for genetic determinants of disease in genome-wide association studies (GWAS). Controlling the familywise error rate (FWER) is considered the strongest protection against false positives but makes it difficult to reach the multiple testing-corrected significance threshold. Here, I introduce the harmonic mean p-value (HMP), which controls the FWER while greatly improving statistical power by combining dependent tests using generalized central limit theorem. I show that the HMP effortlessly combines information to detect statistically significant signals among groups of individually nonsignificant hypotheses in examples of a human GWAS for neuroticism and a joint human–pathogen GWAS for hepatitis C viral load. The HMP simultaneously tests all ways to group hypotheses, allowing the smallest groups of hypotheses that retain significance to be sought. The power of the HMP to detect significant hypothesis groups is greater than the power of the Benjamini–Hochberg procedure to detect significant hypotheses, although the latter only controls the weaker false discovery rate (FDR). The HMP has broad implications for the analysis of large datasets, because it enhances the potential for scientific discovery.
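As a rough illustration of the statistic the abstract names, the following minimal sketch computes a weighted harmonic mean of p-values, sum(w) / sum(w / p). The function name `harmonic_mean_p` and the default of equal weights are assumptions made for this example and are not taken from the paper or its software. The sketch returns only the raw harmonic mean; the calibration against the generalized central limit theorem that the abstract refers to is not reproduced here, so the result should not be read as an exact combined p-value.

```python
from typing import Optional, Sequence


def harmonic_mean_p(p_values: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Return the weighted harmonic mean of p-values: sum(w) / sum(w / p).

    Hypothetical helper for illustration only. Equal weights are assumed
    when none are given. No significance calibration is applied.
    """
    if len(p_values) == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        # Assumption: equal weighting of all tests.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")
    if any(w < 0.0 for w in weights) or sum(weights) <= 0.0:
        raise ValueError("weights must be non-negative with a positive sum")
    return sum(weights) / sum(w / p for w, p in zip(weights, p_values))


# Example: three p-values combined with equal weights.
combined = harmonic_mean_p([0.01, 0.02, 0.04])
print(combined)
```

The harmonic mean is dominated by the smallest p-values, which is why a group of tests can yield a small combined value even when most of its members are unremarkable.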
@article{wilsonHarmonicMeanPvalue2019,
  title = {The Harmonic Mean P-Value for Combining Dependent Tests},
  author = {Wilson, Daniel J.},
  date = {2019-01},
  journaltitle = {Proceedings of the National Academy of Sciences},
  volume = {116},
  pages = {1195--1200},
  issn = {0027-8424},
  doi = {10.1073/pnas.1814092116},
  url = {https://doi.org/10.1073/pnas.1814092116},
  abstract = {[Significance] The widespread use of Bonferroni correction encumbers the scientific process and wastes opportunities for discovery presented by big data, because it discourages exploratory analyses by overpenalizing the total number of statistical tests performed. In this paper, I introduce the harmonic mean p-value (HMP), a simple to use and widely applicable alternative to Bonferroni correction motivated by Bayesian model averaging that greatly improves statistical power while maintaining control of the gold standard false positive rate. The HMP has a range of desirable properties and offers a different way to think about large-scale exploratory data analysis in classical statistics.

[Abstract] Analysis of “big data” frequently involves statistical comparison of millions of competing hypotheses to discover hidden processes underlying observed patterns of data, for example, in the search for genetic determinants of disease in genome-wide association studies (GWAS). Controlling the familywise error rate (FWER) is considered the strongest protection against false positives but makes it difficult to reach the multiple testing-corrected significance threshold. Here, I introduce the harmonic mean p-value (HMP), which controls the FWER while greatly improving statistical power by combining dependent tests using generalized central limit theorem. I show that the HMP effortlessly combines information to detect statistically significant signals among groups of individually nonsignificant hypotheses in examples of a human GWAS for neuroticism and a joint human–pathogen GWAS for hepatitis C viral load. The HMP simultaneously tests all ways to group hypotheses, allowing the smallest groups of hypotheses that retain significance to be sought. The power of the HMP to detect significant hypothesis groups is greater than the power of the Benjamini–Hochberg procedure to detect significant hypotheses, although the latter only controls the weaker false discovery rate (FDR). The HMP has broad implications for the analysis of large datasets, because it enhances the potential for scientific discovery.},
  keywords = {*imported-from-citeulike-INRMM,~INRMM-MiD:c-14682333,aggregated-indices,multiplicity,p-value,predictor-selection,protocol-uncertainty,statistics,uncertainty-propagation},
  number = {4}
}
