Are you using your p-values wrong?


As an applied biostatistician, I have worked with many researchers who explore many different analyses in search of something “statistically significant” (p<0.05). They tend to report every result with a p-value below 0.05 and pay less attention to clinical significance. So, what do p-values really mean? What can they tell us? Are results that cross the threshold of “statistical significance” (p<0.05) really all that significant (in a non-statistical sense)?

Now, the American Statistical Association (ASA) has released a “Statement on Statistical Significance and P-Values” with six principles underlying the proper use and interpretation of the p-value. The ASA issued this guidance to improve the conduct and interpretation of quantitative science and to inform the growing emphasis on the reproducibility of scientific research. The statement also notes that the increased quantification of scientific research and a proliferation of large, complex data sets have expanded the scope for statistics and the importance of appropriately chosen techniques, properly conducted analyses, and correct interpretation.

The six principles, many of which address misconceptions and misuse of the p-value, are as follows:

  1. P-values can indicate how incompatible the data are with a specified statistical model. 
  2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
  3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold. 
  4. Proper inference requires full reporting and transparency. 
  5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
  6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

According to the ASA statement: “Informally, a p-value is the probability under a specified statistical model that a statistical summary of the data (for example, the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.”
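To make the informal definition concrete, here is a minimal sketch of a permutation test in Python. It is only one way to illustrate the idea, not a method prescribed by the ASA statement. The "specified statistical model" assumed here is that group labels are exchangeable (no difference between groups), the statistical summary is the difference in sample means, and the function name `permutation_p_value` and the measurements are made up for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided permutation p-value for a difference in means.

    Assumed model: group labels are exchangeable, i.e. there is no
    difference between the groups.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    as_extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random, as the assumed model allows.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Count summaries equal to or more extreme than the observed one.
        if diff >= observed:
            as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements, invented purely for illustration.
treatment = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
control = [4.8, 5.0, 4.7, 5.2, 4.9, 5.1]

p = permutation_p_value(treatment, control)
print(f"permutation p-value: {p:.4f}")
```

The returned number is the share of label shuffles whose mean difference is at least as large as the observed one. It describes how unusual the data are under the assumed model; it is not the probability that the model or the hypothesis is true.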

In light of the misuses of and misconceptions about p-values, statisticians should supplement p-values with other approaches. The p-value is only one of many criteria for statistical inference. Other approaches include methods that emphasize estimation over testing, such as confidence, credibility, or prediction intervals; Bayesian methods; alternative measures of evidence, such as likelihood ratios or Bayes factors; and approaches such as decision-theoretic modeling and false discovery rates. The scientific community should apply the six principles to draw better inferences from the data it collects.
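As one example of estimation over testing, the sketch below reports the size of a difference together with a bootstrap percentile confidence interval. This is a simple illustration under stated assumptions, not a recommendation from the ASA statement: the percentile bootstrap is only one of several interval methods, the function name `bootstrap_ci_mean_diff` is hypothetical, and the measurements are invented.

```python
import random
from statistics import mean


def bootstrap_ci_mean_diff(group_a, group_b, n_resamples=5_000,
                           level=0.95, seed=0):
    """Return (estimate, lower, upper) for the difference in means.

    Uses a simple percentile bootstrap: resample each group with
    replacement and take percentiles of the resampled differences.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()

    alpha = 1.0 - level
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    estimate = mean(group_a) - mean(group_b)
    return estimate, lower, upper


# Hypothetical measurements, invented purely for illustration.
treatment = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
control = [4.8, 5.0, 4.7, 5.2, 4.9, 5.1]

estimate, lower, upper = bootstrap_ci_mean_diff(treatment, control)
print(f"mean difference: {estimate:.2f} "
      f"(95% bootstrap CI {lower:.2f} to {upper:.2f})")
```

Unlike a bare p-value, the estimate and its interval show how large the effect might plausibly be, which is the information needed to judge clinical significance.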
