As students of epidemiology or people working in public health, we see the “p-value” in almost every paper we read. Yet I have met many people who know only one thing about it: a p-value less than 0.05 is “statistically significant” and a p-value greater than 0.05 is “not statistically significant”.
Today I want to discuss what a p-value is and 3 “must-know” things about it for public health researchers.
What is a p-value?
The letter p stands for probability. So a p-value is a probability, and like all probabilities it lies between 0 and 1. The next question that automatically comes to mind is “probability of what?” It is the probability, calculated on the assumption that the null hypothesis is true (that is, there is no real difference between the groups), of obtaining a difference from the null value at least as large as the one observed purely by chance, more precisely through sampling variability. In other words, the p-value expresses how compatible the observed data are with the null hypothesis. It is not the probability that the null hypothesis is true, nor the probability that the observed difference is due to chance.
Example: In a clinical trial, 50 patients were randomly assigned to receive intravenous nitrate and 45 were randomly assigned to the control group. At the end of follow-up, three of the 50 patients given intravenous nitrate had died versus eight of the 45 in the control group. The reported odds ratio (OR) is 0.33, indicating a 67% reduction in the odds of death with intravenous nitrate compared with control. The p-value is 0.08.
A p-value of 0.08 means that, if intravenous nitrate truly had no effect on mortality (OR = 1), there would be an 8% probability of observing an odds ratio at least as far from 1 as 0.33 through sampling variability alone. In other words, if the null hypothesis were true, about 8 out of 100 such trials would show a difference this large or larger just by chance.
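To make the phrase “if the null hypothesis were true” concrete, here is a minimal sketch that computes an exact p-value directly from the counts in the example. It uses Fisher's exact test, which is my choice for illustration and not necessarily the method behind the reported figures; the function name `fisher_exact` is hypothetical. The crude odds ratio from the raw counts is 111/376 (about 0.30), and the exact p-value need not reproduce the reported 0.33 and 0.08, which presumably came from a different estimation method.

```python
from math import comb


def fisher_exact(deaths_trt, n_trt, deaths_ctl, n_ctl):
    """Crude odds ratio and two-sided Fisher exact p-value for a 2x2 table."""
    total_deaths = deaths_trt + deaths_ctl
    n_total = n_trt + n_ctl

    def prob(k):
        # Probability that exactly k of the deaths fall in the treated group
        # when death is unrelated to treatment (the null hypothesis),
        # with the group sizes and total number of deaths held fixed.
        return (comb(n_trt, k) * comb(n_ctl, total_deaths - k)
                / comb(n_total, total_deaths))

    k_min = max(0, total_deaths - n_ctl)
    k_max = min(total_deaths, n_trt)
    p_observed = prob(deaths_trt)

    # Two-sided p-value: add up every possible split of the deaths that is
    # no more likely under the null than the split actually observed.
    p_value = sum(
        prob(k) for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)  # small tolerance for rounding
    )

    odds_ratio = (deaths_trt * (n_ctl - deaths_ctl)) / (
        (n_trt - deaths_trt) * deaths_ctl
    )
    return odds_ratio, p_value


# Counts from the example: 3 of 50 died on nitrate, 8 of 45 died on control.
odds_ratio, p_value = fisher_exact(3, 50, 8, 45)
print(f"crude OR = {odds_ratio:.2f}, exact two-sided p = {p_value:.3f}")
```

Every term in the sum is a probability computed under the null hypothesis, which is exactly why a p-value cannot be read as the probability that the null hypothesis itself is true.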
3 must-know things about the p-value
- Typically, p-value <0.05 is used as a decision point. It means that, if the null hypothesis (OR = 1) were true, a difference between the observed OR and the null value at least this large would arise from sampling variability alone less than 5% of the time. Such a finding is called “statistically significant.” A p-value >0.05, on the other hand, is called “not statistically significant.”
- Using this arbitrary cut-off to interpret the findings of epidemiological research is now strongly discouraged in recent literature and textbooks. Significance testing against a fixed cut-off was originally developed for purposes such as industrial quality control, where one has to make a “go” or “no go” decision. In health research we rarely face such clear-cut decisions. Rather, we conduct epidemiological research to identify the risk factors, exposures, and lifestyles that adversely affect human health.
- Reporting the exact p-value along with the 95% confidence interval in a research paper allows readers to make a more educated judgment about how well chance alone could explain the observed value, leading to a more informed conclusion about the research findings.
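As an illustration of reporting an exact p-value together with a 95% confidence interval, the sketch below applies the usual large-sample (Wald, log odds ratio) method to the counts from the nitrate example. The method and the function name `odds_ratio_ci` are my assumptions for illustration; the trial's authors may have used a different approach, so the output need not match the reported OR of 0.33 or p-value of 0.08.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(deaths_trt, n_trt, deaths_ctl, n_ctl, level=0.95):
    """Crude odds ratio with a Wald confidence interval and two-sided p-value."""
    a, b = deaths_trt, n_trt - deaths_trt   # treated: died, survived
    c, d = deaths_ctl, n_ctl - deaths_ctl   # control: died, survived

    or_hat = (a * d) / (b * c)
    # Standard error of log(OR); valid only when no cell count is zero.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    lower = exp(log(or_hat) - z_crit * se_log_or)
    upper = exp(log(or_hat) + z_crit * se_log_or)

    # Two-sided p-value for the null hypothesis OR = 1 (log OR = 0).
    z_obs = abs(log(or_hat)) / se_log_or
    p_value = 2 * (1 - std_normal.cdf(z_obs))
    return or_hat, lower, upper, p_value


or_hat, lower, upper, p_wald = odds_ratio_ci(3, 50, 8, 45)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}, p = {p_wald:.3f}")
```

For these counts the interval runs from well below 1 to slightly above 1. A reader therefore sees both that a large protective effect is compatible with the data and that no effect cannot be ruled out, which a bare “not significant” label would hide.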
References
- Rothman KJ, Greenland S. Modern Epidemiology. 2nd ed. Philadelphia: Lippincott Williams and Wilkins; 1998.
- Whitley E, Ball J. Statistics review 3: Hypothesis testing and P values. Critical Care. June 2002; Vol 6, No 3.