Science Sleuths: The Science That Shapes Diagnostic Tests: What Does ‘Statistically Significant’ Actually Mean?
You’ve most likely heard or read the term “statistically significant” numerous times in your life. What does that actually mean and how do we determine if something is significant or not?
In its most basic form, a statistically significant result is one that is unlikely to be due to random variability (chance) alone.
If we want to get technical, statistical significance is all about testing the null hypothesis. The null hypothesis states that there is no real difference between the specified populations and that any observed difference is due to sampling or experimental error. Hypothesis testing produces a result known as the p-value, which is the probability of observing results at least as extreme as those in the data you have collected if the null hypothesis were true. A p-value of 5% or lower is typically considered to be statistically significant.
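To make the idea of a p-value concrete, here is a minimal sketch of one simple kind of hypothesis test, a permutation test. It is only an illustration: the weight-gain numbers, the group names and the function name permutation_p_value are made up for this example and do not come from any real study. The test asks how often randomly reshuffling the group labels produces a difference at least as large as the one observed.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Under the null hypothesis the group labels are interchangeable,
        # so shuffling them shows what chance alone can produce.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

# Hypothetical weight gains (kg) for two small groups of horses.
standard_diet = [10.1, 9.8, 11.2, 10.5, 9.9, 10.7, 10.3, 10.0]
new_diet = [11.9, 12.4, 11.1, 12.8, 12.0, 11.6, 12.3, 12.5]

p = permutation_p_value(standard_diet, new_diet)
print(f"Estimated p-value: {p:.4f}")
print("Statistically significant at 5%" if p <= 0.05 else "Not significant at 5%")
```

Researchers usually rely on established tests (t-tests, chi-square tests and so on) from statistical software rather than hand-written code, but the logic is the same: compare what was observed with what chance alone would produce.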
What does this mean for the veterinary and horse communities?
Comparing groups (e.g., new diet vs. standard diet, vaccine vs. no vaccine) lets us judge whether an observed outcome is likely related to what we are studying rather than just randomly happening. This helps us determine whether something is actually working better than leaving things alone. Nutritionists do this all the time when testing new rations; pharmaceutical companies do this when testing new drugs or vaccines. Veterinarians, and more likely research scientists, may use this to determine if a new type of surgery or expensive treatment is worthwhile.
How does it work?
While knowing how to perform these tests is important for researchers, from a practical standpoint remember two important factors: sampling error and probability. There is always the possibility that differences you see when measuring a sample are just the result of random variability (“background noise”) or just dumb luck. This is sampling error. Probability is just that: the likelihood of something actually happening. The higher the probability of a specific event or outcome, the more likely it is to happen. However, remember that even a high probability is not a guarantee.
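Sampling error is easy to see in a small simulation. The sketch below is purely illustrative: it assumes a made-up population of horses with an average body weight of 500 kg and a standard deviation of 50 kg, then draws five separate samples of 10 horses each. Every sample comes from the same population, yet the sample averages differ from one another.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

def sample_mean(n, population_mean=500.0, population_sd=50.0):
    """Average of n values drawn from one hypothetical population."""
    return mean([rng.gauss(population_mean, population_sd) for _ in range(n)])

# Five samples of 10 horses each, all from the SAME population.
sample_means = [sample_mean(10) for _ in range(5)]

for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i}: mean weight = {m:.1f} kg")

spread = max(sample_means) - min(sample_means)
print(f"Spread between samples due to chance alone: {spread:.1f} kg")
```

None of the differences between these samples reflects a real effect, which is why a difference between two groups has to be weighed against this background noise.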
The most commonly chosen threshold is a p-value of 5%, written as p < 0.05. A result that meets this threshold means that, if what we are testing truly had no effect, we would expect to see a difference at least this large by chance alone less than 5% of the time (i.e. less than a one in 20 chance). It does not mean that the results are 95% due to what we are testing, be that a new drug, vaccine, treatment or surgery. It means that chance alone would be an unlikely explanation for what we observed.
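The one in 20 figure can be checked with a simulation. The sketch below runs many imaginary experiments in which the treatment does nothing at all (both groups are drawn from the same population) and counts how often the result still comes out as p < 0.05. The setup is an assumption made for illustration: it uses a simple z-test with a known standard deviation, and the function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def two_sided_z_test_p(group_a, group_b, sigma):
    """p-value for a difference in means when the population SD is known."""
    standard_error = sigma * sqrt(1 / len(group_a) + 1 / len(group_b))
    z = (mean(group_a) - mean(group_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_experiments=2000, n_per_group=20, alpha=0.05, seed=1):
    """Share of no-effect experiments that still reach p < alpha."""
    rng = random.Random(seed)
    sigma = 1.0
    false_positives = 0
    for _ in range(n_experiments):
        # Both groups come from the same population: no real effect exists.
        a = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        if two_sided_z_test_p(a, b, sigma) < alpha:
            false_positives += 1
    return false_positives / n_experiments

rate = false_positive_rate()
print(f"Share of no-effect experiments with p < 0.05: {rate:.3f}")
```

The share should land close to 0.05, which is exactly what the 5% threshold promises: when nothing is going on, about one experiment in 20 will look significant anyway.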
The take home message
Once testing and analysis are complete, a low p-value indicates a statistically significant difference. However, that does not mean the difference will automatically be important or useful. For practical significance (i.e. whether the result is noteworthy), we need to determine if the difference is large enough to actually be meaningful. A relatively large difference would be useful and practical. A small difference might not be worth the effort or cost for only a small impact. This can cause issues with regard to the interpretation of results and what decisions to make based on the data. We will be discussing these issues and concerns in future stories in this publication, so watch for our future articles.
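The gap between statistical and practical significance can be shown with simple arithmetic. The sketch below compares two hypothetical studies, both invented for illustration: a very large study that finds a tiny difference and a small study that finds a large one. It assumes a simple z-test with a known standard deviation and reports Cohen's d (the difference divided by the standard deviation) as one common measure of effect size.

```python
from math import sqrt
from statistics import NormalDist

def summarize(mean_diff, sd, n_per_group):
    """Return (p-value, Cohen's d) for two equal-sized groups with known SD."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    cohens_d = mean_diff / sd  # effect size in standard-deviation units
    return p_value, cohens_d

# Hypothetical study 1: 10,000 animals per group, 0.1 kg difference, SD 2 kg.
p_small, d_small = summarize(mean_diff=0.1, sd=2.0, n_per_group=10_000)

# Hypothetical study 2: 10 animals per group, 2 kg difference, SD 2 kg.
p_large, d_large = summarize(mean_diff=2.0, sd=2.0, n_per_group=10)

print(f"Tiny effect, huge study:  p = {p_small:.4f}, effect size d = {d_small:.2f}")
print(f"Large effect, small study: p = {p_large:.4f}, effect size d = {d_large:.2f}")
```

Both hypothetical studies clear the p < 0.05 bar, but only the second describes a difference big enough that it might matter in the barn. The p-value answers "is it likely real?" while the effect size answers "is it big enough to care about?"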
Jackie Smith, MSc, PhD, MACE, Dipl AVES, is an epidemiologist based at the UK Veterinary Diagnostic Lab. Emma Adam, DVM, PhD, DACVIM, DACVS, based at UK’s Gluck Center and Veterinary Diagnostic Lab, is responsible for research and serves as veterinary industry liaison.