Evidence-based fitness research refers to the systematic, scientific investigation of fitness-related questions. The “evidence-based” model is the standard for ranking the quality of findings provided by medical studies. Level 1 evidence is considered strong and compelling. Level 5 is very weak. The level is determined not by any particular finding but by the strength of a study’s design, implementation, and reporting. Unlike people, not all studies are created equal. Some conclusions can be discounted entirely based on poor study design. Overall, the veracity and value of a study’s conclusion are determined by the level of evidence provided and its relevance for making actual decisions.
“Evidence-based medicine is the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients.”
From Sackett DL, Rosenberg WMC, Gray JAM, Haynes RB, Richardson WS. 1996. Evidence based medicine: what it is and what it isn’t. BMJ 312: 71–2 
The term “evidence-based medicine” came to wide attention in 1992, when a working group led by Gordon Guyatt used it in an article published in JAMA. While the article focused on a “new approach to teaching the practice of medicine”, it represents the evolution of principles dating back to ancient Greece and the very beginnings of modern medicine. Put simply, it means that decisions should be based on convincing data. In its current form it is widely accepted as the gold standard for evaluating the safety and effectiveness of any proposed treatment. Because evidence-based practice uses the best available information to inform decisions, it depends on ranking the evidence that research produces. Randomized controlled trials (RCTs) and systematic reviews of RCTs provide the highest quality evidence, while untested opinions provide virtually none.
How are levels of evidence determined?
The level of evidence is ranked 1 to 5 depending on the source. Here is the pyramidal breakdown, from strongest to weakest:
- Level 1: RCTs and systematic reviews of RCTs
- Level 2: Cohort and outcome studies
- Level 3: Case-control studies
- Level 4: Case series
- Level 5: Expert opinion
The most compelling data is derived from randomized, placebo-controlled, prospective trials with adequate power. Randomized means that study subjects are divided into separate groups by a chance mechanism, as opposed to selection based on preexisting characteristics. This reduces selection bias, in which groups start out with key differences and therefore respond differently to treatment.
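To make the "chance mechanism" concrete, here is a minimal sketch of simple balanced randomization. The function name `randomize`, the arm labels, and the participant IDs are hypothetical, chosen only for illustration. Real trials typically use more elaborate schemes such as blocked or stratified allocation.

```python
import random

def randomize(participants, arms=("treatment", "placebo"), seed=None):
    """Assign participants to study arms by chance, keeping arm sizes balanced.

    Illustrative sketch only: the arm names and the seed parameter are
    assumptions, not part of any particular trial protocol.
    """
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)          # chance, not the researcher, sets the order
    groups = {arm: [] for arm in arms}
    for i, person in enumerate(shuffled):
        # Deal the shuffled participants out to the arms in turn.
        groups[arms[i % len(arms)]].append(person)
    return groups

# Hypothetical participant IDs P01..P20
volunteers = [f"P{n:02d}" for n in range(1, 21)]
groups = randomize(volunteers, seed=42)
```

Because the order is shuffled before anyone is assigned, no preexisting characteristic (age, training history, motivation) can systematically steer a person into one arm or the other.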
Placebo-controlled means that one group receives an inactive stand-in (a placebo) instead of the treatment under investigation. Comparing against a placebo group lets us determine whether effects can be explained by expectation, chance, or natural progression rather than by the treatment itself. Prospective means that a study tracks subjects moving forward in time.
Prospective designs facilitate accurate data collection and the identification of confounding factors, both of which can be difficult when relying on old records. Adequately powered means that the required number of subjects is calculated in advance, so that a real effect is unlikely to be missed simply because the sample was too small.
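As a rough illustration of what a power calculation involves, the sketch below estimates how many subjects each group needs when comparing two group means. It uses the standard normal-approximation formula n = 2 * ((z_alpha + z_power) / d)^2, where d is the standardized effect size (Cohen's d). The function name and the default values for `alpha` and `power` are assumptions made for this example. They are common conventions, not requirements.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per group for a two-sided, two-group
    comparison of means (normal approximation).

    effect_size: expected standardized difference between groups (Cohen's d)
    alpha:       accepted false-positive rate (assumed default 0.05)
    power:       desired chance of detecting a real effect (assumed 0.80)
    """
    z = NormalDist().inv_cdf       # quantile function of the standard normal
    z_alpha = z(1 - alpha / 2)     # critical value for the two-sided test
    z_power = z(power)             # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# A "medium" effect (d = 0.5) at the assumed defaults
n_medium = sample_size_per_group(0.5)
# A "small" effect (d = 0.2) needs far more subjects
n_small = sample_size_per_group(0.2)
```

With these defaults, a medium effect calls for roughly 63 subjects per group and a small effect for several hundred. Exact calculations based on the t-distribution give slightly larger numbers. This is one reason small studies of subtle training effects so often fail to settle a question.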
The highest quality evidence is derived by combining multiple RCTs that were designed to answer the same question. A systematic review identifies and appraises all relevant trials, and a meta-analysis statistically pools their results. The analysis of combined data can be much stronger than any single study.
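One common way to combine results is inverse-variance weighting under a fixed-effect model, in which more precise studies count for more. The sketch below is a simplified illustration, not a full meta-analysis. The effect sizes and standard errors are made-up numbers, and the function name is hypothetical. Real reviews must also deal with differences between studies (heterogeneity), which this example ignores.

```python
from math import sqrt

def pool_fixed_effect(effects, std_errors):
    """Pool study effects using inverse-variance (fixed-effect) weights.

    effects:    effect estimate from each study
    std_errors: standard error of each estimate
    Returns the pooled effect and its standard error.
    """
    # Precise studies (small standard error) receive large weights.
    weights = [1 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    pooled_se = sqrt(1 / total)
    return pooled, pooled_se

# Made-up results from three hypothetical trials of the same intervention
effects = [0.30, 0.50, 0.40]
std_errors = [0.20, 0.25, 0.10]
pooled, pooled_se = pool_fixed_effect(effects, std_errors)
```

The pooled standard error comes out smaller than that of any individual study, which is the sense in which combined data can be stronger than a single trial.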
The above is a very brief overview of the complexities involved in assessing the quality of a particular study or body of evidence. To learn more, consult the Cochrane Collaboration, an invaluable resource.
Using evidence-based fitness research
It would obviously be ideal if personal fitness decisions could always be based on Level 1 evidence, but we don’t have this luxury. Many important questions lack definitive answers due to a scarcity of well-designed studies. RCTs take a long time and cost a lot of money. Questions that are important to people wanting to improve their personal fitness may receive less attention than potential blockbuster drugs or biomedical devices. Where is the financial gain in answering a question like “How many sets and reps are optimal for muscle hypertrophy at age 40?” Even questions with commercial implications often go unanswered, like “Am I risking death by rhabdomyolysis if I do CrossFit?” To answer many such questions we can go with our gut, read opinions from message boards, listen to the “experts,” or conduct experiments on ourselves, but we can’t always rely on the best quality evidence.
Quality research is being done. It is often sensationalized, buried, or ignored by large media outlets, but important and under-recognized research is available if we look for it. Our job is to look beyond headlines that were written to attract attention rather than to analyze findings critically. Armed with a working understanding of how to critique research findings, we can scrutinize advice and judge fitness claims for ourselves. Whether the advice comes from a friend at the gym, a best-selling book, or unquestioned tradition, we should know the strength of the supporting evidence before we accept it, reject it, or withhold judgment.