You have heard this before: therapies that seemed so promising in animal research flopped when tried in humans. Sure, animal models are limited, and there are distinct biological differences between species. But that’s not the whole story. Flawed trial design and investigator bias are part of the problem too.
That’s the conclusion of a paper from a Stanford team led by John P. A. Ioannidis. They looked specifically at research on neurological disorders, including Parkinson’s disease, stroke, multiple sclerosis (MS) and spinal cord injury.
Results too good to be true? Maybe so. Here’s how the authors put it:
We use a statistical technique to evaluate whether the number of published animal studies with “positive” (statistically significant) results is too large to be true. We assess 4,445 animal studies for 160 candidate treatments of neurological disorders, and observe that 1,719 of them have a “positive” result, whereas only 919 studies would a priori be expected to have such a result. According to our methodology, only eight of the 160 evaluated treatments should have been subsequently tested in humans.
... Overall, there are too many animal studies with statistically significant results in the literature of neurological disorders. This observation suggests strong biases, with selective analysis and outcome reporting biases being plausible explanations, and provides novel evidence on how these biases might influence the whole research domain of neurological animal literature.
The paper is titled “Evaluation of Excess Significance Bias in Animal Studies of Neurological Diseases.” The findings rely on a sort of mega-meta-analysis of 160 previously published meta-analyses drawn from 4,445 animal studies. Statistically speaking, 919 of the studies could be expected to show positive results; the meta-analysis found 1,719, nearly twice as many, that claimed to be positive. Says the paper, “The literature of animal studies on neurological disorders is probably subject to considerable bias.”
Meta-analysis is a common way to mine for consistencies in “effect size” across projects that do not share a common design or methodology. An effect size might be something like “animals were able to walk X percent better on treatment than on placebo.” Meta-analysis is a very math-heavy data crunch, using statistical models to draw a single conclusion from many variations. In this case, it is also used to quantify investigator bias.
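To make that concrete, here is a minimal sketch of the simplest flavor, a fixed-effect meta-analysis that pools one effect size per study, weighted by each study’s precision. The numbers and variable names are mine and purely illustrative, not data from the paper:

```python
import numpy as np

# Toy data: one effect size and its variance per study (illustrative only).
effects = np.array([0.30, 0.55, 0.10, 0.42])    # e.g., improvement vs. placebo
variances = np.array([0.04, 0.09, 0.02, 0.06])

weights = 1.0 / variances                       # inverse-variance weighting
summary = np.sum(weights * effects) / np.sum(weights)  # one pooled conclusion
se = np.sqrt(1.0 / np.sum(weights))             # standard error of the pooled effect
print(f"pooled effect = {summary:.2f}, 95% CI ± {1.96 * se:.2f}")
```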
Here’s the playbook from the new paper; I admit, I am going to have to take their word for it.
We also tested for between-study heterogeneity estimating the p-value of the χ²-based Cochran Q test, and the I² metric of inconsistency. Q is obtained by the weighted sum of the squared differences of the observed effect in each study minus the fixed summary effect. I² ranges from 0% to 100% and describes the percentage of variation across studies that is attributed to heterogeneity rather than chance.
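In code, the two quantities in that quote look roughly like this; again my own sketch, reusing the toy arrays from the snippet above (the paper reports these metrics but not its code):

```python
import numpy as np

def cochran_q_and_i2(effects, variances):
    """Cochran's Q statistic and the I² inconsistency metric."""
    weights = 1.0 / variances
    summary = np.sum(weights * effects) / np.sum(weights)  # fixed summary effect
    q = np.sum(weights * (effects - summary) ** 2)         # weighted squared differences
    df = len(effects) - 1                                  # degrees of freedom
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # % variation beyond chance
    return q, i2

q, i2 = cochran_q_and_i2(np.array([0.30, 0.55, 0.10, 0.42]),
                         np.array([0.04, 0.09, 0.02, 0.06]))
print(f"Q = {q:.2f}, I² = {i2:.0f}%")
```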
To the point: bias is not good. Bias in an animal experiment can result in inert or even harmful substances moving on to clinical trials, exposing patients to unnecessary risk and wasting scarce research money. How do you spot bias? The Ioannidis group used what they call the “excess significance test”:
This examines whether too many individual studies in a meta-analysis report statistically significant results compared with what would be expected under reasonable assumptions about the plausible effect size. The excess significance test has low power to detect bias in single meta-analyses with limited number of studies, but a major advantage is its applicability to many meta-analyses across a given field.
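A back-of-the-envelope version of that logic, using only the headline numbers above: if the studies’ average statistical power implies roughly 919 “positive” results among 4,445 studies, how surprising is 1,719? (This simplification is mine; the paper’s actual test estimates the expected count study by study from plausible effect sizes.)

```python
from scipy.stats import binomtest

n_studies = 4445   # animal studies covered by the 160 meta-analyses
observed = 1719    # studies reporting a "positive" (significant) result
expected = 919     # positive results expected a priori, per the paper

# Treat each study as a coin flip with success probability expected/n_studies
# and ask whether 1,719 successes could plausibly be luck.
result = binomtest(observed, n_studies, expected / n_studies, alternative="greater")
print(f"expected ≈ {expected}, observed {observed}, p = {result.pvalue:.3g}")
# The p-value is vanishingly small: far too many positives to be chance alone.
```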
When you think of bias you generally think conflict of interest – your research needs to show a positive result because your sponsor, perhaps a drug company, wants to make news and sell product. But bias is often less obvious. Bias can occur when a scientist chooses to look at data in a way that offers a better result. A better result means publication in a higher-profile journal. And yes, the journals play a big role in the bias picture too. They reward positive results.
So, what to do? Animal studies, per se, remain valuable. “Some researchers have postulated that animals may not be good models for human diseases,” said Ioannidis. “I don’t agree. I think animal studies can be useful and perfectly fine. The problem is more likely to be related to the selective availability of information about the studies conducted on animals.”
The Tyranny of the Impact Factor
In a blog article that accompanied the Ioannidis paper, Roli Roberts, an editor at PLOS Biology, suggests part of the remedy for bias is better animal study design and analysis, including more consistent use of the ARRIVE guidelines (Animal Research: Reporting of In Vivo Experiments).
A description of ARRIVE from PLOS Biology:
The ARRIVE guidelines consist of a checklist of 20 items describing the minimum information that all scientific publications reporting research using animals should include, such as the number and specific characteristics of animals used (including species, strain, sex, and genetic background); details of housing and husbandry; and the experimental, statistical, and analytical methods (including details of methods used to reduce bias such as randomization and blinding). All the items in the checklist have been included to promote high-quality, comprehensive reporting to allow an accurate critical review of what was done and what was found.
Roberts also thinks it would be a good idea if the research community shared more negative results. “Animal studies (like human clinical trials) should be pre-registered so that publication of the outcome, however negative, is ensured.”
There are few incentives to publish negative results, Roberts concedes. “Institutions and funding bodies need to release themselves from the tyranny of the impact factor and view positive and negative results as equally valid contributions to the literature.... Authors need to recognize that negative studies can contribute substantially to scientific knowledge, both via meta-analyses and by more informal means, and it is their duty to ensure that failure to submit these studies doesn’t bias the literature.”
As for John Ioannidis, this is not the first time he has forced the biological science community to take a good hard look at its output. In 2005 he published a widely cited paper, “Why Most Published Research Findings Are False.”
No punches pulled. From that paper:
The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true. Conflicts of interest and prejudice may increase bias. Conflicts of interest are very common in biomedical research and typically they are inadequately and sparsely reported. Prejudice may not necessarily have financial roots. Scientists in a given field may be prejudiced purely because of their belief in a scientific theory or commitment to their own findings. Many otherwise seemingly independent, university-based studies may be conducted for no other reason than to give physicians and researchers qualifications for promotion or tenure.
Ioannidis got bashed by parts of the research community. But he’s a scientist, too, and wants only to make the process more pure. His response to criticism:
Scientific investigation is the noblest pursuit. I think we can improve the respect of the public for researchers by showing how difficult success is. Confidence in the research enterprise is probably undermined primarily when we claim that discoveries are more certain than they really are, and then the public, scientists, and patients suffer the painful refutations.
For more on Ioannidis, the November 2011 Atlantic ran a very compelling profile.