The first results from a major project to measure the reliability of cancer research have highlighted a big problem: Labs trying to repeat published experiments often can't.
That's not to say that the original studies are wrong. But the results of a review published Thursday in the open-access journal eLife are a sobering reminder that science often fails at one of its most basic requirements: an experiment in one lab ought to be reproducible in another.
And the fact that experiments often aren't reproducible could have big health implications. Many exciting ideas in cancer research never pan out. One reason is that findings from the initial studies don't stand the test of time.
Brian Nosek, who is with the Center for Open Science, is also a psychology professor at the University of Virginia. A few years ago, he organized a similar effort to examine research in his field. His results garnered worldwide attention when two-thirds of the original psychology findings examined couldn't be reproduced.
Nosek decided to explore the work from cancer biology labs after two high-profile studies, from drugmakers Bayer and Amgen, reported dismal results when they tried to reproduce some cancer papers. Only 25 percent of the papers Bayer examined were reproduced. Amgen was able to replicate only six out of the 53 studies it examined.
"Those were earthshaking reports, in the sense that the community responded very strongly to these reports of challenges to reproduce some of these core findings," Nosek says.
But scientists at Bayer and Amgen wouldn't say which experiments they examined, so their work raised many questions but left no way for scientists to answer them.
"The reproducibility project in cancer biology was an attempt to advance that discussion with an open project," Nosek says.
This project is transparent about how it picked the studies to reproduce. It also published methods and study plans in advance. In collaboration with a California company called Science Exchange, the reviewers got grants to replicate key experiments from as many as 50 high-profile studies. (They will very likely run out of money before they're able to complete that work, however.)
They've now published the results of their first five attempts, in eLife.
"Three of the five show very, very striking differences from the original," says Timothy Errington, a biologist at the Center for Open Science and a collaborator on the project. As for findings from the other two studies, he says, "I think you'll get a lot of opinions about whether they replicate or not."
Errington says he was quite surprised by the results.
In one case, the original scientists went the extra mile to help the labs doing the follow-up studies reduce potential sources of error. "The lab gave us the same drug. This is wonderful. Because that could have been a sticking point," Errington says. "They gave us the same tumor cells that they used."
Yet the replicating lab didn't end up with the same results.
Scientists have had so much confidence in two of the original studies that drug companies already have sunk millions of dollars into efforts to try the concepts out in people. But the follow-up experiments for one of those didn't validate the original results.
The inevitable question is whether the original science was wrong, or whether the scientists who tried to repeat that work somehow got tripped up.
The review project farmed out its actual laboratory work to commercial labs that perform experiments for the pharmaceutical industry, or to university "core facilities," such as centralized labs that do a lot of research on mice. Those labs generally work to standards required by the Food and Drug Administration.
But research with living systems is never simple, so there are many possible sources of variation in any experiment, ranging from the animals and cells to the details of lab technique.
And there isn't even clear agreement about when a study's findings can be considered to have been reproduced.
Sean Morrison, an editor at eLife and a Howard Hughes Medical Institute investigator at the University of Texas Southwestern Medical Center, says that by his count, two studies' findings were substantially reproduced. The findings of one other were not, he says, and two others have results that simply can't be interpreted.
"One of the difficulties of the reproducibility project is they have limited time and resources to spend on any one study," Morrison says. "As a result, they can't go back and do these things over and over again when the first results turn out to be uninterpretable."
Errington agrees that the reproducibility project leaves that big question hanging — but the scientists don't plan to answer it.
"As exciting as that is, and as important as that is — and we hope someone else will follow up on it — we're more curious about, 'What does that look like when we do this across many, many, many studies.' "
But Dr. Erkki Ruoslahti, at the nonprofit Sanford Burnham Prebys Medical Discovery Institute in La Jolla, Calif., is worried that the reproducibility project could do real damage. The reviewers couldn't reproduce his original study but didn't follow up to understand why.
"I am really worried about what this will do to our ability to raise funding for our clinical development," he writes in an email to Shots. "If we, and the many laboratories who have reproduced our results, are right and the reproducibility study is wrong — which I think is the case — they will not be doing a favor to cancer patients."
Dr. Irving Weissman, a professor of pathology and developmental biology at Stanford University, is also disappointed in how the reproducibility project handled his experiment. His paper reported finding a protein that's present on all human cancer cells — a finding that Weissman says has been replicated many times in other labs.
The reproducibility project chose to repeat a peripheral part of Weissman's paper — an experiment involving mice, not human tissues. And, Weissman says, the replicating lab stumbled over an early step in the experiment, but plowed ahead anyway.
Weissman says he offered to bring scientists into his lab to train them in the technique, but the reproducibility project did not take him up on the offer. (That would undercut one of its goals, which is to see whether scientists working independently can verify published results.)
It's important to replicate important studies, Weissman tells Shots, "but you can't do it halfheartedly. You have to be serious about it."
Errington and Nosek hope people who hear about the project's findings don't jump to conclusions about why individual replications produced different results. They're trying to look at the big picture across dozens of studies, the two scientists say, and they don't place too much confidence in any single result.
The reproducibility project is looking for patterns across cancer research and also trying to identify common reasons that labs might have trouble reproducing one another's work. Are the directions offered in the methods section of a paper too sketchy? Or maybe experiments frequently work only under unusual conditions.
Morrison, who is involved as a journal editor rather than a participant, says the entire reproducibility project is itself one big experiment.
"I think it's too early for us to know whether this approach is the right approach or the best approach for testing the reproducibility of cancer biology," he says. "But it will be a data point, and it will start the conversation."
The conversation is important because the vast majority of treatment ideas that come from the lab fail when they're tried in people. Cathy Tralau-Stewart, a pharmacologist at the University of California, San Francisco, says scientists often don't know why those clinical failures occur, "and so that's why I think studies like this are really, really important."
Unfortunately, Nosek says, there are few incentives today for scientists to repeat experiments from other labs. The rewards are for publishing new ideas, not the less glamorous, but still critical, work of verifying somebody else's findings.
"If we're going to take reproducibility seriously," Nosek says, experiments that attempt to reproduce the findings of others "need to be a valued part of scientific contribution."
DAVID GREENE, HOST:
There are a lot of exciting ideas in cancer research out there, but they never pan out. And one reason is initial studies don't hold up to scrutiny. Well, a research group is trying to figure out how big a problem this is. And the first results suggest it's just not easy for scientists to reproduce the work of others. Here's NPR's Richard Harris.
RICHARD HARRIS, BYLINE: A few years ago, Brian Nosek stunned his field of psychology when he and his colleagues determined that many of those experiments couldn't be repeated successfully. Next, he turned his attention to laboratory cancer research.
BRIAN NOSEK: Reproducibility is a central feature of how science is supposed to be, and it's not clear to what extent it is happening in practice.
HARRIS: In 2011, scientists at the companies Bayer and Amgen each tried to redo dozens of promising cancer experiments to see if they could get the same results. The vast majority of those attempts failed.
NOSEK: The community responded very strongly to these reports of challenges to reproduce some of these core findings.
HARRIS: But Bayer and Amgen wouldn't say which experiments they examined. So their work raised questions but left no way for scientists to follow up.
NOSEK: And so the reproducibility project in cancer biology was an attempt to sort of advance that discussion with an open project.
HARRIS: Nosek, at the Center for Open Science and the University of Virginia, made sure this project was transparent about how it picked the studies, transparent about their methods and their study plans. They got grants to replicate key experiments from up to 50 high-profile studies in collaboration with a company called Science Exchange. They are now publishing the results of their first five attempts in the journal eLife.
Timothy Errington is a biologist at the Center for Open Science.
TIMOTHY ERRINGTON: Three of the five show very, very striking differences from the original. And then there's two of them that, I think, you'll get a lot of different opinions on whether they, quote, unquote, "replicate" it or not.
HARRIS: Did these initial results surprise you?
ERRINGTON: Yeah. Oh, yeah.
HARRIS: Some of the original labs cooperated while others didn't. In one case, the original scientists went the extra mile to help the follow-up labs reduce potential sources of error.
ERRINGTON: The lab gave us the same drug. This is wonderful - right? - 'cause, like, that could have been a sticking point. Oh, wow, they gave us the same tumor cells that they used.
HARRIS: But the replicating lab didn't end up with the same results. The inevitable question then is whether the original science was wrong or whether the scientists who tried to repeat that work tripped up. In two cases, the original scientists are so confident in their findings that they've been able to draw millions of dollars in investment to start developing new drugs. Sean Morrison, an editor at eLife and a prominent cancer biologist, notes that it can take months or years to perfect the laboratory techniques used in any given experiment.
SEAN MORRISON: And one of the difficulties of the reproducibility project is that they have limited time and resources to spend on any one study. And as a result, they can't go back and do these things over and over again when the first attempts turn out to be uninterpretable.
HARRIS: Errington agrees that their results leave that very important question hanging.
ERRINGTON: And as exciting as that is and important as that is - and hopefully, somebody does follow up on it - we're a bit more curious on - well, what does that look like when we do it across many, many, many studies?
HARRIS: They're looking for patterns across cancer research and also trying to identify common reasons that labs might have trouble reproducing one another's work. Are the directions offered in a paper too sketchy? Or maybe an experiment only works under certain unusual conditions. Morrison at eLife says the entire reproducibility project is itself one big experiment.
MORRISON: I think it's too early for us to know whether this approach is the right approach or the best approach for testing the reproducibility of cancer biology. But it'll be a data point, and it'll start the conversation.
HARRIS: The conversation is important because the vast majority of treatment ideas that come from this basic science fail when they're tried in people. Cathy Tralau-Stewart at UC San Francisco says scientists often don't know why those failures occur.
CATHY TRALAU-STEWART: And so that's why I think these sort of studies are really, really important.
HARRIS: The point isn't who's right and wrong but all about the why.
Richard Harris, NPR News.
(SOUNDBITE OF RUSPO SONG, "FILOMENA")
Transcript provided by NPR, Copyright NPR.