Anil Potti, Ranjit Chandra, and Reducing Scientific Fraud
Wednesday, September 7th, 2011
An account of the genomics scandal at Duke University has appeared in Significance (a journal sponsored by British and American statistical societies). The scandal caused the end of a clinical trial (it had been based on fraudulent data) and the resignation of assistant professor Anil Potti, who had among other things falsified his resume.
It reminded me of the Ranjit Chandra case. Similarities:
1. The published results could not be reconstructed from data. In Chandra’s case, some of the results were statistically impossible. In the Potti case, two statisticians were unable to go from raw data they were given to the published results.
2. Outsiders important. Saul Sternberg and I, who are psychology professors, not nutrition professors, wrote an article that drew attention to what Chandra had done and caused retraction of one of his papers. As far as I could tell, at least a few nutrition professors had believed for many years that Chandra made up data. In Potti’s case, the deception was revealed by two statisticians. Perhaps Chandra and Potti both believed (a) hardly anyone will notice and (b) if anyone notices, they won’t do anything.
3. Incidental fabrication. In one paper, Chandra said that everyone asked to be in the study agreed to participate. The study involved having blood drawn many times. Potti claimed to be something similar to a Rhodes Scholar.
4. Found innocent. Years before Sternberg and I got involved, Chandra had been accused by his research assistant, a nurse. A Memorial University committee found him innocent of her accusations (at least, her accusations were not upheld). Chandra then sued the nurse. In the Potti case, a Duke University committee looked into the case and found no serious wrongdoing. A clinical trial based on the Potti results, which had been stopped, was resumed.
Factor 2 (outsiders important) is no surprise to readers of this blog, although the new account doesn’t mention it. But Factors 1 (reconstruction impossible) and 3 (incidental fabrication) mean that the fabrication should have been relatively easy to confirm. Yet Factor 4 seems to suggest it was hard to confirm. Factor 4 — in spite of Factors 1 and 3 — implies there is something mysterious and important going on here, more mysterious and interesting than someone lying. But I cannot say what.
The Significance article, which is by Darrel Ince, a professor of computing at the Open University, includes several suggestions for improving the system. I fail to see why they will help, and they have significant costs. One of them is to put the original data and software in an independent repository. I think this would make things worse. People would continue to fake research; they would now also fake raw data, in addition to the graphs and tables needed for publication. In the past, thinking they wouldn’t be caught, fakers would either (a) not make up the raw data (Chandra) or (b) do so carelessly (Potti). Their overconfidence was key to catching them.
My suggestion along these lines is a requirement that researchers make available upon request the raw data and any original software. They store it themselves, in other words. If they fail to fulfill outside requests for these materials within one month, this will be grounds for immediate retraction of the paper. Without something like this, a store-it-yourself requirement means little. I once requested the raw data for a paper that had appeared in a journal that had a make-data-available policy. The authors refused my request. The editor did nothing. As A. W. Montford makes clear in The Hockey Stick Illusion, we would all be better off if Michael Mann and other authors had simply handed over the raw data behind their “hockey stick” temperature graphs when requested rather than fight a long string of FOIA battles (and mull over what emails to delete).








