Twisted Skepticism

Scientists are fond of placing great value on what they call skepticism: Not taking things on faith. Science versus religion, is the point. In practice this means wondering about the evidence behind this or that statement, rather than believing it because an authority figure said it. A better term for this attitude would be: Value data.

A vast number of scientists have managed to convince themselves that skepticism means, or at least includes, the opposite of value data. They tell themselves that they are being “skeptical” — properly, of course — when they ignore data. They ignore it in all sorts of familiar ways. They claim “correlation does not equal causation” — and act as if the correlation is meaningless. They claim that “the plural of anecdote is not data” — apparently believing that observations not collected as part of a study are worthless. Those are the low-rent expressions of this attitude. The high-rent version is when a high-level commission delegated to decide some question ignores data that does not come from a placebo-controlled double-blind study, or something similar.

These methodological beliefs — that data above a certain threshold of rigor are valuable but data below that threshold are worthless — are based on no evidence; and the complexities and diversity of research imply it is highly unlikely that such a binary weighting is optimal. Human nature is hard to avoid, huh? Organized religions exist because they express certain aspects of human nature, including certain things we want (such as certainty); and scientists, being human, have a hard time not expressing the same desires in other ways. The scientists who condemn and ignore this or that bit of data desire a methodological certainty, a black-and-whiteness, a right-and-wrongness, that doesn’t exist.
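One way to make the point about binary weighting concrete is a small numerical sketch. It rests on assumptions that are not in the argument above: the four studies are hypothetical, their estimates are independent and unbiased, and their sampling variances are known. Under those assumptions, a "count it or ignore it" rule gives a noisier combined estimate than graded (inverse-variance) weights. If the less rigorous studies are biased, the comparison can change, so treat this as an illustration and not a proof.

```python
def pooled_variance(variances, weights):
    """Variance of a weighted average of independent, unbiased estimates."""
    total = sum(weights)
    return sum((w / total) ** 2 * v for w, v in zip(weights, variances))


# Hypothetical sampling variances for four studies of the same effect
# (smaller variance = more rigorous study). These numbers are made up.
variances = [1.0, 2.0, 4.0, 8.0]

# Hypothetical rigor cutoff: studies noisier than this are ignored entirely.
threshold = 2.0

# Binary weighting: full weight above the bar, zero weight below it.
binary_weights = [1.0 if v <= threshold else 0.0 for v in variances]

# Graded weighting: each study counts in proportion to its precision.
graded_weights = [1.0 / v for v in variances]

binary_var = pooled_variance(variances, binary_weights)
graded_var = pooled_variance(variances, graded_weights)

print(f"binary weighting variance: {binary_var:.3f}")
print(f"graded weighting variance: {graded_var:.3f}")
```

With these made-up numbers, the two weakest studies still tighten the combined estimate even though each one gets little weight.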

How to be wrong.

18 Responses to “Twisted Skepticism”

  1. LemmusLemmus Says:

    “The scientists (…) desire (…) a black-and-whiteness, a right-and-wrongness, that doesn’t exist.”

    Relatedly, it is remarkable that statistical associations are either “significant at conventional levels” or not, when the actual statistic is continuous.

  2. Andrew Gelman Says:

    Yes, but think about statistical power. Some studies just don’t have enough data to be relevant; they’re just noise machines.

  3. NE1 Says:

    Saying there is no evidence for the beliefs that some studies provide more trustworthy directional implications than others, and that anecdotes can be misleading, is just silly. Scientists seek out the most perfect studies so that they can update their Bayesian predictions most efficiently. All of these “methodological beliefs” come from a simple analysis of random sampling and the logic of science, of data.

    If you want to get down to brass tacks, start wagering. The people whose income depends on predicting the future know exactly how much to weight evidence towards a correlation or cause. And it’s not going to be as much as well-structured studies.

    We may be quick to seize upon a study’s failings because it’s often so easy to do it right. A simple calculation. A sentence of instruction to the subject. Once you know how much better it is to randomize faces in a lineup, would you ever congratulate a police department for neglecting this?

    Lemmus: the p = .05 is there as a courtesy to the reader, no? Not because there is a step-function of a conclusion’s relevancy.

  4. seth Says:

    “Some studies just don’t have enough data to be relevant.” What’s an example? I find it hard to believe there is a threshold for relevance: above the threshold, relevant, below the threshold, not relevant.

    NE1, I’m not saying there is no point being discriminating. Of course some studies are more meaningful than others. It is the black-and-whiteness of the judgments that is the problem. The way studies that don’t meet some arbitrary criterion are ignored. They really are ignored, completely discounted; I’m not making this up.

    Whether the p-must-be-.05-or-less convention is helpful or harmful is a hard question, at least for me. It gives scientists who know very little something to shoot for; but it ignores the reality that there is no important difference between p = .04 and p = .06. It distinguishes things that are the same, in other words. Should bikes be built with permanent training wheels?

  5. michael vassar Says:

    Seth, the problem is that “the scientific method” is not an epistemology but rather an evolved set of hacks optimized for, among other things, improving the quality of naive human epistemology. All of the forms of pseudo-skepticism you complain about involve rules that deviate from optimal epistemology, but by default humans also deviate from optimal epistemology. These are the best known rules that we have, not for reaching truth, but for indoctrinating into a crowd of people so that those people can do a better job of reaching truth than they would by default.

  6. LemmusLemmus Says:

    “Lemmus: the p = .05 is there as a courtesy to the reader, no? Not because there is a step-function of a conclusion’s relevancy.”

    Oh, in the stuff that I read (I’m a sociologist), it’s pretty much treated like a “step-function”. p = .06 equals “no association”.

  7. Seth’s blog » Blog Archive » Twisted Skepticism (continued) Says:

    […] Twisted Skepticism […]

  8. seth Says:

    Michael, yeah, I agree, a binary function is better than a flat function. That’s what I meant by “gives scientists who know very little something to shoot for.” A little scientific education is a dangerous thing.

  9. michael vassar Says:

    It’s definitely worse to build bikes with permanent training wheels than to teach people to ride bikes without training wheels, but better than to have a population of bikers who are always falling down. The question is, do we (collectively) know any fairly reliable way to teach people to use the “no training wheels” version of rationality instead of just “faking it” with the “scientific method”? If so, do we (personally) know how to get there from here, e.g. how to influence the culture in the relevant manner? Honestly, I think ‘no’ and ‘no’, but I think I’m closer to the second bit than to the first.

  10. seth Says:

    Were I to teach the basics of science, I would tell a bunch of stories chosen to show the value of a wide range of evidence, experimental, non-experimental, case reports, etc. I would say: we need all types, and explain why. A story-driven approach is a lot different than the current approach (e.g., evidence-based medicine) with its emphasis on rules (do this, do that) and value judgments (this is good, that is bad).

  11. Andrew Gelman Says:

    You question my claim, “Some studies just don’t have enough data to be relevant” and write, “What’s an example?”

    Follow the link that I put in my comment. It gives an example, in excruciating detail.

  12. seth Says:

    Andrew, thanks for clarifying that. Your paper argues that the Kanazawa study was too small to have a good chance of finding results with p < .05 given the likely size of the effect. Sure. I don't agree that this means its info is useless. You write: "A study of this size is not fruitful for estimating variation on this scale." That is too strong, I believe. I don't think it's true that nothing can be learned from the Kanazawa data. The Kanazawa data make some ideas more plausible, other ideas less plausible.

  13. Tony Says:

    Susan Haack was once taking questions after a talk, and was asked “What, essentially, do you think the scientific method is?” Her (prescriptive) response was: “Trying really hard to figure out the truth.”

    The problem I have with this is that many scientific advances are made when scientists almost dogmatically hold onto beliefs in the teeth of contrary evidence. Their perseverance leads to the uncovering of evidence that does support their position …

  14. Andrew Gelman Says:

    There are no “Kanazawa data.” Kanazawa analyzed existing public data sources. You could give these data to your psych undergrads too, and if they know SPSS, they might come up with some interesting things too. What made Kanazawa’s work break the attention barrier (so that people like you and me have heard of him) was that his findings were surprising. He got surprising findings by making statistical errors. If he’d gathered his own data, I’d respect his work more. The point of my paper is not about p

  15. Andrew Gelman Says:

    Hi, Seth. The rest of my comment got cut off because I used the “less than” sign, which got interpreted as html. But I think I made the basic point above. It’s fine to respect unorthodox research, but, at some point, the work is so crappy that it’s the equivalent of reading tea leaves, or throwing darts at a newspaper and using the words to write poetry. It might provoke interesting thoughts, but I don’t consider it science. It’s more like literature, or philosophical speculation, leavened with statistical errors and irrelevant data.

  16. Darius Bacon Says:

    I just came across a ref to an article by V.S. Ramachandran titled “Creativity versus skepticism within science: more harm has been done in science by those who make a fetish out of skepticism, aborting ideas before they are born, than by those who gullibly accept untested theories.” That reminded me of this discussion — perhaps you’ll find it of interest.

  17. Gustavo Lacerda Says:

    This link seems to have the full article: “Creativity versus skepticism within science: more harm has been done in science by those who make a fetish out of skepticism, aborting ideas before they are born, than by those who gullibly accept untested theories.”

  18. Mike Brown Says:

    This may be tangential to your discussion, but have you run across The blogger there is a history teacher at Geo. Mason who has been thwarted by IRB rules intended, of course, to protect human subjects, but misapplied (he feels) to the social sciences. I’m currently going through the online IRB training through CITI and, while I totally agree that psychological and medical experiments need oversight, is it really necessary if I’m testing a user interface for data entry?

    For whatever reason, I’m seeing parallels between this series of posts and my recent reading of horror stories related to getting IRB approvals. Another article about the blog is at