Archive for the 'scientific method' Category

Is Epidemiology Worthless? The Case of Calcium

Thursday, January 5th, 2012

Epidemiology has lots of critics. In this article, for example, it is called “lying on a grand scale.” Every critique I have read has ignored history. Epidemiologists have been right about two major issues: 1. Heavy smoking causes lung cancer. 2. Folate deficiency causes birth defects. In both cases, the first evidence was epidemiological. Another example is John Snow’s conclusion about the value of clean water. In my experience, epidemiologists often overstate the strength of their evidence (as do most of us) but overstatement is quite different from having nothing worth saying.

Let’s look at an example. Many people think osteoporosis is due to lack of calcium. Bones are made of calcium, right? The epidemiology of hip fractures is clear. In spite of the conventional idea, the rate of hip fracture has been highest in places where people eat a lot of calcium, such as Sweden, and lowest in places where they eat little, such as Hong Kong. (For example.) In other words, the epidemiology flatly contradicted the conventional idea. This was apparently ignored by nutrition experts (everyone knows correlation does not equal causation) who advised millions of people, especially women, to take calcium supplements to avoid osteoporosis. Millions of people followed (and follow) that advice.

Thanks to a recent meta-analysis we now know that experiments and better data firmly support the earlier epidemiology, which suggested that calcium supplements are dangerous. Here are its main conclusions:

In meta-analyses of placebo controlled trials of calcium or calcium and vitamin D, complete trial-level data were available for 28,072 participants from eight trials of calcium supplements and the WHI CaD participants not taking personal calcium supplements. . . .Calcium or calcium and vitamin D increased the risk of myocardial infarction (relative risk 1.24 (1.07 to 1.45), P = 0.004) and the composite of myocardial infarction or stroke (1.15 (1.03 to 1.27), P = 0.009). . . . A reassessment of the role of calcium supplements in osteoporosis management is warranted.

If the epidemiology had been taken more seriously, many heart attacks might have been avoided.

Is this an “anecdote” — a single example — proving nothing? Here’s how you can check. Randomly select a meta-analysis of epidemiological studies. Thousands have been done. Then ask if the results summarized in the meta-analysis appear random. Better yet, randomly pick two meta-analyses. Suppose the first summarizes 5 studies and the second summarizes 6. If the 11 results were shuffled together, how well could you assign them correctly?
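Here is a rough sketch of one way to run that check. It pools the results from two meta-analyses, then tries to put each result back with its own meta-analysis by asking which group's average it sits closer to (with that result left out of the averages). The relative risks below are made-up placeholders, not numbers from any real meta-analysis, and the nearest-average rule and the function names are my own assumptions about how to score "assigning them correctly". If the results were just noise, the score should be no better than what you get after shuffling the labels.

```python
import random
from statistics import mean

def loo_accuracy(values, labels):
    """Fraction of results assigned to the right group by a
    leave-one-out, nearest-group-average rule."""
    n = len(values)
    correct = 0
    for i in range(n):
        group_means = {}
        for g in sorted(set(labels)):
            # Average of group g, excluding the result being assigned
            others = [values[j] for j in range(n) if j != i and labels[j] == g]
            group_means[g] = mean(others)
        guess = min(group_means, key=lambda g: abs(values[i] - group_means[g]))
        correct += guess == labels[i]
    return correct / n

def shuffled_baseline(values, labels, trials=2000, seed=0):
    """Average accuracy when the group labels are randomly shuffled,
    i.e. what the rule scores when group membership carries no signal."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(shuffled)
        total += loo_accuracy(values, shuffled)
    return total / trials

# Hypothetical relative risks (placeholders, not real study results)
meta_a = [1.8, 2.1, 1.6, 2.4, 1.9]          # first meta-analysis, 5 studies
meta_b = [0.9, 1.1, 1.0, 0.8, 1.2, 1.05]    # second meta-analysis, 6 studies

values = meta_a + meta_b
labels = ["A"] * len(meta_a) + ["B"] * len(meta_b)

observed = loo_accuracy(values, labels)
baseline = shuffled_baseline(values, labels)
print(f"correctly assigned: {observed:.2f}, shuffled-label baseline: {baseline:.2f}")
```

With these placeholder numbers every result goes back to the right meta-analysis. The interesting question is what happens when you plug in results from two meta-analyses you actually picked at random.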

Justification For Self-Experimentation and My Belief that N=1 Results Will Generalize

Friday, December 16th, 2011

At the Quantified Self blog, in response to a video of me talking about QS and the Ancestral Health Symposium (paleo), someone named Colin made the following comment:

Very interesting talk. I am just curious how someone can claim a study conducted with a sample size of one is “100 times better” than someone else’s study. I do not know anything about the other study mentioned, but I do know that a study based on n=1 cannot be considered scientific proof. And sure, he hears from people who have lost weight drinking the sugar water he prescribed, but it is quite possible there are 100 times as many people who didn’t email him because they didn’t see any positive results and decided to try something else. I think the QS stuff is very interesting and helpful on a personal level, but it seems like a stretch to generalize your results to others.

I responded:

I have two responses.

1. Sample size isn’t everything. Sure, a study with n=1 isn’t “scientific proof”. Nor is any other study, in my experience. “Scientific proof” has always required many studies. New scientific ideas have very often started with n = 1 experiments or observations. Later, larger experiments or observations were done. Both — the initial n=1 observation and the later n = many observations — were necessary for the new idea to be discovered and confirmed.

2. The history of biology teaches there are few exceptions to general rules. See any biology textbook. For example, a textbook might say “lymphocytes fight germs”. This means no serious exceptions have ever been found to that rule. So, as a matter of biological history, the person who managed to figure out what one particular lymphocyte does turned out to have figured out what they all do. Biology textbooks have thousands of statements like “lymphocytes fight infection” meaning that this sequence of events (you can generalize from one to all, or nearly all) has happened thousands of times. There is no hidden shadow history of biology that teaches otherwise.

Gelman and Fung versus Levitt and Dubner: How “Wrong” is Freakonomics?

Thursday, December 15th, 2011

In the latest issue of American Scientist, Andrew Gelman (an old friend) and Kaiser Fung criticize Freakonomics and Superfreakonomics by Steve Levitt and Stephen Dubner (who wrote about my work). Although the article is titled “Freakonomics: What Went Wrong?” none of the supposed errors are in Freakonomics. You can get an idea of the conclusions from the title and this sentence: “How could an experienced journalist and a widely respected researcher slip up in so many ways?”

Gelman and Fung examine a series (“so many ways”) of what they consider mistakes. I will comment on each of them.

1. The case of the missing girls. I agree with Gelman and Fung: Levitt and Dubner accepted Emily Oster’s research too uncritically.

2. The risk of driving a car. I think Gelman and Fung miss the point. Yes, the claim (driving drunk is safer than walking drunk) was not well-supported by the evidence provided because the comparison was so confounded. However, I read the whole example differently. I didn’t think that Levitt and Dubner thought drunk people should drive. I thought their point was more subtle — that comparisons are difficult (“look how we can reach a crazy conclusion”).

3. Stars are made not born. I think Gelman and Fung fail to see the big picture. The birth-month effect in professional sports, which Gelman and Fung dismiss as “very small,” is of great interest to many people, if not to Gelman and Fung. It suggests what Levitt and Dubner and Gladwell and others say: Early success matters. That’s not obvious at all. There are lots of similar associations in epidemiology. They have been the first evidence for many important conclusions, such as smoking causes lung cancer. Are professional sports important? Maybe. But epidemiology and epidemiological methods are surely important. By learning about this effect, we learn about them. Lots of smart people fail to take epidemiology seriously enough (e.g., “correlation does not equal causation”).

4. Making the majors and hitting a curve ball. Gelman and Fung point out that one sentence is misleading. One sentence. This is called praising with faint damn.

5. Predicting terrorists. Gelman and Fung say that the terrorist prediction algorithm of a man named Ian Horsley, which Levitt and Dubner seem to take seriously, is not practical. But their review fails to convince me it was presented as practical. Since there are no data about how well the algorithm works, and Levitt and Dubner are all about data….

6. The climate change dust-up. I agree with Gelman and Fung that Nathan Myhrvold’s geoengineering ideas are unimportant. (My view of Myhrvold’s patent trolling.) But in this case, I’d say both sides — Gelman and Fung and Levitt and Dubner — miss what’s really important, namely that the usual claims that humans are dangerously warming the planet are held far too strongly. The advocates of this view are far too sure of themselves. I have blogged about this many times. In a nutshell, the climate models that we are supposed to trust have never been shown to persuasively predict the climate ten or twenty years from now (or even one year from now). There is no good reason to believe them. That Levitt and Dubner seem to take that stuff seriously is the only big criticism I have of their work. At least in that geoengineering stuff Levitt and Dubner were dissenting from conventional wisdom. Gelman and Fung do not. They fail to realize that something we’ve been told thousands of times is nonsense (in the sense of being wildly overstated). It was Levitt and Dubner’s comments about this that led me to look closely at all that climate-change scare stuff. I was surprised how poor the evidence was.

The biggest problem with Gelman and Fung’s critique is that they say nothing about the great contribution of Steve Levitt to economics. They fail to grasp that he has made economics considerably more of a science, if by science you mean a data-driven enterprise as opposed to an ideologically-driven or prestige-driven one (mathematics is prestigious, the more difficult, the more prestigious). He did so by pioneering a new way to use data to learn interesting things. His method is essentially epidemiological, except his methods are considerably better (better matching, less formulaic) and his topics much more diverse (e.g., sumo wrestling) than mainstream epidemiology. A large fraction of prestige economics is math, divorced from empirical tests. This stuff wins Nobel Prizes, but, in my and many other people’s opinion, contributes very little to understanding. (Psychology has had the same too much math, too little data problem — minus the Nobel Prizes, of course.) To persuade a big chunk of an entire discipline to pay more attention to data is a huge accomplishment.

Levitt’s methodological innovation makes Freakonomics far from what Gelman and Fung call “pop statistics”. It is actually an amusing and well-written record of something close to a revolution. In the 1980s, a friend of mine at UC Berkeley took an introductory economics class. She told me a little of what the teacher said in class. All theory. What about data? I said. It’s a strange science that doesn’t care about data. My friend went to office hours. She asked the instructor (a Berkeley economics professor): What about data? Don’t worry about data, he replied. Gelman and Fung fail to appreciate what economics used to be like. The ratio of strongly-asserted ideas to persuasive data used to be very large. Now it is less.

Thanks to Ashish Mukharji.

Assorted Links

Saturday, December 3rd, 2011
  • Top ten excuses for climate scientists behaving badly. For example, “the emails are old” and “the timing is suspicious”.
  • Scientific retractions are increasing. My guess is that retractions are increasing because scientific work has become easier to check. Tools are cheaper, for example.
  • More Dutch scientific misconduct. “Professor Poldermans published more than 600 scientific papers in a wide range of journals, including JAMA and the New England Journal of Medicine.”
  • The next time someone praises “evidence-based medicine”, ask them: What about Accutane? It illustrates how evidence-based medicine encourages dangerous drugs. You can’t make lots of money from cheap, time-tested things that we know to be safe (such as dietary changes) so the drug industry revolves around things that are not time-tested and therefore dangerous — far more dangerous than dietary changes. Evidence-based medicine, which says that certain tests (expensive) are much better than other tests (cheap), provides cover for this. Because the required tests are so expensive, they are allowed to be short.

Thanks to Allan Jackson.

Assorted Links

Sunday, November 27th, 2011
  • Salem Comes to the National Institutes of Health. Dr. Herbert Needleman is harassed by the lead industry, with the help of two psychology professors.
  • Climate scientists “perpetuating rubbish”.
  • A humorous article in the BMJ that describes evidence-based medicine (EBM) as a religion. “Despite repeated denials by the high priests of EBM that they have founded a new religion, our report provides irrefutable proof that EBM is, indeed, a full-blown religious movement.” The article points out one unquestionable benefit of EBM — that some believers “demand that [the drug] industry divulge all of its secret evidence, instead of publishing only the evidence that favours its products.” Of course, you need not believe in EBM to want that. One of the responses to the article makes two of the criticisms of EBM I make: 1. Where is the evidence that EBM helps? 2. EBM stifles innovation.
  • What really happened to Dominique Strauss-Kahn? Great journalism by Edward Jay Epstein. This piece, like much of Epstein’s work, sheds a very harsh light on American mainstream media. They were made fools of by enemies of Strauss-Kahn. Epstein is a freelance journalist. He uncovered something enormously important that all major media outlets — NY Times, Washington Post, The New Yorker, ABC, NBC, CBS (which includes 60 Minutes), the AP, not to mention French news organizations, all with great resources — missed.

Evidence-Based Medicine Versus Innovation

Saturday, November 19th, 2011

In this interview, a doctor who does research on biofilms named Randall Wolcott makes the same point I made about Testing Treatments — that evidence-based medicine, as now practiced, suppresses innovation:

I take it you [meaning the interviewer] are familiar with evidence-based medicine? It’s the increasingly accepted approach for making clinical decisions about how to treat a patient. Basically, doctors are trained to make a decision based on the most current evidence derived from research. But what such thinking boils down to [in practice -- theory is different] is that I am supposed to do the same thing that has always been done – to treat my patient in the conventional manner – just because it’s become the most popular approach. However, when it comes to chronic wound biofilms, we are in the midst of a crisis – what has been done and is accepted as the standard treatment doesn’t work and doesn’t meet the needs of the patient.

Thus, evidence-based medicine totally regulates against innovation. Essentially doctors suffer if they step away from mainstream thinking. Sure, there are charlatans out there who are trying to sell us treatments that don’t work, but there are many good therapies that are not used because they are unconventional. It is only by considering new treatment options that we can progress.

Right on. He goes on to say that he is unwilling to do a double-blind clinical trial in which some patients do not receive his new therapy because “we know we’ve got the methods to save most of their limbs” from amputation.

Almost all scientific and intellectual history (and much serious journalism) is about how things begin. How ideas began and spread, how inventions are invented. If you write about Steve Jobs, for example, that’s your real subject. How things fail to begin — how good ideas are killed off — is at least as important, but much harder to write about. This is why Tyler Cowen’s The Great Stagnation is such an important book. It says nothing about the killing-off processes, but at least it describes the stagnation they have caused. Stagnation should scare us. As Jane Jacobs often said, if it lasts long enough, it causes collapse.

Thanks to Heidi.

Assorted Links

Friday, November 11th, 2011

Thanks to Dave Lull and Alex Chernavsky.

Testing Treatments: Nine Questions For the Authors

Sunday, November 6th, 2011

From this comment (thanks, Elizabeth Molin) I learned of a British book called Testing Treatments (pdf), whose second edition has just come out. Its goal is to make readers more sophisticated consumers of medical research. To help them distinguish “good” science from “bad” science. Ben Goldacre, the Bad Science columnist, fulsomely praises it (“I genuinely, truly, cannot recommend this awesome book highly enough for its clarity, depth, and humanity”). He wrote a foreword. The main text is by Imogen Evans (medical journalist), Hazel Thornton (writer), Iain Chalmers (medical researcher), and Paul Glasziou (medical researcher, editor of Journal of Evidence-Based Medicine).

To me, as I’ve said, medical research is almost entirely bad. Almost all medical researchers accept two remarkable rules: (a) first, let them get sick and (b) no cheap remedies. These rules severely limit what is studied. In terms of useful progress, the price of these limits has been enormous: near total enfeeblement. For many years the Nobel Prize in Medicine has documented the continuing failure of medical researchers all over the world to make significant progress on all major health problems, including depression, heart disease, obesity, cancer, diabetes, stroke, and so on. It is consistent with their level of understanding that some people associated with medicine would write a book about how to do something (good science) the whole field manifestly can’t do. Testing Treatments isn’t just a fat person writing a book about how to lose weight, it’s the author failing to notice he’s fat.

Brain Surprise! Why Did I Do So Well?

Monday, October 31st, 2011

For the last four years or so I have daily measured how well my brain is working by means of balance measurements and mental tests. For three years I have used a test of simple arithmetic (e.g., 7 * 8, 2 + 5). I try to answer as fast as possible. I take faster answers to indicate a better-functioning brain.

Yesterday my score was much better than usual. This shows what happened.

My usual average is about 550 msec or more; my score yesterday was 525 msec. An unexplained improvement of 25 msec.
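To show what "much better than usual" could mean in numbers, here is a minimal sketch that compares one day's score with the recent baseline in units of the baseline's day-to-day spread (a z-score). The baseline values are invented placeholders chosen to average 550 msec; they are not my actual data, and their spread is an assumption. Only the 550 msec average and the 525 msec score come from the text above.

```python
from statistics import mean, stdev

def z_score(today, baseline):
    """How many baseline standard deviations today's score is from the baseline mean."""
    return (today - mean(baseline)) / stdev(baseline)

# Placeholder daily averages in msec (hypothetical, NOT real measurements)
baseline_msec = [548, 555, 552, 547, 551, 549, 553, 545]
today_msec = 525  # yesterday's score, from the text

improvement = mean(baseline_msec) - today_msec
z = z_score(today_msec, baseline_msec)
print(f"improvement: {improvement:.0f} msec, z-score: {z:.1f}")
```

A strongly negative z-score means the day was faster than ordinary day-to-day variation would explain. How negative it has to be before it counts as an outlier is a judgment call.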

What caused the improvement? I came up with a list of ways that yesterday was much different than usual, that is, was an outlier in other ways. These are possible causes. From more to less plausible:

1. I had 33 g extra flaxseed last night. (By mistake. I’m not sure about this.)

2. The test came at the perfect time after I had my afternoon yogurt with 33 g flaxseed. When I took flaxseed oil (now I eat ground flaxseed), it was clear that there was a short-term improvement for a few hours.

3. Many afternoons I eat 33 g ground flaxseed with yogurt. Yesterday I ground the afternoon flaxseed an unusually long time, making the omega-3 more digestible.

4. I did kettlebell swings and a kettlebell walk about 2 hours before the test. These exercises are not new but usually I do them on different days. Yesterday was the first time I’ve done them on the same day. I’m sure ordinary walking improves performance for perhaps 30 minutes after I stop walking.

5. I had duck and miso soup a half-hour before the test. I almost never eat this.

6. I had a fermented egg (“thousand-year-old egg”) at noon. I rarely eat them.

7. I had peanuts with my yogurt and ground flaxseed. Peanuts alone seem to have no effect. Perhaps something in the peanuts improves digestion of the omega-3 in the flaxseed.

8. I started watching faces at 7 am that morning instead of 6:30 am or earlier.

Here are eight ideas to test. Perhaps one or two will turn out to be important. Perhaps none will.

After I made this list, I read student papers. The assignment was to comment on a research article. One of the articles was about the effect of holding a warm versus cold coffee cup. Holding a warm coffee cup makes you act “warmer,” said the article. Commenting on this, a student said she thought it was ridiculous until she remembered going to the barber. She sees the person who washes her hair (in warm water) as friendly, the barber as cold. Maybe this is due to the warm water used to wash her hair, she noted. This made me realize another unusual feature of yesterday: I had washed my hair in warm water longer than usual. I think I did it at least 30 minutes before the arithmetic test but I’m not sure. In any case, here is another idea to test. I found earlier that cold showers slowed down my arithmetic speed.

This illustrates a big advantage of personal science (science done for personal gain) over professional science (science done because it’s your job): The random variation in my life may suggest plausible new ideas. As far as I can tell, professional scientists have learned almost nothing about practical ways to make your brain work better. You can find many lists of “brain food” on the internet. Inevitably the evidence is weak. I’d be surprised if any of them helped more than a tiny amount (in my test, a few msec). The real brain foods, in my experience, are butter and omega-3. Perhaps my tests will merely confirm the value of omega-3 (Explanations 1-3). But perhaps not (Explanations 4-8 and head heating).

Nobel Prize Report Card: Economics

Thursday, October 13th, 2011

The Nobel Prizes awarded each year resemble a kind of report card where each prize-worthy discipline (Physics, Chemistry, etc.) gets a grade that depends on the prize-winning research. If the prize-winning research is useful and surprising, the grade is high. If not, the grade is low. More generally, at least to me, the intellectual history of the prize winners sheds light on the whole profession. Perhaps some biologists were unaware of the behavior of Eric Kandel described in Explorers of the Black Box when he was awarded the biology prize. Kandel, I hasten to add, is an unusual case.

Thomas Sargent is one of the winners of this year’s Economics prize. In 2007, he gave a graduation speech at Berkeley to economics majors (via Marginal Revolution). In the speech, Sargent called economics “organized common sense”. He went on to list 12 common-sense ideas, such as “Individuals and communities face trade-offs” and “governments and voters respond to incentives” that economists believe. The reasons for their belief weren’t stated.

When I started as a professor (at Berkeley) I did many experiments with rats and, to my annoyance, discovered an inconvenient truth: I understood rats less well than I thought. Even in a heavily-controlled, heavily-studied situation (Skinner box), my rats often did not do what I expected. My common sense was often wrong, in other words. This experience made me considerably more skeptical of other people’s “common sense”.

To me, and I think to most scientists, science begins with common sense. Experimental psychology certainly does. I used common sense to design my experiments. Had I not done those experiments, I would not have learned that my common sense was wrong. So relying on common sense was helpful — as a place to start. As a way to begin to understand. You begin with common-sense ideas and you test them. That common sense is often wrong is a theme of Freakonomics, in agreement with my experience. Yet Sargent seemed content (he called economics “our beautiful subject”) to end with common sense, perhaps tidied up.

This is really unfortunate because economics, beautiful or not, is so important. If you ignore data, the answer to every hard question is the same: the most powerful people are right. That way lies stagnation (problems build up unsolved because powerful people prefer the status quo) and collapse (when the problems become overwhelming). Alan Greenspan’s faith-based belief in free markets and the 2008 financial crisis — after Sargent’s speech — is an example. In 2009, Sargent’s speech might have been less well-received.