Thursday, November 8, 2007

The science of cherry picking

When two big medical studies show conflicting results on the same subject, selection bias is the most likely cause.

The big global perspective report on food, nutrition, physical activity, and the prevention of cancer, based on the results of more than seven hundred studies, comes to the conclusion that being fat causes cancer.

Another big study by Katherine Flegal and co-workers on cause-specific excess deaths associated with underweight, overweight, and obesity, based on more than half a million person-years of long-term follow-up and on mortality data of more than two million adults from the U.S. population statistics, comes to the conclusion that being fat up to a body mass index of 30 does not cause cancer.

How can two research strategies, both based on databases of similar type and size, reach completely different conclusions? It is even very likely that they rely in part on the same data. Population studies are very expensive, so the same data from the Nurses' Health Study, NHANES, Framingham, and other surveys are mined over and over again by many researchers with many different questions and hypotheses.

By convention, the level of statistical significance is set at 5 percent: if no real effect exists, there is still a 5 percent probability that a result looks significant by chance alone. Mining huge databases for correlations among many different factors will yield a great number of possible results. For every hundred such correlations tested where nothing real is going on, about five can be expected to come out significant just by chance. Now imagine you are a researcher striving for publications. If you are looking for positive results to publish, you may put aside the correlations that turned out to be non-existent because you do not find them worthwhile. And most likely, if you find a correlation different from zero, you will go ahead and submit it for publication.
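The arithmetic behind the "five in a hundred" figure can be checked with a few lines of Python. This is only a sketch: it assumes the tests are independent and that none of the tested correlations is real, and the function names are made up for illustration.

```python
import random

ALPHA = 0.05  # conventional significance level (5 percent)


def expected_false_positives(n_tests, alpha=ALPHA):
    """Expected number of chance 'hits' when no real effect exists."""
    return n_tests * alpha


def prob_at_least_one(n_tests, alpha=ALPHA):
    """Chance of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def simulate_false_positives(n_tests, alpha=ALPHA, seed=0):
    """Draw one p-value per test under the null and count those below alpha.

    Under the null hypothesis a p-value is uniformly distributed on [0, 1],
    so random.random() stands in for a real statistical test here.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_tests))


if __name__ == "__main__":
    print(expected_false_positives(100))     # expected chance findings
    print(round(prob_at_least_one(100), 3))  # chance of at least one
    print(simulate_false_positives(100))     # one simulated "database"
```

With a hundred tests, the chance of finding at least one "significant" correlation comes out at roughly 99 percent, even though nothing real is there to be found.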

Where are the cherries?

Interestingly, in the global food and cancer report, most of the studies deal with specific types of cancer and how they are linked to other factors. That is, from the large number of cancer cases, some have been selected by the researchers and some have been left aside. If cherry picking is part of the method, it may be very hard for a researcher to stop doing it before the first data sets have been analyzed, and to avoid using it as a triage prior to publication.

The Flegal study, on the other hand, looked at deaths from different causes, not only from cancer, and did not separate out different types of cancer. This more global approach, with a complete absence of cherry picking, yielded a zero correlation between body weight and cancer deaths.
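The difference between the two strategies can be illustrated with a toy simulation. This is a rough sketch and not the method of either study: the group sizes, the number of cancer types, the risk per type, and all function names are invented, and body weight is deliberately given no effect at all. The per-type analysis still has twenty chances to produce a "significant" link, while the pooled analysis has only one.

```python
import math
import random
from statistics import NormalDist


def two_proportion_p_value(deaths_a, n_a, deaths_b, n_b):
    """Two-sided p-value of a two-proportion z-test (normal approximation)."""
    pooled = (deaths_a + deaths_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # identical all-or-nothing outcomes: no evidence either way
    z = (deaths_a / n_a - deaths_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def draw_deaths(rng, n_people, n_types, risk_per_type):
    """Count deaths per cancer type; each person dies of at most one type."""
    counts = [0] * n_types
    for _ in range(n_people):
        if rng.random() < n_types * risk_per_type:
            counts[rng.randrange(n_types)] += 1
    return counts


def compare_strategies(n_types=20, n_people=5000, risk_per_type=0.01, seed=1):
    """Return (per-type p-values, pooled p-value) when weight has NO effect."""
    rng = random.Random(seed)
    # Both groups are drawn with the same risk: hypothetical numbers only.
    heavy = draw_deaths(rng, n_people, n_types, risk_per_type)
    lean = draw_deaths(rng, n_people, n_types, risk_per_type)
    per_type = [two_proportion_p_value(h, n_people, l, n_people)
                for h, l in zip(heavy, lean)]
    pooled = two_proportion_p_value(sum(heavy), n_people, sum(lean), n_people)
    return per_type, pooled


if __name__ == "__main__":
    per_type, pooled = compare_strategies()
    cherries = [i for i, p in enumerate(per_type) if p < 0.05]
    print("cancer types with a 'significant' link:", cherries)
    print("pooled p-value for all cancer deaths:", round(pooled, 3))
```

Running it with different seeds shows how often one or two cancer types look "linked" to body weight purely by chance, which is exactly the kind of result that tends to get written up.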

Death is a zero-sum game

If you do not die from cancer, you may die from a heart attack, kidney failure, or Alzheimer's disease, be it sooner or later. Any advice based on cherry-picked cancer studies will miss this point.
