Bad Analysis: 2007

Thursday, December 13, 2007

Is a Carl Doomed To Be a C Student?

A study reported widely last month found that "students whose names start with the letters C or D, which denote mediocre marks in some grading systems, did not perform as well as other pupils with different initials."

As Carl Bialik points out in the Wall Street Journal, the relationship the researchers found was statistically significant, but that doesn't mean it is important:

"University of California, Irvine, statistician Hal Stern points out something most media missed. The effect is tiny: 0.02 of a grade-point average point lower for the initials C and D (and this columnist isn't including that because of his first initial). Therein lies a lesson in the difference between statistical significance -- the confidence that there is some association between two factors -- and the strength of that association.

"In very large samples like the ones here, even small differences will be judged statistically significant," Prof. Stern says. "This means that we're confident the difference is not zero. It does not mean the difference we see is important." Prof. Nelson agrees that this effect is "so small that you shouldn't worry about it" when naming a child, though he does say the study exposes an example of how the unconscious mind can undermine conscious motivation.

But Bowling Green statistician Jim Albert warns: "You can prove any silly hypothesis ... by running a statistical test on tons of data.""
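
Prof. Stern's point is easy to check with a little arithmetic. Here is a minimal sketch in Python, assuming a simple two-sample z-test with equal group sizes. The 0.02 gap comes from the column; the 0.5 standard deviation and the sample sizes are hypothetical values picked for illustration, not figures from the study.

```python
import math

GPA_GAP = 0.02     # difference in grade-point average reported in the column
ASSUMED_SD = 0.5   # hypothetical spread of GPAs; not taken from the study


def effect_size(mean_diff, sd):
    """Standardized difference: the gap measured in standard deviations."""
    return mean_diff / sd


def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value of a two-sample z-test (equal group sizes, shared known sd)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - cdf)


# The gap and the effect size stay fixed; only the sample size changes.
for n in (100, 10_000, 1_000_000):
    p = two_sided_p_value(GPA_GAP, ASSUMED_SD, n)
    d = effect_size(GPA_GAP, ASSUMED_SD)
    print(f"n per group = {n:>9,}  effect size = {d:.2f}  p-value = {p:.4f}")
```

Under these assumptions the effect size never moves (0.04 standard deviations). With a hundred students per group the gap is indistinguishable from noise, and with enough students the same tiny gap clears the usual 0.05 threshold. Significance tells you the sample was big, not that the effect matters.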

Thursday, November 29, 2007

Citing Statistics, Giuliani Misses Time and Again

New York Times article on how accurately Rudy Giuliani has reported the facts during his presidential campaign.

Saturday, November 17, 2007

Four Pinocchios for Ron Paul (The Fact Checker)

Good analysis of Ron Paul's claim that we could eliminate personal income taxes and still balance the budget.

Friday, November 9, 2007

Rudy Giuliani, amateur epidemiologist

Many articles have been written about Rudy Giuliani's poor use of prostate cancer statistics. The main complaint is that he uses faulty logic to calculate prostate cancer survival rates. However, I think the bigger problem with Giuliani's numbers is that he cherry-picks them to prove his points and provides no context. Once again, this appears to be a case of a politician applying simplistic analysis to a complicated issue.

Background
According to a Giuliani radio ad:

"I had prostate cancer, five, six years ago. My chance of surviving prostate cancer, and thank God I was cured of it, in the United States, 82 percent. My chances of surviving prostate cancer in England, only 44 percent under socialized medicine."

As pointed out by factcheck.org and The Fact Checker at the Washington Post, the 44% figure was arrived at by simplistically dividing per capita prostate cancer mortality by per capita prostate cancer diagnoses (and subtracting that figure from 100%). Unfortunately, the people diagnosed in a given year are not the same people who die in that year, so you can't figure out your odds of surviving prostate cancer from this data. To determine a survival rate, you need to follow the same population over a period of time (the 5-year survival rate is the standard metric). The 5-year survival rate in the U.K. is 74.4%.
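
To see why one year's deaths divided by one year's diagnoses is not a survival rate, here is a minimal sketch in Python. Every number in it is hypothetical and chosen only to make the arithmetic easy; none of it is real cancer data. It assumes a steady state (the same number of new diagnoses every year) and ignores deaths from other causes.

```python
# Hypothetical inputs for illustration only; these are not real cancer statistics.
NEW_DIAGNOSES_PER_YEAR = 1000

# Assumed share of each diagnosed cohort that dies of the disease in
# year 1, 2, ... after diagnosis: 2% a year for 5 years, then 3% a year for 10.
DEATH_SHARE_BY_YEAR = [0.02] * 5 + [0.03] * 10


def deaths_in_one_calendar_year(new_cases, death_share_by_year):
    """Deaths observed in a single year when diagnoses are constant over time.

    This year's deaths come from many past cohorts: the cohort diagnosed
    k years ago contributes new_cases * death_share_by_year[k].
    """
    return sum(new_cases * share for share in death_share_by_year)


def naive_survival(new_cases, deaths):
    """The flawed shortcut: 100% minus (this year's deaths / this year's diagnoses)."""
    return 1.0 - deaths / new_cases


def five_year_survival(death_share_by_year):
    """Follow one cohort: the share still alive five years after diagnosis."""
    return 1.0 - sum(death_share_by_year[:5])


deaths = deaths_in_one_calendar_year(NEW_DIAGNOSES_PER_YEAR, DEATH_SHARE_BY_YEAR)
print(f"naive 'survival' from one year's data: {naive_survival(NEW_DIAGNOSES_PER_YEAR, deaths):.0%}")
print(f"5-year survival of an actual cohort:   {five_year_survival(DEATH_SHARE_BY_YEAR):.0%}")
```

In this made-up population the shortcut reports 60% while the 5-year survival rate of any cohort is 90%. The shortcut counts deaths among people diagnosed long ago, so it says nothing about the five-year odds of someone diagnosed today.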

The bigger problem
Clearly, it is troubling that a guy who wants to be part of the debate on the nation's health care doesn't have anyone on his staff who really understands the data. Equally troubling is that even after the doctors on whose study he bases his claims pointed out his error, he continued to use the misleading numbers. But even if Giuliani used the proper numbers (5-year survival rates of 74% in the U.K. and 98% in the U.S.), the conclusion he draws is simplistic at best.

Let me give you some statistics that would seem to refute Giuliani's conclusion about socialized medicine:

Average life expectancy in the U.K. is 79.4 years, compared to only 78.2 in the U.S.
Infant mortality rate in the U.K. is 5.01 per thousand, compared to 6.37 in the U.S.
Health care spending per capita in the U.K. is $1,675, compared to $4,271 in the U.S.

Does this mean that socialized medicine is better than private medicine? In my opinion, that would be an equally simplistic conclusion. We need to control for a lot of things in order for the data to be meaningful: population demographics, approach toward prevention, detection and treatment, etc. Unfortunately, intellectually honest analysis doesn't seem to be a Giuliani strong point.

Saturday, September 15, 2007

Most Science Studies Appear to Be Tainted By Sloppy Analysis (Wall Street Journal)

Interesting article by Robert Lee Hotz at the Wall Street Journal about mistakes in published research. He references the work of Dr. John Ioannidis, who studies research methods at the University of Ioannina School of Medicine in Greece and Tufts University. The article also references an even more interesting paper that Dr. Ioannidis published in 2005.

From the article:

"Flawed findings, for the most part, stem not from fraud or formal misconduct, but from more mundane misbehavior: miscalculation, poor study design or self-serving data analysis. 'There is an increasing concern that in modern research, false findings may be the majority or even the vast majority of published research claims,' Dr. Ioannidis said. 'A new claim about a research finding is more likely to be false than true.'"

"The hotter the field of research the more likely its published findings should be viewed skeptically, he determined."

"Take the discovery that the risk of disease may vary between men and women, depending on their genes. Studies have prominently reported such sex differences for hypertension, schizophrenia and multiple sclerosis, as well as lung cancer and heart attacks. In research published last month in the Journal of the American Medical Association, Dr. Ioannidis and his colleagues analyzed 432 published research claims concerning gender and genes."

"'Overeager researchers often tinker too much with the statistical variables of their analysis to coax any meaningful insight from their data sets. People are messing around with the data to find anything that seems significant, to show they have found something that is new and unusual,' Dr. Ioannidis said."

From the paper:
The paper lays out several factors that influence the probability that a study's findings are true:

Corollary 1: The smaller the studies conducted in a scientific field, the less likely the research findings are to be true ... other factors being equal, research findings are more likely true in scientific fields that undertake large studies, such as randomized controlled trials in cardiology (several thousand subjects randomized) than in scientific fields with small studies, such as most research of molecular predictors (sample sizes 100-fold smaller).

Corollary 2: The smaller the effect sizes in a scientific field, the less likely the research findings are to be true ... research findings are more likely true in scientific fields with large effects, such as the impact of smoking on cancer or cardiovascular disease (relative risks 3–20), than in scientific fields where postulated effects are small, such as genetic risk factors for multigenetic diseases (relative risks 1.1–1.5). Modern epidemiology is increasingly obliged to target smaller effect sizes. Consequently, the proportion of true research findings is expected to decrease.

Corollary 3: The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true ... research findings are more likely true in confirmatory designs, such as large phase III randomized controlled trials, or meta-analyses thereof, than in hypothesis-generating experiments.

Corollary 4: The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true ... Adherence to common standards is likely to increase the proportion of true findings. The same applies to outcomes. True findings may be more common when outcomes are unequivocal and universally agreed (e.g., death) rather than when multifarious outcomes are devised (e.g., scales for schizophrenia outcomes). Similarly, fields that use commonly agreed, stereotyped analytical methods (e.g., Kaplan-Meier plots and the log-rank test) may yield a larger proportion of true findings than fields where analytical methods are still under experimentation (e.g., artificial intelligence methods) and only “best” results are reported.

Corollary 5: The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true ... Prejudice may not necessarily have financial roots. Scientists in a given field may be prejudiced purely because of their belief in a scientific theory or commitment to their own findings. Many otherwise seemingly independent, university-based studies may be conducted for no other reason than to give physicians and researchers qualifications for promotion or tenure.

Corollary 6: The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true ... With many teams working on the same field and with massive experimental data being produced, timing is of the essence in beating competition. Thus, each team may prioritize on pursuing and disseminating its most impressive “positive” results. “Negative” results may become attractive for dissemination only if some other team has found a “positive” association on the same question. In that case, it may be attractive to refute a claim made in some prestigious journal. The term Proteus phenomenon has been coined to describe this phenomenon of rapidly alternating extreme research claims and extremely opposite refutations. Empirical evidence suggests that this sequence of extreme opposites is very common in molecular genetics.
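
The first three corollaries can be illustrated with a short calculation. Here is a minimal sketch in Python, assuming the standard positive predictive value formula that follows from Bayes' rule: the chance that a "significant" finding is true depends on the pre-study odds that the tested relationship is real, the study's power, and the significance threshold. The odds and power values below are hypothetical, and the function name is my own.

```python
def positive_predictive_value(pre_study_odds, power, alpha=0.05):
    """Probability that a statistically significant finding is actually true.

    pre_study_odds: odds (true : not true) that a tested relationship is real
    power:          chance the study detects a real relationship (1 - beta)
    alpha:          chance the study flags a relationship that is not real
    """
    true_positives = power * pre_study_odds   # per one non-true relationship tested
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios; the odds and power values are illustrative only.
scenarios = [
    ("large confirmatory trial, 1:1 odds, 80% power", 1.0, 0.80),
    ("well-powered study of a long shot, 1:10 odds", 0.1, 0.80),
    ("small study of a long shot, 1:10 odds, 20% power", 0.1, 0.20),
    ("exploratory search, 1:1000 odds, 20% power", 0.001, 0.20),
]

for label, odds, power in scenarios:
    ppv = positive_predictive_value(odds, power)
    print(f"{label:<52} chance the finding is true = {ppv:.1%}")
```

Lower power (small studies and small effects, Corollaries 1 and 2) and lower pre-study odds (many loosely selected relationships, Corollary 3) both pull the result down. Under these assumed inputs, a significant result from the small long-shot study is more likely false than true.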