"Several unplanned, post hoc analyses were performed to evaluate the failure of some Cox proportional hazards models to meet the proportional hazards assumption. These unplanned analyses included those restricted to patients who entered the study before or after publication of a widely publicized meta-analysis of rosiglitazone randomized trials on May 21, 2007,1 and partitioning of follow-up time into intervals of 0 through 2 months, more than 2 through 4 months, and more than 4 months."
Read the full JAMA article here.
Translation:
"Post-hoc analysis, in the context of design and analysis of experiments, refers to looking at the data—after the experiment has concluded—for patterns that were not specified a priori. It is sometimes called by critics data dredging to evoke the sense that the more one looks the more likely something will be found. More subtly, each time a pattern in the data is considered, a statistical test is effectively performed. This greatly inflates the total number of statistical tests and necessitates the use of multiple testing procedures to compensate. However, this is difficult to do precisely and in fact most results of post-hoc analyses are reported as they are with unadjusted p-values. These p-values must be interpreted in light of the fact that they are a small and selected subset of a potentially large group of p-values. Results of post-hoc analysis should be explicitly labeled as such in reports and publications to avoid misleading readers.
In practice, post-hoc analysis is usually concerned with finding patterns in subgroups of the sample."
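The inflation described there isn't abstract; it's easy to simulate. Here's a minimal sketch of my own (an illustration, nothing to do with the Graham data) showing how the probability of at least one spurious "significant" result grows with each additional unadjusted look at data containing no real effect:

```python
import random

# Minimal sketch: how the chance of a spurious "finding" grows with the
# number of looks at null data. Each "look" is a test with a 5% false
# positive rate when the null is true (no real effect anywhere).

random.seed(42)
ALPHA = 0.05
N_SIMULATIONS = 10_000

for n_looks in (1, 5, 20, 100):
    false_hits = 0
    for _ in range(N_SIMULATIONS):
        # Under the null hypothesis, p-values are uniform on [0, 1].
        p_values = [random.random() for _ in range(n_looks)]
        if min(p_values) < ALPHA:  # any unadjusted "significant" result?
            false_hits += 1
    print(f"{n_looks:3d} looks: P(at least one false positive) = "
          f"{false_hits / N_SIMULATIONS:.2f}")

# Approximate output: 1 look -> 0.05, 5 -> 0.23, 20 -> 0.64, 100 -> 0.99,
# matching 1 - 0.95**n. A Bonferroni correction would test each look at
# ALPHA / n_looks instead, pulling the family-wise rate back to about 0.05.
```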
In other words, Graham et al. tortured the data until it said what they wanted. And even then they found only a slightly elevated risk for those on Avandia over a one-year period, a difference so slight that it could easily be explained by, say, severity of illness or blood sugar levels, neither of which Graham and company cared to measure.
What they did do, after finding no overall difference in risk, was run a post hoc subgroup analysis until a risk appeared. That's cheating by their own admission, especially since in the entire group they studied there were only about 15,000 people on Avandia compared with 100,000 or so on Actos. But they still partitioned follow-up time into smaller windows (0 through 2 months, more than 2 through 4 months, and more than 4 months) and finally found what they claimed were "significant differences" in hazard ratios, but only in composite endpoints. And even then it was a difference of only 20 percent or so, small enough to be an artifact of all that unadjusted slicing. Hey, why not test in between trips to the bathroom? It would be more fitting given the quality of the research.
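And slicing like that is exactly how noise gets promoted to a "finding." A purely hypothetical sketch (invented event rates, cohort sizes loosely echoing the study's lopsided split, none of it Graham's actual data): give two drugs an identical event rate, partition follow-up into enough windows, and unadjusted testing will regularly flag one of them.

```python
import math
import random

# Hypothetical illustration (invented numbers, not Graham's data): both
# drugs have an identical 10% event rate, yet testing enough follow-up
# windows separately will often turn up a "significant" one by luck alone.

random.seed(7)
TRUE_RATE = 0.10  # same for both drugs: no real difference exists

def two_proportion_p(events_a, n_a, events_b, n_b):
    """Unadjusted two-sided p-value for a difference in event proportions."""
    p_pool = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = abs(events_a / n_a - events_b / n_b) / se
    return math.erfc(z / math.sqrt(2))  # = 2 * P(Z > |z|)

# Test each of ten follow-up windows separately, with the study's
# roughly 15,000 vs. 100,000 split between the two drug cohorts.
for window in range(1, 11):
    n_a, n_b = 15_000, 100_000
    events_a = sum(random.random() < TRUE_RATE for _ in range(n_a))
    events_b = sum(random.random() < TRUE_RATE for _ in range(n_b))
    p = two_proportion_p(events_a, n_a, events_b, n_b)
    flag = "  <-- 'significant'!" if p < 0.05 else ""
    print(f"window {window:2d}: p = {p:.3f}{flag}")

# With ten unadjusted looks, the chance of at least one p < 0.05 is
# about 1 - 0.95**10 = 0.40, even though no true difference exists.
```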
I can't believe JAMA published this nonsense with an accompanying editorial warning against the use of Avandia instead of an editorial tearing apart the questionable data mining.
My guess is the FDA will see right through the charade.