Misapplied statistics in the OXTR/Prosociality story

Out in the PNAS Early Edition is a letter to the editor from four Genomes Unzipped authors (Luke, Joe, Daniel and Jeff). We report a statistical error that drove the seemingly highly significant association between polymorphisms in the OXTR gene and prosocial behaviour. The original study involved a sample of 23 people, each of whom had their prosociality rated 116 times (giving a total of 2668 observations), but the authors inadvertently used a method that implicitly assumed there were actually 2668 different individuals in the study.

The authors kindly provided us with the raw data, and we ran what are called “null simulations” on their dataset to check whether their method could generate false positives. This involved randomly swapping around the genotypes of the 23 individuals, and then analysing these randomised datasets using the same statistical method as the paper. These “null datasets” are random, and have no real association between prosociality and OXTR genotype, so if the authors’ method were working properly it would almost never find an association in these datasets. The plot below shows the distribution of the “p-value” from the authors’ method in the null datasets; if everything were working properly, all of the bars would be the same size.
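To make the idea concrete, here is a minimal sketch of a null simulation of this kind. It runs on synthetic ratings, not the study's raw data, and the naive test it uses (a two-group z-test that treats every rating as an independent observation) is only a stand-in for the paper's method, which is not reproduced here. The 10-versus-13 genotype split, the variance settings and all function names are illustrative assumptions.

```python
import math
import random

N_PEOPLE = 23     # individuals in the study (from the text)
N_RATINGS = 116   # ratings per individual (from the text)


def simulate_ratings(rng, person_sd=1.0, noise_sd=1.0):
    """Synthetic ratings: each person has their own baseline plus rating noise.

    The standard deviations are arbitrary assumptions, not estimates from the study.
    """
    ratings = []
    for _ in range(N_PEOPLE):
        baseline = rng.gauss(0.0, person_sd)
        ratings.append([baseline + rng.gauss(0.0, noise_sd) for _ in range(N_RATINGS)])
    return ratings


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, variance


def naive_p_value(ratings, genotypes):
    """Two-sided p-value that wrongly treats all 2668 ratings as independent."""
    group_a = [x for person, g in zip(ratings, genotypes) if g == 0 for x in person]
    group_b = [x for person, g in zip(ratings, genotypes) if g == 1 for x in person]
    mean_a, var_a = mean_and_variance(group_a)
    mean_b, var_b = mean_and_variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    z = (mean_a - mean_b) / standard_error
    return math.erfc(abs(z) / math.sqrt(2))  # normal approximation to the two-sided p-value


def null_simulation(n_sims=200, seed=1):
    """Shuffle genotypes between individuals and re-run the naive test each time."""
    rng = random.Random(seed)
    ratings = simulate_ratings(rng)
    genotypes = [0] * 10 + [1] * 13  # hypothetical genotype split across the 23 people
    p_values = []
    for _ in range(n_sims):
        rng.shuffle(genotypes)  # breaks any real genotype-rating link
        p_values.append(naive_p_value(ratings, genotypes))
    return p_values


p_values = null_simulation()
false_positive_rate = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"total observations: {N_PEOPLE * N_RATINGS}")
print(f"fraction of null datasets with p < 0.05: {false_positive_rate:.2f}")
```

A well-behaved test would flag roughly 5% of null datasets at p < 0.05. In this sketch the naive test flags far more, because ratings from the same person share that person's baseline and so are not independent. One common remedy, offered here only as an illustration, is to collapse the data to a single mean rating per individual before testing, so that the sample size is 23 rather than 2668.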

Size matters, and other lessons from medical genetics

Size really matters: prior to the era of large genome-wide association studies, the large effect sizes reported in small initial genetic studies often dwindled towards zero (that is, an odds ratio of one) as more samples were studied. Adapted from Ioannidis et al., Nat Genet 29:306-309.

[Last week, Ed Yong at Not Exactly Rocket Science covered a paper positing an association between a genetic variant and an aspect of social behavior called prosociality. On Twitter, Daniel and Joe dismissed this study out of hand due to its small sample size (n = 23), leading Ed to update his post. Daniel and Joe were then contacted by Alex Kogan, the first author of the study in question. He kindly shared his data with us, and agreed to an exchange here on Genomes Unzipped. In this post, we expand on our point about the importance of sample size; Alex’s reply is here.

Edit 01/12/11 (DM): The original version of this post included language that could have been interpreted as an overly broad attack on more serious, well-powered studies in psychiatric disease genetics. I’ve edited the post to reduce the possibility of collateral damage. To be clear: we’re against over-interpretation of results from small studies, not behavioral genetics as a whole, and I apologise for any unintended conflation of the two.]

In October of 1992, genetics researchers published a potentially groundbreaking finding in Nature: a genetic variant in the angiotensin-converting enzyme (ACE) gene appeared to modify an individual’s risk of having a heart attack. This finding was notable at the time for the size of the study, which involved a total of over 500 individuals from four cohorts, and for the effect size of the identified variant: in a population initially identified as low-risk for heart attack, the variant had an odds ratio of over 3 (with a corresponding p-value less than 0.0001).

Readers familiar with the history of medical association studies will be unsurprised by what happened over the next few years: initial excitement (this same polymorphism was associated with diabetes! And longevity!) was followed by inconclusive replication studies and, ultimately, disappointment. In 2000, 8 years after the initial report, a large study involving over 5,000 cases and controls found absolutely no detectable effect of the ACE polymorphism on heart attack risk. In the meantime, the same polymorphism had turned up in dozens of other association studies for a wide range of traits ranging from obstetric cholestasis to meningococcal disease in children, virtually none of which have ever been convincingly replicated.
