A very large genome-wide association study (GWAS) of brain and intracranial size has just been published in Nature Genetics. The study looked at brain scans and genetic information from over 20,000 individuals, and discovered two new genetic variants that affect brain and head morphology: one affects the volume of the skull, and the other affects the size of the hippocampus.
The main study is very well carried out, and the two associations look to me to be well established. However, a few small things about the paper, combined with some biased reporting in the press, have been bothering me. Firstly, the main result that has been reported in the news is that the study found an “IQ gene”, but this was only a very small follow-on in the study, and the evidence underlying it is relatively weak (certainly not the “Best evidence yet that a single gene can affect IQ”, as reported by New Scientist). Secondly, the authors report their statistics in a misleading way that hides the fact that one of their associations could easily be caused by an (already well-known) association with general body size.
Is HMGA2 an “IQ gene”?
The majority of the press around this study has been reporting that it found an “IQ gene”. However, the main part of the study didn’t look at IQ at all, only at various measures of brain size. The authors followed up their findings in a small subset of their data (1642 individuals) to see whether their two identified variants were correlated with IQ. One of them, a variant in the gene HMGA2, was found to show weak evidence of association with IQ, increasing it by an estimated 1.29 points. However, the degree of evidence for IQ association was much weaker than for intracranial volume, and could easily be a false positive (presumably why it wasn’t heavily emphasized in the paper itself).
There is further evidence that this association may not be real. The largest (I believe) genome-wide study of the genetics of IQ, published in Molecular Psychiatry last year, listed about 200 variants that showed even weak evidence of association with IQ. No variants in or near the HMGA2 gene were included on this list. Most common variants that increased IQ by more than about 1 IQ point would have been included on this list, suggesting that the HMGA2 variant either isn’t associated with IQ, or that the strength of the association has been overestimated.
Combining the data from these two studies would have given over 5000 samples, which would be big enough to properly test the association between the HMGA2 variant and IQ either way. Perhaps someone will do something like this soon; until then, I would not treat HMGA2 as an established “IQ gene”.
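To give a rough feel for why sample size matters here, the following is a minimal power sketch for a single pre-specified test of one variant against IQ. It is not taken from either paper. It assumes an additive model, an IQ standard deviation of 15 points, a two-sided threshold of 0.05, and an allele frequency of 0.5 (a hypothetical value chosen for illustration); the function name association_power is my own.

```python
from statistics import NormalDist


def association_power(n, beta, maf, trait_sd=15.0, alpha=0.05):
    """Approximate power of a two-sided, 1-df additive test for one SNP.

    n        : number of individuals
    beta     : assumed true per-allele effect, in trait units (IQ points)
    maf      : assumed allele frequency (hypothetical in the example below)
    trait_sd : assumed standard deviation of the trait
    alpha    : two-sided significance threshold
    """
    std_normal = NormalDist()
    # Fraction of trait variance explained by the SNP under an additive
    # model with Hardy-Weinberg genotype frequencies.
    r2 = 2 * maf * (1 - maf) * beta ** 2 / trait_sd ** 2
    # Expected z-score of the association test (normal approximation).
    mean_z = (n * r2 / (1 - r2)) ** 0.5
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the observed z-score falls in either rejection tail.
    return std_normal.cdf(mean_z - z_crit) + std_normal.cdf(-mean_z - z_crit)


if __name__ == "__main__":
    # 1642 is the follow-up subset size; 5000 is the rough combined size.
    for n in (1642, 5000):
        power = association_power(n, beta=1.29, maf=0.5)
        print(f"n = {n}: approximate power = {power:.2f}")
```

Under these assumptions the combined sample has clearly more power than the 1642-person subset. If the 1.29-point estimate is inflated, as first estimates of an effect often are, the true power would be lower in both cases, and it would be far lower again at a genome-wide threshold.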
Is HMGA2 actually a gene for general body size?
The variant in HMGA2 has previously been shown to be associated with height. An obvious question is whether this variant directly affects intracranial volume, or whether it just affects general body size (taller people have bigger heads). The authors state:
Structural equation modeling showed that the effect of rs10784502 [the HMGA2 variant] on intracranial volume could not completely be accounted for by the indirect effects of this SNP on height or by the correlation between height and intracranial volume
It always pays to be somewhat suspicious when a statement like this is made without any indication of how strong the statistical evidence for it is. Digging deeper into the long supplementary appendix, we find that the statistical analysis does not in fact show this at all. While a model in which the variant causes changes in both height and intracranial volume fitted slightly better than one in which only height is directly affected, the two models fit the data about equally well, and it is not possible to distinguish between them statistically (the p-value on the difference between the models is p = 0.09). In fact, the data we see are entirely compatible with this gene only directly affecting height, and only indirectly affecting intracranial volume.
The statistics and the main text tell a very different story. It is likely that this was a simple mistake, introduced by the repeated edits that these sorts of papers always go through. However, as it stands, the main text gives a very misleading impression of what the statistics show.
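As a small illustration of what p = 0.09 means for a comparison of two nested models, the sketch below converts between a p-value and a chi-square difference statistic. It assumes that the two models differ by a single parameter (the direct path from the variant to intracranial volume), so that the difference is tested on 1 degree of freedom. That is my assumption about the form of the test, not something stated in the main text, and the function names are my own.

```python
from math import erfc, sqrt
from statistics import NormalDist


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # A 1-df chi-square variable is the square of a standard normal, so
    # P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    return erfc(sqrt(stat / 2))


def chi2_1df_stat_for_pvalue(p):
    """1-df chi-square statistic that corresponds to upper-tail p-value p."""
    z = NormalDist().inv_cdf(1 - p / 2)
    return z * z


if __name__ == "__main__":
    reported_p = 0.09  # p-value for the difference between the two models
    implied_stat = chi2_1df_stat_for_pvalue(reported_p)
    critical_stat = chi2_1df_stat_for_pvalue(0.05)
    print(f"statistic implied by p = {reported_p}: {implied_stat:.2f}")
    print(f"statistic needed for p = 0.05: {critical_stat:.2f}")
    print("simpler model rejected at 0.05:", implied_stat > critical_stat)
```

Under that assumption the improvement in fit falls short of even the conventional 0.05 cutoff, so the simpler height-only model cannot be rejected.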
New methods, old flaws
The vast majority of this paper follows the laid down protocols for a high-quality genome-wide association study. The sample size is very large, population stratification and other confounders are well controlled for, and tough standards for strength of evidence were used. This is the legacy of GWAS: a study must be well powered, well performed and stringent, or it is worthless.
But beyond the GWAS portion of the study, these standards of evidence are loosened significantly. A third association near the gene DDR2 is (rightly) described as “suggestive” in the abstract, because in a GWAS framework the evidence is not considered strong enough (it has p = 5 x 10^-7, whereas we require p < 5 x 10^-8). Contrast that with the way the IQ association is treated, where much weaker evidence is taken as indicating an association, or with the height vs intracranial volume test, where the p-value isn't even stated in the main text (presumably because the actual evidence is too weak for anyone to believe). GWAS standards are not there because GWAS are somehow more prone to false positives if not handled properly (they are in many ways less prone to them). There is no point in keeping rigorous standards of evidence if you are going to start breaking them during follow-up, and it does the field a disservice if your genome-wide-significant brain size association ends up alongside your nominally-significant IQ association in a news report.
The image for this post is, somewhat unusually, a structural MRI of my own brain.