Estimating heritability using twins

Last week, a post went up on the Bioscience Resource Project blog entitled The Great DNA Data Deficit. This is another in a long string of “Death of GWAS” posts that have appeared over the last year. The authors claim that because GWAS has failed to identify many “major disease genes”, i.e. high-frequency variants with a large effect on disease, it was not worthwhile; this is all old stuff that I have discussed elsewhere (see also my “Standard GWAS Disclaimer” below). In this case, the authors argue that the genetic contribution to complex disease has been massively overestimated, and that genetics does not in fact play as large a part in disease as we believe.

The one particularly new thing about this article is that they actually look at the foundation for beliefs about missing heritability: the studies of identical and non-identical twins from which we get our estimates of the heritability of disease. I approve of this: I think all those who are interested in the genetics of disease should be fluent in the methodology of twin studies. However, in this case, the authors come to the rather odd conclusion that heritability measures are largely useless, based on a small statistical misunderstanding of how such studies are done.

I thought I would use this opportunity to explain, in relative detail, where we get our estimates of heritability from, why they are generally well-measured and robust, and which real issues need to be considered when interpreting twin study results. This post is going to contain a little bit of maths, but don’t worry if it scares you a little; you only really need to get the gist.

The Standard GWAS Disclaimer

The first thing to say, and this argument is really starting to get old, is that it is crazy to call GWAS an unqualified failure, regardless of how much heritability is left unexplained, and how good prediction is or is not. Many diseases have had a massive boost from the dozens of new variants associated with them via GWAS; even if they cannot effectively predict risk, they do effectively shed light on the molecular biology of the disease. Beyond this, GWAS results are also shedding some light (finally!) on how non-coding variation leads to variation in phenotype, via very interesting studies that are tying GWAS results into high-throughput non-coding annotation (e.g. the ENCODE project). We can also add Mendelian randomisation to the mix. Even if GWAS results aren’t (yet) directly applicable in the clinic, we will be seeing improved treatment and care in the form of better understanding of the diseases, including their environmental, molecular and genetic causes. In fact, the structure of disease risk that GWAS has uncovered (a large number of low effect variants) turns out to be far more useful for medical science, if less useful for clinical practice.

Now that that is out of the way:

How we calculate heritability

We can model the variance in a trait (we will call it variance in liability, L, and assume that liability is a normally distributed variable) as a sum of different effects: variance due to genetics (which we will call A, for “additive”), and variance due to environment (E). We can express this as:

L = A + E

The heritability, which is called h2, is the proportion of the total variance that is genetic. It is thus given by the equation:

h2 = A/(A + E)

The authors of the Bioscience Resource Project post assume that A and E are both measured within families, and thus E is a major underestimate of the environmental variance in the population. This is not what we do: in fact, E is measured between families, and thus is a good estimate of the population variance, assuming that our families are representative of the population. The BRP’s claim that heritability values are inflated is a direct result of this statistical mixup.

So how do we measure these values? As both genetics and environment vary between families, the variance between families is A + E. We can measure A from identical (monozygotic, or MZ) twins, by assuming that they have perfectly correlated genetics but uncorrelated environments, so the variance they share (the covariance between co-twins) is A. The heritability can be calculated using this equation:

h2 = A / (A + E)
   = [Covariance within MZs] / [Variance between families]

Note that above, we assumed that MZ twins do not share a common environment; this is a bad assumption, because often they will (duh). So, instead, we model the liability as having some shared environmental component C (for common), so that

L = A + E + C

Assuming monozygotic and dizygotic twins share their environment to the same degree (a much less dodgy assumption), the covariance between monozygotic twins is A + C, and between dizygotic twins is 0.5 x A + C (as they share the same environment, but on average only half of their genetic variation). We can thus calculate the heritability using

h2 = A / (A + C + E)
   = 2 x ([A + C] - [0.5 x A + C]) / (A + C + E)
   = 2 x ([Covariance within MZs] - [Covariance within DZs]) / [Variance between families]
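To see the two estimators side by side, here is a minimal simulation sketch. The variance components (A = 0.5, C = 0.2, E = 0.3), the sample sizes and the function names are made up for illustration and do not come from any real twin study. It draws MZ and DZ pairs under L = A + C + E, then compares the MZ-only estimate (which silently absorbs C) with the MZ-minus-DZ estimate.

```python
import random


def simulate_pairs(n_pairs, a, c, e, genetic_sharing, rng):
    """Simulate liabilities for twin pairs under L = A + C + E.

    a, c, e are variance components. genetic_sharing is the fraction of
    the additive genetic variance the two twins share: 1.0 for MZ pairs,
    0.5 for DZ pairs.
    """
    pairs = []
    for _ in range(n_pairs):
        shared_genetic = rng.gauss(0.0, (a * genetic_sharing) ** 0.5)
        common_env = rng.gauss(0.0, c ** 0.5)
        twins = []
        for _ in range(2):
            own_genetic = rng.gauss(0.0, (a * (1.0 - genetic_sharing)) ** 0.5)
            own_env = rng.gauss(0.0, e ** 0.5)
            twins.append(shared_genetic + own_genetic + common_env + own_env)
        pairs.append((twins[0], twins[1]))
    return pairs


def mean(values):
    return sum(values) / len(values)


def pair_covariance(pairs):
    """Covariance between twin 1 and twin 2 across pairs."""
    grand_mean = mean([x for pair in pairs for x in pair])
    return mean([(x1 - grand_mean) * (x2 - grand_mean) for x1, x2 in pairs])


def total_variance(pairs):
    """Variance of liability across all individuals (A + C + E)."""
    values = [x for pair in pairs for x in pair]
    grand_mean = mean(values)
    return mean([(x - grand_mean) ** 2 for x in values])


# Illustrative (assumed) variance components; true h2 is 0.5.
A, C, E = 0.5, 0.2, 0.3
rng = random.Random(1)
mz = simulate_pairs(50_000, A, C, E, 1.0, rng)
dz = simulate_pairs(50_000, A, C, E, 0.5, rng)

var_total = total_variance(mz + dz)
# MZ-only estimate: really estimates (A + C) / total, so it is inflated by C.
naive_h2 = pair_covariance(mz) / var_total
# MZ-minus-DZ estimate: the shared environment C cancels in the difference.
falconer_h2 = 2 * (pair_covariance(mz) - pair_covariance(dz)) / var_total

print(f"MZ-only estimate:     {naive_h2:.2f}")
print(f"MZ-minus-DZ estimate: {falconer_h2:.2f}")
```

With these settings the MZ-only estimate should land near (A + C) / (A + C + E) = 0.7, while the MZ-minus-DZ estimate should land near the true value of 0.5, up to sampling noise.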

So now we can calculate the heritability, taking into account shared environment, as a proportion of the variance in the population as a whole. All of this has been done assuming a normally distributed continuous trait (like height), but we can use something called liability threshold modeling to study yes-no binary traits (like “does he have diabetes?”), which works in pretty much the same way.
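For the yes-no case, a rough sketch of the liability threshold idea looks like this. It assumes liability is standard normal, that disease occurs above a threshold set by the prevalence, and that we know the probability that both twins in a pair are affected (which is the probandwise concordance multiplied by the prevalence). The function names and the numbers (1% prevalence, liability correlations of 0.7 and 0.45) are illustrative assumptions, and real analyses fit these models with proper likelihood methods rather than this simple inversion.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_both_affected(r, prevalence, steps=2000):
    """P(both twins exceed the threshold) when liabilities correlate at r.

    Integrates, by the midpoint rule, the density of twin 1's liability x
    above the threshold t times P(twin 2's liability > t | x).
    """
    t = STD_NORMAL.inv_cdf(1.0 - prevalence)
    width = 8.0  # far enough into the upper tail for the rest to be negligible
    h = width / steps
    sd_given_x = math.sqrt(1.0 - r * r)
    total = 0.0
    for i in range(steps):
        x = t + (i + 0.5) * h
        total += STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf((t - r * x) / sd_given_x))
    return total * h


def liability_correlation(prob_both, prevalence, tol=1e-6):
    """Find the liability correlation that reproduces prob_both (bisection)."""
    lo, hi = 0.0, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prob_both_affected(mid, prevalence) < prob_both:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Illustrative inputs: generate "observed" joint probabilities from assumed
# liability correlations (A = 0.5, C = 0.2 gives r_MZ = 0.7, r_DZ = 0.45).
prevalence = 0.01
p_both_mz = prob_both_affected(0.70, prevalence)
p_both_dz = prob_both_affected(0.45, prevalence)

# Recover the correlations from the binary data, then apply the same
# MZ-minus-DZ formula (total liability variance is 1 by construction).
r_mz = liability_correlation(p_both_mz, prevalence)
r_dz = liability_correlation(p_both_dz, prevalence)
h2_liability = 2 * (r_mz - r_dz)

print(f"probandwise concordance MZ: {p_both_mz / prevalence:.3f}")
print(f"probandwise concordance DZ: {p_both_dz / prevalence:.3f}")
print(f"h2 on the liability scale:  {h2_liability:.3f}")
```

The point of the sketch is that the concordance rates themselves are not the heritability; they first have to be converted back into correlations on the underlying liability scale.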

The heritability estimates can be turned into best case scenarios for prediction, which is how we know that for many traits, prediction could, if we could properly model all genetic risk, be more clinically useful than standard predictors. We can also use the heritability to find out how much of the genetic effect (A) we have accounted for with our GWAS results; for virtually all diseases we find that the majority of genetic risk is still left undiscovered.

We do have to take into account the fact that error bars from twin studies for rare diseases tend to be pretty large, because it is hard to find enough twins with the disease. For example, in Crohn’s disease we generally find error bars that place h2 between 40 and 80%, so our estimate of how much heritability we captured with the 71 variants of our latest meta-analysis varies from 32% (if h2 is 40%) down to 16% (if h2 is 80%). Either way, we know that there is still a lot of heritability left to find.
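The arithmetic behind that range is just a ratio, sketched below. The figure of 0.128 for the liability variance explained by the known variants is not quoted in this post; it is back-calculated from the numbers above (16% of 0.8, or equivalently 32% of 0.4), and the function name is made up for illustration.

```python
def fraction_of_heritability_explained(variance_explained, h2):
    """Share of the heritable variance accounted for by known variants."""
    return variance_explained / h2


# Assumption: back-calculated from the quoted 16%-32% range, not a reported value.
variance_explained = 0.128

for h2 in (0.4, 0.8):
    fraction = fraction_of_heritability_explained(variance_explained, h2)
    print(f"h2 = {h2:.0%}: known variants explain {fraction:.0%} of heritability")
```

The same set of variants looks twice as successful under the low heritability estimate as under the high one, which is why the width of the twin study error bars matters.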

Some (Less Specious) Objections

There are genuine criticisms of the method of calculating heritability that you need to consider before you take heritability estimates at face value.

  1. We assume MZ twins share no environment that DZ twins do not also share. They both share age, family environment and intrauterine environment, but MZ twins also [edit: usually] share a placenta, and a somewhat different social environment (MZ twins may copy each other more, or be more likely to deliberately develop their differences). However, comparisons between twin studies and siblings-raised-apart studies in easy-to-collect traits like height show broadly the same heritability, suggesting that this effect probably isn’t a major problem.
  2. We assume that we can disregard gene/environment interaction, which can have complicated twin-sharing properties. One potential worry is that selecting twins with rare diseases could select for twins from a high risk environment, and there is some evidence to suggest that disease progression in high-risk individuals is more genetically determined than in low-risk individuals (e.g. odds ratios for obesity are larger for low-activity individuals).
  3. We assume that DZ twins share half the genetic effect, i.e. that the correlation in risk is equal to the proportion of shared alleles (no dominance or gene-gene interaction): if this is false, heritability can be overestimated. We can get around this by extending the study to include other family members (e.g. a children-of-twins study), or by looking at twins raised apart and raised together, which allows us to model dominance effects.
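The overestimate in point 3 can be made concrete with a small sketch. It assumes the non-additive variance behaves like dominance variance D, which MZ twins share completely and DZ twins share with coefficient 1/4 (the standard quantitative-genetics expectation). The variance components and function name are invented for illustration.

```python
def expected_twin_estimate(a, d, c, e):
    """Expected MZ-minus-DZ heritability estimate when dominance variance d is ignored."""
    total = a + d + c + e
    cov_mz = a + d + c               # MZ twins share all genetic effects
    cov_dz = 0.5 * a + 0.25 * d + c  # DZ twins share half of A, a quarter of D
    return 2 * (cov_mz - cov_dz) / total


# Illustrative (assumed) variance components, summing to 1.
A, D, C, E = 0.4, 0.1, 0.2, 0.3

narrow_sense = A / (A + D + C + E)        # additive only
broad_sense = (A + D) / (A + D + C + E)   # all genetic variance
estimate = expected_twin_estimate(A, D, C, E)

print(f"narrow-sense h2: {narrow_sense:.2f}")
print(f"broad-sense H2:  {broad_sense:.2f}")
print(f"twin estimate:   {estimate:.2f}")
```

Each unit of D contributes 2 x (D - D/4) = 1.5 D to the numerator, so in this example the twin estimate overshoots not only the narrow-sense heritability but the broad-sense one as well.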

Note that the assumption that the twins are representative of the general population is a big one, though in general people tend to go to great lengths to make sure the twin datasets are representative (e.g. the Swedish and Australian twin registries aim to sample all twins born in their respective countries), and you can compare between twin registries to find how robust they are. However, you should be careful extrapolating heritability between populations (e.g. extrapolating heritability from European twin studies to African populations), and you should definitely not be using heritability within populations to try and infer genetic differences between populations. The general rule is that heritability only applies to a particular population, with its own values of E, A and C, and there is no simple “heritability” that can be defined independent of these different variances.

For a nice review of heritability in humans and animals, that discusses many of these issues in more detail, see this paper from Peter Visscher and crew. Interestingly, the Bioscience Resource Project post cites this paper, which makes their mistake somewhat surprising.

The image above is of the Carlson Twins, two notably concordant (if you know what I mean) MZs. It is taken from their Wikipedia article.


17 Responses to “Estimating heritability using twins”

  • Thanks for the clear explanation about heritability. This site is becoming the default resource for clarity.

    Re the silly Bioscience Resource article, it was not just “death of GWAS” but death of genetics full stop. It was nicely done over at the OpenHelix blog

    I commented there:

    re the article, it’s too too long and it reminds me of the streets of Napoli, near where I live, as the world will know, they are full of rubbish.

    This short quote from the article says most of what needs saying: “genetic predispositions (i.e. causes)”

    Melanoma is a complex disease, it’s caused by UV from the sun isn’t it? But why don’t all those Africans and Australian aborigines get it then? Ah yes, it’s caused by those white skin genes, sorry. Hang on maybe it involves both, maybe white skin is a genetic predisposition (bravo Homer…)

    It’s a totally stupid article and the authors must surely know it. I believe that they do have an agenda rather than being totally stupid themselves. Maybe they are quite aware of the distortions and plain untruths but they justify it to themselves because they think genetic determinism is too strong. Indeed some scientists and many journalists are guilty of exaggerating the role of genes but that’s no excuse.

    In their resources section they link to the UK’s Genewatch – now they are gene deniers par excellence, have been for a decade or so. I guarantee that in almost all their comments on commercial genetics there will be the phrase “the marketing of fear”

    So NO their mission is not: “To provide the highest quality scientific information and analysis to enable a healthy food system and a healthy world.”

    It’s something else

  • Thanks very much for that and for the link to Visscher’s paper, which I am looking forward to reading.

  • Nice job, Luke. Regarding your closing point about whether twins are representative of populations, there’s at least some evidence to support this. Visscher has another nice paper where they estimate the heritability of height based on the relationship between sibling height correlation and actual genome-wide identity by descent. In other words, sibs are expected to share half their genomes, but there’s some variance around it, and you can estimate heritability by seeing if sibs who are genetically more related have more similar heights than those who are genetically less related. The estimate they come up with is pretty similar to the twin study estimates, which is reassuring.

  • @Jeff

    Kate just pointed me towards that paper. It is pretty damn awesome. Also, it answered a question I have wanted to know the answer to for a while: what is the range of relatedness between siblings (apparently, the 95% range is 43% – 57%).

    The paper is here:

  • @Keith

    In their resources section they link to the UK’s Genewatch – now they are gene deniers par excellence, have been for a decade or so. I guarantee that in almost all their comments on commercial genetics there will be the phrase “the marketing of fear”

    Certainly “marketing of fear” features on their page on Genetic Horoscopes – but only after a section that says:

    Genes are poor predictors of common complex diseases in most people and targeting a minority of ‘genetically susceptible’ individuals is usually a poor health strategy. The health impacts of smoking, poor diets, poverty and pollution are not limited to individuals with ‘bad genes’ and require population-based preventive strategies (such as providing better sports facilities, healthier school meals and banning fast food ads to children).

    Anyone here disagree with any of that?

    And their top topic today, as it is most times I’ve looked, is the overly inclusive UK Police DNA Database – see:

    So, while one might disagree with the strident tone of much/most of their output, there is good civil liberties work being done here too.

  • @Neil – yes quite right, not many would disagree with that statement. Many of the things that they say are sensible but they really ruin their own case with the dogmatic anti-genetics. They are anti pharmacogenetics as well, not just the DTC type stuff of 23andMe et al.

    Maybe that is the real pity, they have an opportunity to be useful but they miss it with their blanket anti- stance, a bit like the GAO missed its opportunity recently. They could make some attempt to separate the serious and generally not over-hyped market from the real scams and fraudsters. They could also appreciate that none of the serious companies are claiming that personal genetics should be used as general population based preventive strategies. Most agree that it’s “not ready for prime time” in the sense that prime time means universal reimbursed screening and inclusion as part of public health initiatives.

    By being so dogmatic their sensible positions lose a lot of their impact, it’s impossible to debate with them

  • Wow Neil, I just went back and read that horoscope stuff again. The first bullet point is, as you say, OK. The rest though… it’s mostly alarmist nonsense:

    – Promoting genetic testing wrongly implies that reducing pollution, smoking or obesity is important only for a minority of people.
    – Creating a genetic underclass
    – Undermining civil liberties
    – The patenting of life
    – Wasting resources and eroding trust
    – distorts the health research agenda

    and they want to stop me from knowing my own genotype just out of pure curiosity

    Anyone disagree with any of that?

  • John Holloway

    I loved the post. The best bit for me was the “standard GWAS disclaimer”. It is something I always feel I must highlight when speaking to clinical audiences, that genetic studies will primarily provide insight into disease biology, (as they are like a big natural experiment in humans on subtly altering gene expression/protein function), and prediction of disease may (or may not depending on your perspective) be another outcome (that again may or may not be of clinical use depending on disease and specificity/sensitivity).

    The big question is of course where that missing heritability is coming from……. must get back to the lab.

  • Ah – so this is the connection between Genewatch and twin studies – and which I’d like to re-iterate does not invalidate the civil liberties work:

    and it is close to full-blown DNA denialism:


    The results show that the potential for reducing the incidence of common diseases using environmental interventions targeted by genotype may be limited, except in special cases. The model also confirms that the importance of an individual’s genotype in determining their risk of complex diseases tends to be exaggerated by the classical twin studies method, owing to the ‘equal environments’ assumption and the assumption of no gene-environment interaction. In addition, if phenotypes are genetically robust, because of epistasis, a largely environmental explanation for shared sibling risk is plausible, even if the classical heritability is high. The results therefore highlight the possibility – previously rejected on the basis of twin study results – that inherited genetic variants are important in determining risk only for the relatively rare familial forms of diseases such as breast cancer. If so, genetic models of familial aggregation may be incorrect and the hunt for additional susceptibility genes could be largely fruitless.

    Written in 2006, just pre-GWAS.

  • @Luke

    I thought I’d reply to your post since it concerns our article The Great DNA Deficit

    Your reply is more or less the standard heritability defence and our contention is that it’s wrong. I say more or less because it is wrong of you to say you ‘measure’ environment. In a GWAS study of height (for example), one measures height (and relatedness). You don’t actually measure environment (what would you measure?). The estimate of environment results effectively from what’s left over. This matters for the following reason:

    You have nevertheless identified roughly the difference between us. We think the estimate of environment is too low (as explained in the article) and this is a reputable argument not resolvable by arguing we have made a ‘statistical mistake’. Presumably you think Francis Collins and the other Manolio et al. authors (Finding the missing heritability ) also are guilty of a ‘statistical mistake’ when they write that:

    Many explanations for this missing heritability have been suggested, including……. inadequate accounting for shared environment among relatives.

    The bottom line is no-one is finding the heritability and it’s beyond time to ask if it’s an artefact.

  • @JRLatham

    The heritability is defined as the proportion of total phenotypic variance in the population that is attributable to genetic variance. We measure the variance from the population, i.e. the variance between families, and the genetic variance from within families, exactly as outlined above. Your statement on your blog, “only the variation within each twin pair is actually being measured”, is a transparently clear mistake.

    Estimating heritability does not involve a direct measurement of environment at all, it only involves measuring the amount of variation in the population.

    There are many legitimate discussions that can be had about twin studies, and variance component modeling, such as the ones I discuss above. I expect that the authors of the paper you link to there are referring to some such discussion (probably how to properly model gene-environment interaction, mentioned above).

    It is, and indeed has been for a while, the right time to discuss twin studies and estimates of heritability, and whether they are overestimated. However, this is not what you are doing: all you have done is drawn a conclusion based on a simple misunderstanding of how one measures variation.

  • A very useful overview of heritability and some nice followup discussion points. Thanks for the tips on the Visscher papers. Eric Lander is arguing that gene-gene interactions (including epistasis) can account for substantial amounts of missing heritability with common variant causes. Probably true in part, but he seems also to have some bias against rare variants unnecessarily. Mutation rates are extremely high, we all carry several hundred private, heterozygous coding variants (i.e. each extremely rare) which could easily include high penetrance alleles affecting these common trait variables.

  • Alex Stoddard


    Thanks for a simple and clear explanation of heritability from twin studies.

    It does appear to me that the “missing heritability” in GWAS studies strikes at the hypothesis that common variants acting additively are the explanation for human disease risk. But I am coming to the conclusion that that hypothesis has always been naive. And conflating the failure of common variant additive effects with the failure of medical genetics and the study of heritability is specious at best.

    Am I correct in thinking that MZ twins share not only all their individual genetic variants (ignoring epigenetics for now) but also all the possible interactions between these variants?

    Would extensive gene-gene interaction be a valid hypothesis for the apparent discrepancy between GWAS effect sizes and the heritability measured in twin studies in your opinion?

  • @Alex

    Yeah, epistatic effects could account for some missing heritability, though it depends very much on how the heritability was estimated in the first place. Many heritability studies take into account a dominance term D, which includes both dominance at one locus and pairwise epistasis, and these studies often only include this in “broad sense heritability” (with strictly additive variance alone being called the “narrow sense heritability”). If pairwise epistatic interactions contribute a variance D to liability, they give a covariance D/4 to DZ twins (two probability 1/2 IBD sharing events). If you don’t account for it, therefore, each unit of dominance variance acts like 3/2 units of additive heritability (2*(D – D/4) = 3D/2), so a purely dominant heritability will be estimated as if it were an additive heritability 50% larger. We know that nothing that extreme exists, but smaller effects might lead to subtle inflations of heritability.

    So yes, I would guess that epistasis is both leading us to miss some GWAS heritability (through not fully getting the risk model), and to overestimate some twin study heritability. However, I am somewhat sceptical about there being massive amounts of heritability hiding in epistasis, as the known GWAS loci have stuck extremely closely to additivity so far, as I wrote about at ASHG this year:

  • I hope this back-and-forth does not devolve too much further. It’s really great to have a debate about the notion of heritability, which does no end of mischief precisely because of the forced choice nature of the statistics, which assume genetic and environmental factors add linearly. We of course know they do not. PKU, the original human genetic disease, is highly malleable in its *effects* depending on diet. Likewise sickle cell. The traits are inherited, but the effects of those inherited factors are wildly different depending on how people live, what treatment they get, what they eat, etc. And those are the simple cases we can explain. Most complex disorders are going to involve many gene-environment-behavior-social factor interactions. The point is that genetic methods can get at the inherited factors, and that can be a good thing, esp. if having an accurate diagnosis then influences the treatment plan. There’s ‘dark matter’ in the genetics, and we should try to find it. The chief limitation of GWAS is not a flaw so much as just that, a limitation, because current GWAS is most often now based on shared common variants that can be put on SNP chips. That should change with full-genome sequence analysis, which will surely turn up additional clues about the inherited components (and somatic mutations by doing within-person comparisons across time or among different tissues).

    The *really* hard work will be figuring out the interactions with behavior, environment (the residual), and social factors that affect health (and biology). But we don’t have to divide into camps about which matters more, genes or environment, when in most diseases in most conditions, both matter.

    The point is to use whatever tools we have to figure out how. The tools on the genetics side are going to advance faster and drop in cost faster than the other tools, which suggests that part of the science will move faster, but not that it is more complete or more important.

    For those who want to look at the roots of this controversy, Fisher’s original paper is really interesting reading, and you can see how he thought about “environment” as a residual, not with any specificity, and not really as part of a causal pathway; he was mainly interested in dissecting out the genetic factors from everything else, assuming they were the bottom layer of a causal pyramid, and he called everything else “environmental.” His article is available online:
    Fisher, R. A. (1918). “The correlation between relatives on the supposition of Mendelian inheritance.” Transactions of the Royal Society of Edinburgh, 52, 399–433. Retrieved from

  • How would we calculate combined heritability (BS) in two environments, with 3 replications in each environment? I mean, what would be the formula/equation? I am an agriculturist.

  • I’ll right away snatch your rss as I can’t in finding your e-mail subscription link or e-newsletter service. Do you’ve any? Kindly let me understand so that I could subscribe. Thanks.

