Out in Nature this week is a paper by three Genomes Unzipped authors reporting 71 new genetic associations with inflammatory bowel disease (IBD). This breaks the record for the largest number of associations for any common disease, and includes many new and interesting biological insights that you should all go and read about in the paper itself (pay-to-access I’m afraid) or on the Sanger Institute’s website.
One thing that we did not discuss in the paper was genetic prediction of IBD (i.e. using the risk variants we have discovered to predict who will or will not develop the disease). In this post I want to outline some of the situations in which we have considered using genetic risk prediction of IBD, and discuss whether any of them would actually work in practice.
Tests given to healthy people
There is often a lag of many years from first onset of IBD symptoms to diagnosis, and up to 20% of cases go undiagnosed in older patients. Now, we cannot diagnose IBD from a patient’s genome, and in practice we never will be able to. However, can we use a genetic test for IBD variants to help improve diagnosis rates and decrease lag?
We could imagine genotyping healthy people and using our new IBD variants to find a “high-risk” group that we can monitor more closely. How well would this work? Given the variants reported in the paper, the answer is “not very well”. Suppose we take people in the top 0.05% of IBD risk. Even in this high-risk group only 1 in 10 people will get IBD. Even worse, 99% of real IBD patients WON’T be in this group, and so would be missed by the test!
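The two figures quoted above are tied together by simple arithmetic, and a rough sketch makes the link explicit. The function name below is ours, and the prevalence it backs out is only what the quoted numbers imply; the post itself does not state a prevalence.

```python
def implied_prevalence(group_fraction, risk_in_group, share_of_cases_in_group):
    """Population prevalence implied by a high-risk group's size and yield.

    Cases inside the group, as a share of everyone, equal
    group_fraction * risk_in_group. That same quantity must equal
    share_of_cases_in_group * prevalence, so we solve for prevalence.
    """
    return group_fraction * risk_in_group / share_of_cases_in_group


# Figures quoted in the text: top 0.05% of risk, 1 in 10 of them affected,
# and 99% of patients outside the group (so 1% inside it).
prevalence = implied_prevalence(0.0005, 0.10, 0.01)
print(f"implied prevalence: {prevalence:.2%}")
```

Under these numbers the implied prevalence is about 0.5%, which is why a group that small can only ever contain a tiny share of all patients, however enriched it is.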
How about if we instead introduced a check on a patient’s genome that can be queried by a GP when a patient presents with abdominal pain (one symptom of IBD)? We will assume (plucked more or less out of the air) that 10% of patients with abdominal pain have undiagnosed IBD. How could a genetic test help GPs and patients? We could give some of these patients a confident “nothing to worry about” result (<1% chance of developing IBD), but only 10% of patients would be told this. We could give another 7% of patients a “something to worry about” result (1 in 3 chance of developing IBD). Still, 83% of patients will get an inconclusive result, and who knows what any given GP would do with that information.
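As a back-of-the-envelope check on this scenario, the sketch below works out what risk is left for the inconclusive group, and how strong a test result has to be to move a 10% prior to each of the quoted posteriors. The function names are ours, and the “<1%” result is treated as exactly 1%, which is an assumption.

```python
def inconclusive_risk(prior, groups):
    """Average risk left in the patients who get no clear result.

    `groups` is a list of (fraction_of_patients, risk_in_group) pairs for
    the patients who do get a result. The risks across all groups must
    still average out to `prior`.
    """
    classified = sum(frac for frac, _ in groups)
    cases_classified = sum(frac * risk for frac, risk in groups)
    return (prior - cases_classified) / (1.0 - classified)


def likelihood_ratio(prior, posterior):
    """Likelihood ratio a result needs in order to move prior to posterior."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = posterior / (1.0 - posterior)
    return posterior_odds / prior_odds


prior = 0.10                            # assumed in the text
groups = [(0.10, 0.01), (0.07, 1 / 3)]  # reassured group, worried group
rest = inconclusive_risk(prior, groups)

print(f"risk in the inconclusive 83%: {rest:.1%}")
print(f"LR needed for the 1-in-3 result: {likelihood_ratio(prior, 1 / 3):.2f}")
print(f"LR needed for the <1% result: {likelihood_ratio(prior, 0.01):.3f}")
```

With these inputs the inconclusive group ends up at roughly a 9% risk, barely moved from the 10% the GP started with, which is another way of saying the test tells most patients nothing new.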
Tests given to IBD patients
Instead of testing healthy people, how about testing current IBD patients? First up, one in ten IBD patients were flagged up only as possible IBD cases when they first presented with symptoms: but for each such “possible” IBD patient there is a second “possible” patient who will turn out not to have the disease. Can our genetic test help here? Well, in practice only 1 in 50 of these patients could be told they probably have the disease (>95% chance), and a further 1 in 50 could be said to likely never get the disease. Another 40% could be classified either way with 80% certainty. So one in 25 patients would get a high confidence prediction (likely altering their treatment), and just under a half would get a balance-of-odds prediction that may be hard to interpret.
Another 5% of IBD diagnoses are overturned within the next five years. Could we predict who these people are using genetics? Looking at this, it seems the best we could do is flag up a “possibly not IBD” group containing the 10% of IBD cases with lowest genetic risk, and prioritise them for further investigation. But this would involve investigating 4 real IBD patients for every false IBD case found, and would still only catch a third of the false IBD diagnoses.
One last case: There are two major forms of IBD, Crohn’s disease (CD) and ulcerative colitis (UC). An effective surgical cure for severe UC involves complete removal of the colon. However, around 5% of UC cases undergoing this treatment turn out to be misdiagnosed CD cases: this treatment does not cure CD, and so in retrospect should not have been given to these patients. Can genetic risk prediction help to prevent this from happening? The answer, again, is ambiguous. If we stop the 10% of these operations for patients at highest risk of Crohn’s disease, we could prevent 28% of these cases. But for every 10 high-risk patients that we halted surgery for, 8 would have likely responded well to the treatment that we denied them. Is this a price worth paying?
Prediction in the real world
Those of us who work with both statistics and patient data have a dream that is sometimes called the Bayesian differential diagnosis. In this scheme a mighty algorithm would take in a patient’s symptoms and compare them against all the entries in a database of diseases. This algorithm would then produce a probability for each disease – 26% IBD, 35% bacterial gastroenteritis, 0.1% diabetes, and so forth. It would then suggest tests that can distinguish between the high probability diseases, and keep doing this until one disease was a clear winner. Think Numb3rs meets House meets Deal Or No Deal.
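To make the idea concrete, here is a toy sketch of the updating loop. Every number, disease list, and finding name is hypothetical and chosen only for illustration, and the sketch applies findings in a fixed order rather than choosing the most informative next test, which a real system would need to do.

```python
def update(priors, likelihoods):
    """One Bayesian update: posterior is proportional to prior * likelihood."""
    unnormalised = {d: priors[d] * likelihoods[d] for d in priors}
    total = sum(unnormalised.values())
    return {d: p / total for d, p in unnormalised.items()}


def diagnose(priors, findings, likelihood_table, stop_at=0.9):
    """Apply findings in turn until one disease reaches `stop_at`."""
    beliefs = dict(priors)
    for finding in findings:
        beliefs = update(beliefs, likelihood_table[finding])
        leader = max(beliefs, key=beliefs.get)
        if beliefs[leader] >= stop_at:
            break
    return beliefs


# Hypothetical priors; in the scheme described here, a genetic risk score
# would be one of the things that sets these.
priors = {"IBD": 0.05, "gastroenteritis": 0.60, "other": 0.35}

# Hypothetical P(finding | disease) for two made-up findings.
likelihood_table = {
    "finding_1": {"IBD": 0.80, "gastroenteritis": 0.05, "other": 0.40},
    "finding_2": {"IBD": 0.90, "gastroenteritis": 0.30, "other": 0.10},
}

beliefs = diagnose(priors, ["finding_1", "finding_2"], likelihood_table)
for disease, prob in sorted(beliefs.items(), key=lambda kv: -kv[1]):
    print(f"{disease}: {prob:.0%}")
```

The output is a probability for every disease rather than a single yes or no, so a shift in the prior feeds straight through to the final answer.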
In this system genetic risk prediction could be easily slotted in, setting our prior expectations for all diseases based on the patient’s genome. However, this is not how our medical system works. Instead, specific tests must have specific results that lead to specific actions that alter patient care. A “2-fold increased chance of developing IBD” result does not help here unless it comes with a statement of what to do about it.
In a vacuum IBD can be predicted pretty well: genetics predicts IBD better than BMI predicts diabetes, for instance. However, the challenge is finding a situation where we have good enough prediction, used in a clever enough way, to actually improve patient care. I think we are still waiting to find the “killer app” of IBD prediction.
What a great post! Thank you for laying out so clearly the challenges involved in transferring your research findings into clinical use. It is a great example of a common problem, and one that many researchers (and journalists) simply don’t address.
Hi Luke,
Thanks for the interesting post. However, I do wonder how you assessed your predictive power. It is common knowledge in the field that individual variants explain very little, so the common thought is that one would have to create a complex multi-variant predictive model that takes into account all variants, perhaps accompanied by other variables (environment etc.). Did you try to build a multi-SNP predictive model to classify cases from controls? I would love to hear more about that… Thanks!
@Ohad
Very good point – you are right that I am just giving results, without any indication of how I generated them.
I am using a multi-SNP risk score that is additive on the probit (or liability) scale, using the effect sizes given in Supp Table 2 of the paper. I then make a continuous probit-normal score that closely approximates this multi-SNP risk score, and some minor mathematical trickery can then give a predicted distribution of this risk score in cases and controls (from which all the above numbers are calculated). I haven’t actually published exactly what we do here, but it is very similar to what is described in this paper:
http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000864
I assume that SNPs are additive and have no dominance effect, and that they each act independently, but we have tested both these assumptions and they are close enough to true to not alter the results much.
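For readers who want to play with this kind of calculation, here is a minimal sketch of a liability-threshold model with a normally distributed genetic score. It is not the code used for the numbers in the post: the function names, the midpoint integration, and the example values for prevalence and variance explained are all assumptions made for illustration.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: total liability has variance 1


def risk_given_score(g, prevalence, var_explained):
    """P(disease | genetic score g) under a liability-threshold model.

    Liability = g + residual, with residual variance 1 - var_explained.
    Disease occurs when liability exceeds the threshold set by prevalence.
    """
    threshold = STD.inv_cdf(1.0 - prevalence)
    resid_sd = (1.0 - var_explained) ** 0.5
    return 1.0 - STD.cdf((threshold - g) / resid_sd)


def top_group_summary(top_fraction, prevalence, var_explained, steps=20000):
    """Return (risk within the top group, share of all cases it captures)."""
    score = NormalDist(0.0, var_explained ** 0.5)
    cutoff = score.inv_cdf(1.0 - top_fraction)
    upper = 8.0 * score.stdev  # far enough out to stand in for infinity
    width = (upper - cutoff) / steps
    cases_in_group = 0.0
    for i in range(steps):  # midpoint rule over the top tail of the score
        g = cutoff + (i + 0.5) * width
        cases_in_group += (
            score.pdf(g) * risk_given_score(g, prevalence, var_explained) * width
        )
    return cases_in_group / top_fraction, cases_in_group / prevalence


# Hypothetical inputs: 0.5% prevalence, score explaining 10% of liability.
ppv, share = top_group_summary(0.0005, 0.005, 0.10)
print(f"risk in top 0.05%: {ppv:.1%}, share of cases captured: {share:.2%}")
```

Changing `var_explained` shows how quickly (or slowly) the high-risk group becomes useful as the score improves.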
The results above don’t actually include the main classical risk factors (smoking, appendectomy and family history), and so are slight underestimates. However I have run some of these scenarios with a risk score that includes these and, while it improves things a bit, it doesn’t change the overall conclusions much.
I am also assuming a low-cost, high-throughput genotyping test (~£8/patient) that only includes ~200 SNPs – if you spend more (~£25/patient) and assay tens of thousands of SNPs you can almost certainly do better, though I haven’t run the numbers on this.
A fantastic post – this may well replace my slide using diabetes risk scores to illustrate how GWAS common risk SNPs do not predict disease with anything approaching a high degree of specificity and sensitivity. I am also not surprised that when you add in clinical / exposure risk factors the genetics adds little on top. I guess the real question is whether there is anything in the genome that will give a much better risk estimate, even if for a small subgroup of patients. I have my own thoughts but I guess we will have to wait and see…
@John
It’s not true that genetics doesn’t add much on top of classical risk factors – unlike in diabetes, genetics predicts IBD before onset much better than anything else you can measure in the clinic. Adding in non-genetic risk factors on top of the genetic risk score gives very little increase in predictive power.
I would go for both examples. Diabetes shows a case where GWAS can’t improve classical risk scores much*, and IBD shows a case where it can improve it a lot, but is still very hard to make useful.
*though this hasn’t been tested with the most up-to-date genetic risk scores AFAIK
Thank you for a great paper!
Do you think the proposition that non-genetic risk factors don’t usually add much to predictive power is true for all diseases, or does it apply to IBD only?
Luke, amazing paper. I was wondering whether it would be possible to apply to IBD what Peter Visscher has done with height and personality traits, that is, to estimate how much variance is explained by the common SNPs used. It appears that you guys have the combined genotypes rather than the summary statistics, though the binary trait might make it harder.
Also, on your comment on prediction in the real world, maybe it’s time to change how medicine is practiced?
Ultimately, it boils down to the fact that only a small percentage of heritability is explained by the 71 SNPs. For diseases where the majority of heritability is explained, e.g. Alzheimer’s disease, the predictive power is very good.
Further fine mapping can improve the percentage explained. One paper suggested that doing so can double the percentage.
I also read another paper that claims that, depending on some parameters, explaining around 10% of heritability is enough to achieve an AUC of 0.7 (aka the clinical utility threshold). So not all hope is lost: your research is probably just one fine mapping away from being clinically useful.
Keep up the good work!
To Sergey, IBD is an autoimmune disease. While the exact cause of this class of diseases is unknown, the current consensus is that the immune system malfunctions after contact with an allergen and/or bacteria/virus. As a result, autoimmune diseases have very little to do with lifestyle choices. Therefore the non-genetic risk factors usually measured by doctors don’t play much of a role here.
On the other hand, autoimmune diseases all seem to have high heritability. The GWASes conducted so far point to immune system related genes like the HLA genes, CTLA4, TNF-whatever, etc. Therefore they are very good candidates for GWASes.
@JeffH
Visscher and crew have already done that for IBD:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3059431/
Variance explained by common SNPs is around 20-25%. We could use that to improve prediction quite a bit actually, but no-one has looked into that in detail AFAIK.
@GftE
The AUC for predicting IBD is much higher than 0.7 (it is around 0.8), and we can explain substantially more than 10% of heritability – and our 163 IBD risk loci explain more heritability together than ApoE does in LOAD. The idea that an AUC of 0.7 is a “clinical utility threshold” doesn’t really hold up – as you can see in this post, utility depends very strongly on exactly what you are trying to use it for.
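For anyone who wants to see how AUC relates to variance explained, here is a rough numerical sketch under a liability-threshold model with a normal genetic score. It is an illustration only, not the calculation behind the figures quoted in this thread; the function name and the input values are assumptions.

```python
from statistics import NormalDist


def auc_liability(prevalence, var_explained, steps=4000):
    """AUC of a normal genetic score under a liability-threshold model.

    AUC is the probability that a random case has a higher score than a
    random control, computed here on a grid of score values.
    """
    std = NormalDist()
    score = NormalDist(0.0, var_explained ** 0.5)
    threshold = std.inv_cdf(1.0 - prevalence)
    resid_sd = (1.0 - var_explained) ** 0.5
    lo, hi = -8.0 * score.stdev, 8.0 * score.stdev
    width = (hi - lo) / steps

    case_w, ctrl_w = [], []
    for i in range(steps):
        g = lo + (i + 0.5) * width
        risk = 1.0 - std.cdf((threshold - g) / resid_sd)
        case_w.append(score.pdf(g) * risk)          # score density in cases
        ctrl_w.append(score.pdf(g) * (1.0 - risk))  # score density in controls

    auc, ctrl_below = 0.0, 0.0
    for c, d in zip(case_w, ctrl_w):
        auc += c * (ctrl_below + 0.5 * d)  # ties within a bin count half
        ctrl_below += d
    return auc / (sum(case_w) * sum(ctrl_w))


# Hypothetical inputs: 0.5% prevalence, a range of variance explained.
for v in (0.05, 0.10, 0.20):
    print(f"variance explained {v:.0%}: AUC {auc_liability(0.005, v):.2f}")
```

The same AUC can correspond to very different usefulness depending on the prevalence and on which tail of the score a decision is made in, which is the point made above about thresholds.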
@Sergey
That is particularly true for IBD, where environmental risk factors have been very hard to pin down with enough certainty/detail to do prediction. Many other diseases (including some other diseases of immunity) have powerful non-genetic risk factors that could be easily measured in the clinic and would probably add substantially to a risk score.
Luke,
It seems to me that “Bayesian differential diagnosis” is in fact what doctors need to do every day, maybe without the quantitative rigor? As you point out, there is no perfect test for IBD, so doctors must make decisions based on a preponderance of evidence, using reported symptoms, physical examination, lab tests, imaging, etc. If doctors are already doing this, shouldn’t they be able to incorporate one more piece of information into their (mental) models? And shouldn’t we evaluate the value of a genetic risk estimate on the same playing field as those other information sources, which also have limitations?
(I don’t think it is that simple — I don’t know how good doctors really are at differential diagnosis, when they need to combine multiple conflicting pieces of evidence like this. I suspect that in many cases the best they can do is to recognize when there is substantial uncertainty, hedge their bets, and choose additional tests that are likely to reduce that uncertainty. There could also be situations where having more information could make diagnoses worse, if doctors are not familiar with the information.)