DNA testing companies have sprouted up all over the world. They often market themselves as the most up-to-date service available. The problem is that new variants in our DNA are being discovered at a rapid rate.
Say you’ve got your 23andMe results and have enjoyed reading about your variants. Then, one day, you read about a newly reported variant that has been linked to a particular trait.
You have access to your raw data from 23andMe, so you can check whether the variant is present. However, the chip did not test that position, so the raw data cannot tell you whether you carry the variant.
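Checking the raw data for a variant can be sketched in a few lines of Python. This is a minimal sketch, assuming the raw data is a tab-separated text file with four columns (rsid, chromosome, position, genotype) and comment lines starting with `#`. The rsIDs and genotypes below are made up for illustration, and `load_genotypes` and `lookup` are hypothetical helper names.

```python
import io


def load_genotypes(handle):
    """Read raw-data lines into a dict mapping rsID to genotype.

    Assumes tab-separated columns: rsid, chromosome, position, genotype.
    Lines starting with '#' are treated as comments and skipped.
    """
    genotypes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rsid, _chromosome, _position, genotype = line.split("\t")
        genotypes[rsid] = genotype
    return genotypes


def lookup(genotypes, rsid):
    """Return the genotype for rsid, or None if the chip did not test it."""
    return genotypes.get(rsid)


# Made-up example data standing in for a real raw data file.
sample_file = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "rs0000002\t1\t2000\tCC\n"
)
genotypes = load_genotypes(sample_file)

print(lookup(genotypes, "rs0000001"))  # a variant that is on the chip
print(lookup(genotypes, "rs9999999"))  # a variant that is not on the chip
```

A result of `None` is the situation described above: the variant is simply not in the file, which is where imputation comes in.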
At this stage, you may be thinking that the only way to detect that variant is to get another DNA test. Well, you’d be wrong.
A process called imputation can help us predict genotypes that are not identified by a test.
How Does Imputation Work?
Imputation requires two ingredients. The first is your genotype data from a genotyping chip. This is what you get in your raw data.
The second part is a reference panel. This comes from large-scale genome projects such as the 1000 Genomes Project or the International HapMap Project.
Depending on the project, the reference panel has data from hundreds or thousands of people, and the largest panels include tens of thousands. With imputation, the software looks for reference samples that share stretches of SNPs with you.
It is likely that somewhere way back you share a common ancestor with such a reference sample. Because the reference sample has been tested at many more positions than your chip covers, the program can look up the desired variant in the reference sample and use it to predict what your result would be.
The process of imputing doesn’t rely on a single match with the reference sample. Instead, it creates a bit of a mosaic using pieces of reference data to match your results as closely as possible.
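The mosaic idea can be illustrated with a toy example. This is a simplified sketch, not how any of the real programs are implemented: it assumes a single phased haplotype coded as 0/1 alleles, a tiny made-up reference panel, and arbitrary penalty values (`switch_cost`, `mismatch_cost`) chosen for illustration. It finds the cheapest path through the reference haplotypes, then copies the alleles on that path into the untyped positions (marked `None`).

```python
def impute_haplotype(target, reference, switch_cost=2.0, mismatch_cost=3.0):
    """Fill None entries in target using a best-fitting mosaic of reference rows.

    target:    list of alleles, with None where the chip has no data.
    reference: list of fully typed haplotypes, each the same length as target.
    Returns (imputed haplotype, index of the reference row copied at each site).
    """
    n_sites = len(target)
    n_refs = len(reference)

    def emit(k, i):
        # Untyped sites cost nothing; typed sites are penalised on mismatch.
        if target[i] is None or target[i] == reference[k][i]:
            return 0.0
        return mismatch_cost

    cost = [emit(k, 0) for k in range(n_refs)]
    back = []  # back[i - 1][k] = best previous row when copying row k at site i
    for i in range(1, n_sites):
        best_prev = min(range(n_refs), key=lambda k: cost[k])
        new_cost, pointers = [], []
        for k in range(n_refs):
            stay = cost[k]
            switch = cost[best_prev] + switch_cost
            if stay <= switch:
                prev, base = k, stay
            else:
                prev, base = best_prev, switch
            new_cost.append(base + emit(k, i))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final row to recover the mosaic path.
    k = min(range(n_refs), key=lambda j: cost[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()

    imputed = [
        reference[path[i]][i] if target[i] is None else target[i]
        for i in range(n_sites)
    ]
    return imputed, path


# Made-up panel: two reference haplotypes typed at all eight sites.
reference = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
# The chip typed six sites; sites 1 and 5 are missing.
target = [0, None, 0, 0, 1, None, 1, 1]

imputed, path = impute_haplotype(target, reference)
print(imputed)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(path)     # [0, 0, 0, 0, 1, 1, 1, 1]
```

Here the first half of the target is copied from the first reference haplotype and the second half from the second, so each missing site is filled from whichever reference segment fits its neighbours. Real imputation software works with both copies of each chromosome and reports probabilities rather than a single best guess.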
Imputation Programs
Several imputation programs are available that can make predictions based on your DNA.
Each program has its own features, but they generally give broadly similar results.
Some programs to check out include:
- MaCH
- BEAGLE
- fastPHASE
- IMPUTE2
Why is Imputation Useful?
Well, the major benefit of imputation is that it lets you keep examining your DNA and make predictions based on the most recent discoveries.
23andMe does offer update services; however, they charge a fee for this. With free imputation software like the programs we’ve suggested above, you can stay up to date.
Imputation software is only useful if it is accurate. You will need to research the accuracy of your chosen software. We have used IMPUTE2 and can speak to its accuracy.
In our tests, the software came back with 97.58% accuracy. This is impressive, but even such a high level of accuracy still means that roughly 1 in 40 imputed genotypes will be wrong. It’s an unfortunate fact of the system.
The other thing to be aware of is that imputation is much less accurate for populations that are poorly represented in reference panels. If there are fewer participants of your ancestry in these large reference populations, the software will have a harder time predicting the missing genomic data.
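The arithmetic behind that figure is easy to check. The sketch below assumes accuracy is measured as simple concordance, meaning the fraction of imputed genotypes that match known genotypes at positions that were hidden from the software. That is one common way to test imputation, not necessarily the exact method behind the number above. The function names and the example genotypes are hypothetical.

```python
def concordance(truth, imputed):
    """Fraction of imputed genotypes that exactly match the known genotypes."""
    if not truth or len(truth) != len(imputed):
        raise ValueError("truth and imputed must be non-empty and equal in length")
    matches = sum(t == i for t, i in zip(truth, imputed))
    return matches / len(truth)


def one_in_n(accuracy):
    """Convert an accuracy such as 0.9758 into an error rate of '1 in N'."""
    return 1.0 / (1.0 - accuracy)


# Made-up example: four hidden genotypes, one imputed incorrectly.
truth = ["AG", "CC", "TT", "AA"]
guess = ["AG", "CC", "CT", "AA"]
print(concordance(truth, guess))  # 0.75

# The accuracy reported in the text.
print(round(one_in_n(0.9758)))  # 41
```

An accuracy of 97.58% leaves an error rate of 2.42%, which works out to about 1 wrong genotype in every 41, or roughly 1 in 40.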