This is a guest post by Peter Cheng and Eliana Hechter from the University of California, Berkeley.
Suppose that you’ve had your DNA genotyped by 23andMe or some other DTC genetic testing company. Then an article shows up in your morning newspaper or journal (like this one) and suddenly there’s an additional variant you want to know about. You check your raw genotypes file to see if the variant is present on the chip, but it isn’t! So what next? [Note: the most recent 23andMe chip does include this variant, although older versions of their chip do not.]
Genotype imputation is a process used for predicting, or “imputing”, genotypes that are not assayed by a genotyping chip. The process compares the genotyped data from a chip (e.g. your 23andMe results) with a reference panel of genomes (supplied by big genome projects like the 1000 Genomes or HapMap projects) in order to make predictions about variants that aren’t on the chip. If you want a technical review of imputation (and the program IMPUTE in particular), we recommend Marchini & Howie’s 2010 Nature Reviews Genetics article. However, the following figure provides an intuitive understanding of the process.
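The core intuition can also be sketched in a few lines of Python. This is a deliberately simplified toy, not how IMPUTE or any production tool works (those use probabilistic haplotype models on unphased diploid data). It assumes a single already-phased haplotype, a tiny made-up reference panel, and a hypothetical helper named `impute_allele`: find the reference haplotypes that best match you at the typed sites, then read off what they carry at the untyped site.

```python
from collections import Counter

def impute_allele(observed, panel, target):
    """Toy imputation for one haplotype.

    observed: dict mapping site index -> allele at sites typed on the chip
    panel:    list of reference haplotypes (strings, one character per site)
    target:   index of the untyped site to predict
    Returns a dict mapping allele -> estimated probability.
    """
    def matches(hap):
        # Number of typed sites at which this reference haplotype agrees.
        return sum(hap[i] == allele for i, allele in observed.items())

    best = max(matches(hap) for hap in panel)
    closest = [hap for hap in panel if matches(hap) == best]

    # Alleles carried at the target site by the best-matching haplotypes.
    counts = Counter(hap[target] for hap in closest)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Made-up reference panel of four haplotypes over five sites.
panel = ["ACGTA", "ACTTA", "GCGTC", "GAGCC"]

# Sites 0 and 1 are on the chip; site 4 is not.
prediction = impute_allele({0: "G", 1: "C"}, panel, target=4)
```

Real imputation software weighs all reference haplotypes probabilistically and accounts for recombination, rather than taking only the exact best matches as this sketch does.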
Continue reading ‘Learning more from your 23andMe results with Imputation’
A paper out in PLoS Genetics this week takes a step towards using genome-wide association data to reconstruct functional pathways. Using protein-protein interaction data and tissue-specific expression data, the authors reconstruct biochemical pathways that underlie various diseases by looking for variants that interact with genes in GWAS regions. These networks can then tell us which systems are disrupted by GWAS variants as a whole, as well as identify potential drug targets. The figure to the right shows the network constructed for Crohn’s disease; large colored circles are genes in GWAS loci, and small grey circles are other genes in the network they constructed. As an interesting side note, the GWAS variants were taken from a 2008 study; since then, we have published a new meta-analysis, which implicated a lot of new regions. Ten genes in these regions, marked as small red circles on the figure, were also in the disease network. [LJ]
23andMe customers will be interested in a neat little Firefox plug-in that allows them to view their own genotypes for any 23andMe SNP mentioned on a web page. You can download the plug-in here (you’ll need to have an up-to-date version of Firefox), and I have a brief review of the tool here. [DM]
Continue reading ‘From GWAS to pathways, the consequences of DTC genetics and screening by sequencing’
In a previous post I discussed copy number variation, a form of genetic variation not broadly reported by DTC companies. In today’s post I provide a very simple program that allows one to identify potential deletions on the basis of high density SNP genotypes from a parent-offspring trio, and report on the results of running this program on data from my own family.
The program uses an approach that I applied as a graduate student to mine deletions from the very first release of data from the International HapMap Project in 2004. The idea, explained in my last post, is to look for stretches of homozygous genotypes interspersed with Mendelian errors, which might indicate the transmission of a large deletion. Let’s be clear: this is a simple analysis that most programmers and computational biologists would find straightforward to implement. It is probably a good practice problem for graduate students and would-be DIY personal genomicists.
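For readers who want to try the practice problem, here is a minimal Python sketch of the idea. It is not the program described in this post. It assumes genotypes are two-letter calls such as "AA" or "AG" (anything else, including the "--" no-call, is skipped), that SNPs from one chromosome are already sorted by position, and that the function names and the `min_errors` threshold are hypothetical choices for illustration. A run is any stretch in which the child and the chosen parent are both homozygous; a Mendelian error within it is a SNP where they are homozygous for different alleles.

```python
def is_called(genotype):
    # Accept only two-letter nucleotide calls; skips "--" and indel codes.
    return len(genotype) == 2 and all(base in "ACGT" for base in genotype)

def is_homozygous(genotype):
    return genotype[0] == genotype[1]

def find_candidate_deletions(snps, parent="father", min_errors=2):
    """Scan one chromosome for candidate transmitted deletions.

    snps: list of (position, child, father, mother) tuples sorted by position.
    Returns a list of (first_error_pos, last_error_pos, n_errors) tuples.
    """
    column = {"father": 2, "mother": 3}[parent]
    candidates = []
    first_err = last_err = None
    n_errors = 0

    def flush():
        # Close the current homozygous run, keeping it if it has enough errors.
        nonlocal first_err, last_err, n_errors
        if n_errors >= min_errors:
            candidates.append((first_err, last_err, n_errors))
        first_err = last_err = None
        n_errors = 0

    for row in snps:
        pos, child, par = row[0], row[1], row[column]
        if not (is_called(child) and is_called(par)):
            continue  # a no-call neither extends nor breaks the run
        if is_homozygous(child) and is_homozygous(par):
            if child[0] != par[0]:
                # Child shows no allele from this parent: a Mendelian error.
                if first_err is None:
                    first_err = pos
                last_err = pos
                n_errors += 1
        else:
            flush()  # a heterozygous call ends the run
    flush()
    return candidates

# Made-up trio data: (position, child, father, mother).
trio = [
    (100, "AG", "AA", "AG"),
    (200, "GG", "AA", "GG"),
    (300, "CC", "CC", "CC"),
    (400, "TT", "CC", "TT"),
    (500, "AG", "AG", "AA"),
]
paternal = find_candidate_deletions(trio, parent="father")
maternal = find_candidate_deletions(trio, parent="mother")
```

Requiring at least two errors per run is one way to avoid flagging isolated genotyping errors. Any region reported this way is only a candidate and would need confirmation with another method.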
I obtained 23andMe data from both my mom and dad, and, with their consent, ran the three of us through the program. I was mildly surprised to find only two potential deletions; I had previously speculated that one would find 5-10 deletions per trio with the 550K platform used by 23andMe.
Continue reading ‘Finding the holes in our genomes’
I’m guessing everyone reading this post is familiar with recent research from Svante Pääbo’s group indicating that modern humans interbred with Neanderthals during their long co-existence in Eurasia between 30,000 and 80,000 years ago. According to the researchers’ calculations, somewhere between 1 and 4% of the DNA in modern non-African humans is derived from these interbreeding events – in other words, many of us are walking around with Neanderthal DNA sitting in our genomes.
So how much of your genome is Neanderthal? Over at The Genetic Genealogist, Blaine Bettinger takes a look at the options currently available to those interested in digging for Neanderthal ancestry in their own genetic backyard. Blaine notes that one company is already offering a test labelled as looking for Neanderthal ancestry based on a limited number of variable (microsatellite) markers. However, this test doesn’t actually look directly for putative Neanderthal-derived variants; instead, it (rather quaintly) tests for “strong matches between your DNA fingerprint […] and populations identified as “archaic,” that is, whose composition retains the earliest earmarks of out‐of‐Africa genetics.” This is a very rough approach to the problem, to put it mildly.
Added in edit 15/07/10: John Hawks has a justifiably scathing review of the test on his blog; I’ve removed links to the company from this post to avoid giving them extra publicity.
People who have already had their genomes scanned by a company like 23andMe theoretically have sufficient data to perform a much higher-resolution analysis. Sadly, there’s not yet any readily available algorithm out there for doing this, despite there being (as Blaine notes) substantial interest in such a test among the 23andMe community.
Seems like there’s some real scope for a DIY genomics tool here. Is anyone out there already working on this? Let us know in the comments.