Am I partly Jewish? Testing ancestry hypotheses with 23andMe data

I agreed to make my 23andMe genotyping results publicly available as part of GNZ without a moment’s hesitation. This is in part because I knew the results were actually a bit dull (in a good way, I suppose) – I’m not at vastly increased or decreased risk for any diseases (based on research so far), and I was unsurprised to find out that I have blue eyes. I was also unsurprised that 23andMe identified me as most likely of north European ancestry.

Several hours after we released our data, however, I was pointed to a post where Dienekes Pontikos wrote about the results of running all our data through his ancestry prediction program. While just about everyone was quite confidently predicted to be almost entirely of northwestern European descent, this analysis gave me a point estimate of 20% Ashkenazi Jewish ancestry. Within hours, several people had asked me about this, and I had no real response. So I decided to take a look at the data myself; some basic analyses are below.

Dude, where are my copy number variants?

The genome scans currently offered by major personal genomics companies provide information about only one kind of genetic variation: single nucleotide polymorphisms, or SNPs. However, SNPs are just one end of a size spectrum of variation, reaching all the way up to large duplications or deletions of DNA known as copy number variants (CNVs). Over the last decade we have learned that CNVs are a surprisingly common form of variation in humans, and they span a formidable chunk of the genome. While there are about 3M-3.5M bases of variation due to SNPs within an individual genome (in, say, a typical person of European descent), there are at least 50M-60M variable bases due to CNVs.
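
To put those two figures side by side, here is a quick back-of-the-envelope calculation in Python. The SNP and CNV base counts are the ones quoted above; the genome length of roughly 3.1 billion bases is my own assumption for illustration, and the function names are just made up for this sketch.

```python
# Figures quoted in the paragraph above (variable bases per individual genome).
SNP_BASES = (3.0e6, 3.5e6)   # low and high estimates for SNPs
CNV_BASES = (50e6, 60e6)     # low and high estimates for CNVs ("at least")

# Assumption, not from the post: a haploid genome of about 3.1 billion bases.
GENOME_BASES = 3.1e9


def ratio_range(numerator, denominator):
    """Smallest and largest ratio obtainable from two (low, high) ranges."""
    return numerator[0] / denominator[1], numerator[1] / denominator[0]


def fraction_range(bases, total):
    """Share of the genome covered by a (low, high) range of bases."""
    return bases[0] / total, bases[1] / total


ratio_low, ratio_high = ratio_range(CNV_BASES, SNP_BASES)
snp_frac = fraction_range(SNP_BASES, GENOME_BASES)
cnv_frac = fraction_range(CNV_BASES, GENOME_BASES)

print(f"CNV variable bases are about {ratio_low:.0f}x to {ratio_high:.0f}x the SNP figure")
print(f"SNPs: {snp_frac[0]:.2%} to {snp_frac[1]:.2%} of the genome")
print(f"CNVs: {cnv_frac[0]:.2%} to {cnv_frac[1]:.2%} of the genome")
```

On these numbers, CNVs account for somewhere around 14 to 20 times as many variable bases as SNPs, which is why a scan that reports only SNPs misses most of the base-level variation in a genome.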

For the personal genome enthusiast with their SNP chip data from 23andMe or deCODEme in hand, there are two important practical questions: (1) can I learn about my CNVs using SNP chip data; and (2) will that information be useful?

Testing for traces of Neanderthal in your own genome

I’m guessing everyone reading this post is familiar with recent research from Svante Pääbo’s group indicating that modern humans interbred with Neanderthals during their long co-existence in Eurasia between 30,000 and 80,000 years ago. According to the researchers’ calculations, somewhere between 1% and 4% of the DNA in modern non-African humans is derived from these interbreeding events – in other words, many of us are walking around with Neanderthal DNA sitting in our genomes.

So how much of your genome is Neanderthal? Over at The Genetic Genealogist, Blaine Bettinger takes a look at the options currently available to those interested in digging for Neanderthal ancestry in their own genetic backyard. Blaine notes that one company is already offering a test labelled as looking for Neanderthal ancestry based on a limited number of variable (microsatellite) markers. However, this test doesn’t actually look directly for putative Neanderthal-derived variants; instead, it (rather quaintly) tests for “strong matches between your DNA fingerprint […] and populations identified as ‘archaic,’ that is, whose composition retains the earliest earmarks of out‐of‐Africa genetics.” This is a very rough approach to the problem, to put it mildly.

Added in edit 15/07/10: John Hawks has a justifiably scathing review of the test on his blog; I’ve removed links to the company from this post to avoid giving them extra publicity.

People who have already had their genomes scanned by a company like 23andMe should, in theory, already have enough data to perform a much higher-resolution analysis. Sadly, there is not yet any readily available algorithm for doing this, despite (as Blaine notes) substantial interest in such a test among the 23andMe community.

Seems like there’s some real scope for a DIY genomics tool here. Is anyone out there already working on this? Let us know in the comments.

