Rhesus, paternity tests and 23andMe

The story behind this post is that my wife recently gave birth to our first son, and the day after the birth we had a funny encounter with genetics. Before I start, I should reassure the reader that I have no doubt that I am indeed the father of my child. But as you will see, a non-geneticist might have become worried when faced with the same situation.

Firstly, my wife is rhesus negative. This has important medical implications: if the baby were rhesus positive, she could produce antibodies against this marker, which could be life-threatening for any subsequent rhesus-positive child. This is a relatively big deal, but there are ways to manage it, and knowing the blood type of the baby is therefore essential.

The day after the birth, while we are both lying on our bed, very tired, a midwife comes by and asks whether we know the rhesus status of the baby. We say we do not. She checks her notes and says, “Ah, good news, the baby is rhesus negative. The father must also be rhesus negative then!” Well, I am not…

Finding the holes in our genomes

In a previous post I discussed copy number variation, a form of genetic variation not broadly reported by DTC companies. In today’s post I provide a very simple program that allows one to identify potential deletions on the basis of high density SNP genotypes from a parent-offspring trio, and report on the results of running this program on data from my own family.

The program uses an approach that I applied as a graduate student to mine deletions from the very first release of data from the International HapMap Project in 2004. The idea, explained in my last post, is to look for stretches of homozygous genotypes interspersed with Mendelian errors, which might indicate the transmission of a large deletion. To be clear, this is a simple analysis that most programmers and computational biologists would find straightforward to implement. It is probably a good practice problem for graduate students and would-be DIY personal genomicists.
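To give a flavour of the idea, here is a minimal Python sketch. It is not the program described in this post, only an illustration under some assumptions: genotypes are two-letter strings such as "AG" (with "-" marking a no-call), the SNPs come from a single autosome and are sorted by position, and the function names and the min_errors threshold are made up for the example. It reports any run of homozygous child genotypes that contains at least min_errors Mendelian errors.

```python
from itertools import product


def is_homozygous(gt):
    """True for two-letter genotypes with identical alleles, e.g. 'AA'."""
    return len(gt) == 2 and gt[0] == gt[1]


def is_mendelian_error(father, mother, child):
    """True if the child's genotype cannot be built from one allele of each parent."""
    possible = {"".join(sorted(f + m)) for f, m in product(father, mother)}
    return "".join(sorted(child)) not in possible


def candidate_deletions(trio, min_errors=2):
    """Scan (position, father, mother, child) rows sorted by position.

    Returns (first_error_pos, last_error_pos, n_errors) for every run of
    homozygous child genotypes containing at least min_errors Mendelian errors.
    min_errors is an arbitrary illustrative threshold.
    """
    hits, run = [], []
    sentinel = (None, "", "", "")  # forces the final run to be evaluated
    for pos, father, mother, child in list(trio) + [sentinel]:
        if pos is not None and "-" in father + mother + child:
            continue  # no-call: neither extends nor breaks the current run
        if pos is not None and is_homozygous(child):
            run.append((pos, is_mendelian_error(father, mother, child)))
            continue
        # A heterozygous child genotype (or the sentinel) ends the run.
        errors = [p for p, err in run if err]
        if len(errors) >= min_errors:
            hits.append((errors[0], errors[-1], len(errors)))
        run = []
    return hits


# Toy data, invented for illustration only.
trio = [
    (100, "AG", "AA", "AG"),  # heterozygous child: no run here
    (200, "AA", "GG", "AA"),  # Mendelian error (child should be AG)
    (300, "CC", "CC", "CC"),  # homozygous and consistent
    (400, "TT", "CC", "TT"),  # Mendelian error (child should be CT)
    (500, "AG", "AG", "AG"),  # heterozygous child: ends the run
    (600, "AA", "GG", "GG"),  # isolated error, below the threshold
]
print(candidate_deletions(trio))  # [(200, 400, 2)]
```

A real analysis would also need to work chromosome by chromosome, treat the X chromosome in males separately, and allow for genotyping errors, all of which this sketch ignores.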

I obtained 23andMe data from both my mom and dad, and, with their consent, ran the three of us through the program. I was mildly surprised to find only two potential deletions; I had previously speculated that one would find 5-10 deletions per trio with the 550K platform used by 23andMe.

Dude, where are my copy number variants?

The genome scans currently offered by major personal genomics companies provide information about only one kind of genetic variation: single nucleotide polymorphisms, or SNPs. However, SNPs are just one end of a size spectrum of variation, reaching all the way up to large duplications or deletions of DNA known as copy number variants (CNVs). Over the last decade we have learned that CNVs are a surprisingly common form of variation in humans, and they span a formidable chunk of the genome. While there are about 3M-3.5M bases of variation due to SNPs within an individual genome (in, say, a typical person of European descent), there are at least 50-60M variable bases due to CNVs.

For the personal genome enthusiast with their SNP chip data from 23andMe or deCODEme in hand, there are two important practical questions: (1) can I learn about my CNVs using SNP chip data; and (2) will that information be useful?

