At long last, the 1000 Genomes Project pilot paper has been published this week in Nature. The paper describes the whole-genome sequencing of 179 individuals from four populations, plus two mother-father-child trios, and covers the whole range of genomic variation, including SNPs, small indels and larger structural variants. A total of 15 million variants were called, about 8 million of which had never been seen before (shown in the Venn diagram to the right), and all the data generated (including sequence, site locations and genotypes) has been released online for anyone to use.
GNZ authors feature pretty heavily in the paper’s author list. Daniel looked for loss-of-function mutations (variants that entirely break a gene), and found about 2,000. Don looked at calling de novo mutations (mutations present in a child but in neither parent) from the trios, and found around 100 in total, which gives a mutation rate of about 10^-8 per base per generation, or around 60 new mutations for every baby born. Luke called 2,780 variants on the Y chromosome and put together a new Y haplogroup tree (with branch lengths!), and Jeff was involved in the validation effort.
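For anyone who wants to check the step from a per-base rate to "around 60 new mutations for every baby born", here is a rough back-of-envelope sketch. The genome size of about 3 billion bases per haploid copy is a round-number assumption of ours, not a figure from the paper, and the function name is purely illustrative.

```python
# Back-of-envelope check of the "around 60 new mutations per baby" figure.
# MUTATION_RATE comes from the post; the genome size is an ASSUMED round number.
MUTATION_RATE = 1e-8          # mutations per base per generation (from the post)
HAPLOID_GENOME_BASES = 3e9    # assumption: ~3 billion bases per haploid genome
COPIES_PER_CHILD = 2          # a child inherits one genome copy from each parent


def expected_new_mutations(rate, haploid_bases, copies=2):
    """Expected de novo mutations per child: rate x bases x number of copies."""
    return rate * haploid_bases * copies


expected = expected_new_mutations(MUTATION_RATE, HAPLOID_GENOME_BASES, COPIES_PER_CHILD)
print(round(expected))
```

With those inputs the estimate comes out at roughly 60, in line with the number quoted above. A slightly different assumed genome size shifts the answer a little, which is why "around" is the right word.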
This paper only describes the pilot phase of the 1000 Genomes Project. There is a lot more to come, including a larger sample size and new variant calling methods. The project will pass its 1000th sequenced genome any day now, and eventually thousands of individuals from dozens of populations will be included.[LJ]
Congratulations everyone! This is such important work!
Hi there,
You might be interested in viewing videos of last week’s tutorial: How to Use 1000 Genomes Project Data. http://www.genome.gov/27542240
These videos describe the 1000 Genomes Project data, how to access it and how to use it. Enjoy!
Jeannine @ NHGRI
@Jeannine
Thanks for the link, especially as the last speaker is GNZ’s own Jeff Barrett.