The ENCODE project: lessons for scientific publication


The ENCODE Project has this week released the results of its massive foray into exploring the function of the non-protein-coding regions of the human genome. This is a tremendous scientific achievement, and is receiving plenty of well-deserved press coverage; for particularly thorough summaries see Ed Yong’s excellent post at Discover and Brendan Maher’s piece at Nature.

I’m not going to spend time here recounting the project’s scientific merit – suffice it to say that the project’s analyses have already improved the way researchers are approaching the analysis of potential disease-causing genetic variants in non-coding regions, and will have an even greater impact over time. Instead, I want to highlight what a tremendous feat of scientific publication the project has achieved.
Continue reading ‘The ENCODE project: lessons for scientific publication’


The first steps towards a modern system of scientific publication

About a year ago on this site, I discussed a model for addressing some of the major problems in scientific publishing. The main idea was simple: replace the current system of pre-publication peer review with one in which all research is immediately published and only afterwards sorted according to quality and community interest. This post generated a lot of discussion; in conversations since, however, I’ve learned that almost anyone who has thought seriously about the role of the internet in scientific communication has had similar ideas.

The question, then, is not whether dramatic improvements in the system of scientific publication are possible, but rather how to implement them. There is now a growing trickle of papers posted to pre-print servers ahead of formal publication. I am hopeful that this is bringing us close to dispensing with one of the major obstacles in the path towards a modern system of scientific communication: the lack of rapid and wide distribution of results.*
Continue reading ‘The first steps towards a modern system of scientific publication’


Guest Post: Jimmy Lin on community-funded rare disease genomics

Jimmy Cheng-Ho Lin, MD, PhD, MHS, is the Founder/President of Rare Genomics Institute, helping patients with rare diseases design, source, and fund personalized genomics projects. He is also on the faculty in the Pathology and Genetics Departments at Washington University in St. Louis, as part of the Genomics and Pathology Services. Prior to this, he completed his training with Bert Vogelstein and Victor Velculescu at Johns Hopkins and Mark Gerstein at Yale, and led the computational analysis of some of the first exome sequencing projects in any disease, including breast, colorectal, glioblastoma, and pancreatic cancers.

At Rare Genomics Institute (RGI), we have a dream: that one day any parent or community can help access and fund the latest technology for their child with any disease. While nonprofits and foundations exist for many diseases, the vast majority of the 7,000 rare diseases do not have the scientific and philanthropic infrastructure to help. Many parents fight heroically on behalf of their children, and some of them have even become the driving force for research. At RGI, we are inspired by such parents and feel that if we can help provide the right tools and partnerships, extraordinary things can be achieved.

We start by helping parents connect with the right researchers and clinicians. Then, we provide mechanisms for them to fundraise. Finally, we try to guide them through the science, which we hope will result in a better life for their child or for future children. Throughout the whole process, we try to educate, support, and walk alongside families undergoing this long journey.
Continue reading ‘Guest Post: Jimmy Lin on community-funded rare disease genomics’


Society and the personal genome

Those of us involved in genomics research spend a lot of time thinking about how scientific and technological developments might influence personal genomics. For instance, does the falling cost of sequencing mean that medically useful personal genomics will likely be based on sequence rather than genotype data? (Yes.)

At the Sanger Institute we’ve recently launched (along with our friends at EBI) a project to look more deeply at a question which is less often on the lips of genomics boffins: “How does genomics affect us as people, both individually and in communities?” Because of the obvious resonance with Genomes Unzipped, it should come as no surprise that many of us (including myself, Daniel and Luke) have been intimately involved in this initiative.

The actual line-up of events has been diverse, and a lot of fun. We’ve had two excellent debates, including one between Ewan Birney and Paul Flicek (pictured) on the value, or lack thereof, of celebrity genomes (covered in more detail here). A poet, Fiona Sampson, spent some time on campus and we’ve commissioned a book of poetry from her. This one raised some eyebrows, but I have to say that talking to her has given me some brand new ways of thinking about my own work. We’re also working on a more interactive project in the hope of making personal genomics a bit more personal. Stay tuned.


UK Users’ and Genetics Clinicians’ experiences of direct-to-consumer genetic testing (DTCGT)

This is a guest post by Teresa Finlay. Teresa is a PhD student at Cesagen, Cardiff University, studying with Adam Hedgecoe and Michael Arribas-Ayllon. A background in cancer nursing and a degree in human biology informed Teresa’s interest in the public’s use of direct-to-consumer genetic testing to ‘self-screen’ for disease risk. She recently secured funding from the ESRC to research users’ and genetics clinicians’ experiences of DTCGT in the UK. If you are a UK resident who has used a DTC genetic test, or a clinician whose patients have used these tests, then you too can get involved in the research.

Direct-to-consumer genetic testing (DTCGT) has been freely available on the Internet for more than five years, despite concerns from the professional community. Companies marketing these tests (such as 23andMe and deCODEme) claim they are empowering people to make healthy lifestyle choices, and frequently draw on the principle of autonomy as a central argument. This position is echoed elsewhere by those who view genomic knowledge as an individual right, including many of the bloggers at Genomes Unzipped. Other scientists and clinicians express skepticism about the clinical validity and utility of DTCGT, and raise concerns about the potential for anxiety and inappropriate testing. These debates highlight the importance of research into the motivations and actions of DTCGT customers, but research to date remains very limited, and has mostly been performed on customers in North America. The UK, with its large state-run National Health Service and relative lack of private health insurance and providers, is likely to face unique challenges and situations as DTCGT becomes more common. The paucity of research on UK customers indicates the need for a detailed UK study examining users’ and clinicians’ perspectives in order to establish the long-term implications of DTCGT. This post outlines what is currently known about DTCGT, fills some gaps in the UK context and describes a research project involving users and clinicians in the UK.

Continue reading ‘UK Users’ and Genetics Clinicians’ experiences of direct-to-consumer genetic testing (DTCGT)’


Genome interpretation costs will not spiral out of control

Mo' genomes, mo' money?

An article in Genetic Engineering & Biotechnology News argues that as the cost of genome sequencing decreases, the cost of analysing the resulting data will balloon to extraordinary levels. Here is the crux of the argument:

We predict that in the future a large sum of money will be invested in recruiting highly trained and skilled personnel for data handling and downstream analysis. Various physicians, bioinformaticians, biologists, statisticians, geneticists, and scientific researchers will be required for genomic interpretation due to the ever increasing data.

Hence, for cost estimation, it is assumed that at least one bioinformatician (at $75,000), physician (at $110,000), biologist ($72,000), statistician ($70,000), geneticist ($90,000), and a technician ($30,000) will be required for interpretation of one genome. The number of technicians required in the future will decrease as processes are predicted to be automated. Also the bioinformatics software costs will plummet due to the decrease in computing costs as per Moore’s law.

Thus, the cost in 2011 for data handling and downstream processing is $285,000 per genome as compared to $517,000 per genome in 2017. These costs are calculated by tallying salaries of each person involved as well as the software costs.

These numbers would be seriously bad news for the future of genomic medicine, if they were even remotely connected with reality. Fortunately this is not the case. In fact, this article (and other alarmist pieces on the “$1000 genome, $1M interpretation” theme) wildly overstates the economic challenges of genomic interpretation.

Since this meme appears to be growing in popularity, it’s worth pointing out why genome analysis costs will go down rather than up over time:
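One of the simplest ways to see this is amortization: an interpretation team’s annual cost is spread over every genome it analyses in a year, not charged in full to each individual genome. Here is a minimal back-of-the-envelope sketch in Python; the salary figures are taken from the quoted passage, but the throughput numbers are purely illustrative assumptions of mine, not figures from the article or the post:

```python
# A back-of-the-envelope sketch (illustrative assumptions, not figures from
# the post): the quoted model in effect charges a full interpretation team
# to every genome, whereas the team's annual cost is really spread across
# all the genomes it handles in a year.
annual_salaries = {          # salary figures taken from the quoted passage
    "bioinformatician": 75_000,
    "physician": 110_000,
    "biologist": 72_000,
    "statistician": 70_000,
    "geneticist": 90_000,
    "technician": 30_000,
}
team_cost_per_year = sum(annual_salaries.values())

# Throughput figures below are purely hypothetical.
for genomes_per_year in (1, 10, 100, 1_000, 10_000):
    cost_per_genome = team_cost_per_year / genomes_per_year
    print(f"{genomes_per_year:>6} genomes/year -> ${cost_per_genome:,.0f} per genome")
# Only at a throughput of roughly one genome per year does the per-genome
# interpretation cost land anywhere near the quoted hundreds of thousands.
```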
Continue reading ‘Genome interpretation costs will not spiral out of control’


A review of openSNP, a platform to share genetic data

I initially came across openSNP when the team won the PLoS/Mendeley Binary Battle in late 2011. This competition was open to software that integrates with Mendeley*, a suite of web and desktop tools designed to manage bibliographies. So while the scope of the competition was quite broad, the winners described their project in an interview in a way that directly relates to themes of interest to the Genomes Unzipped crew and readers. To quote them: “we try to be a community-driven platform for people who are willing to share phenotypic and genetic information for the public”. Given these aims, I decided to look into openSNP to understand what the service and its aims are. I also contacted Bastian Greshake from the openSNP team, who has been very helpful in answering my questions. To make a long story short, this is a fantastic idea and a great implementation, a real must-try for all users interested in the direct-to-consumer (DTC) genetics market. Keep reading for the full story.

Continue reading ‘A review of openSNP, a platform to share genetic data’


Another “IQ gene”: new methods, old flaws

A very large genome-wide association study (GWAS) of brain and intracranial size has just been published in Nature Genetics. The study looked at brain scans and genetic information from over 20,000 individuals, and discovered two new genetic variants that affect brain and head morphology: one affects the volume of the skull, and the other affects the size of the hippocampus.

The main study is very well carried out, and the two associations look to me to be well established. However, there are a few little things about the paper that, combined with some biased reporting in the press, have been bothering me. Firstly, the main result reported in the news is that the study found an “IQ gene”, but this was only a very small follow-on analysis in the study, and the evidence underlying it is relatively weak (certainly not the “best evidence yet that a single gene can affect IQ”, as reported by New Scientist). Secondly, the authors use a misleading presentation of their statistics to hide the fact that one of their associations could easily be caused by an (already well known) association with general body size.

Continue reading ‘Another “IQ gene”: new methods, old flaws’


Guest post: Accurate identification of RNA editing sites from high-throughput sequencing data

[By Gokul Ramaswami and Robert Piskol. Gokul Ramaswami is a graduate student and Robert Piskol is a postdoctoral fellow in the Department of Genetics at Stanford University. Both study RNA editing with Jin Billy Li.]

Thank you to Genomes Unzipped for giving us the opportunity to write about our paper published in Nature Methods [1]. Our goal was to develop a method to identify RNA editing sites using matched DNA and RNA sequencing of the same sample. Looking at the problem initially, it seems straightforward enough to generate a list of variants using the RNA sequencing data and then filter out any variants that also appear in the DNA sequencing. In reality, one must pay close attention to the technical details in order to discern true RNA editing sites from false positives. In this post, we will highlight a couple of key strategies we employed to accurately identify editing sites.
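As a minimal illustration of that naive first pass (a sketch only, not the pipeline described in our paper; the file names and the simple chromosome/position/allele representation are assumptions for the example):

```python
# A naive first pass at candidate RNA editing sites (illustrative sketch only):
# take variant calls from RNA-seq and discard any position that is also
# variant in the matched DNA-seq from the same sample. The input format
# (whitespace-delimited chrom, pos, alt) and the file names are assumptions.
def load_variants(path):
    variants = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue
            chrom, pos, alt = line.split()[:3]
            variants.add((chrom, int(pos), alt))
    return variants

rna_variants = load_variants("rna_variants.txt")  # hypothetical RNA-seq calls
dna_variants = load_variants("dna_variants.txt")  # hypothetical DNA-seq calls

candidate_sites = rna_variants - dna_variants
print(len(candidate_sites), "candidate editing sites before technical filtering")
```

As noted above, this subtraction alone still leaves plenty of false positives, which is exactly why the technical strategies discussed in the post matter.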
Continue reading ‘Guest post: Accurate identification of RNA editing sites from high-throughput sequencing data’


Misapplied statistics in the OXTR/Prosociality story

Out in the PNAS Early Edition is a letter to the editor from four Genomes Unzipped authors (Luke, Joe, Daniel and Jeff). We report a statistical error that drove the seemingly highly significant association between polymorphisms in the OXTR gene and prosocial behaviour. The original study involved a sample of 23 people, each of whom had their prosociality rated 116 times (giving a total of 2668 observations), but the authors inadvertently used a method that implicitly assumed there were actually 2668 different individuals in the study.

The authors kindly provided us with the raw data, and we ran what are called “null simulations” on their dataset to check whether their method could generate false positives. This involved randomly swapping around the genotypes of the 23 individuals, and then analysing these randomised datasets using the same statistical method as the paper. These “null datasets” are random, with no real association between prosociality and OXTR genotype, so if the authors’ method were working properly it would almost never find an association in them. The plot below shows the distribution of the p-values from the authors’ method in the null datasets – if everything were working properly, all of the bars would be the same size:
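To give a flavour of what such a null simulation looks like, here is a minimal sketch in Python. The data are simulated with illustrative parameters (this is not the authors’ dataset, nor our exact analysis): each of 23 individuals gets a latent prosociality level, their 116 ratings are correlated with it, the genotypes are shuffled to destroy any real association, and all 2668 ratings are then naively analysed as if they were independent.

```python
# A sketch of a null simulation (simulated data, illustrative parameters):
# 23 individuals, 116 ratings each, ratings correlated within individual.
# Shuffling genotypes destroys any true association, yet naively treating
# the 2668 ratings as independent still yields far too many small p-values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_individuals, n_ratings = 23, 116

latent = rng.normal(size=n_individuals)                 # per-person prosociality level
ratings = latent[:, None] + rng.normal(scale=0.5, size=(n_individuals, n_ratings))
genotypes = rng.integers(0, 3, size=n_individuals)      # 0/1/2 copies of an allele

null_pvalues = []
for _ in range(1000):
    shuffled = rng.permutation(genotypes)               # break any real association
    g = np.repeat(shuffled, n_ratings)                  # wrongly expand to 2668 "samples"
    result = stats.linregress(g, ratings.ravel())       # naive regression ignoring repeats
    null_pvalues.append(result.pvalue)

# A valid test would give ~5% of null p-values below 0.05 (a flat histogram);
# this naive analysis gives far more, i.e. inflated false positives.
print("fraction of null p-values < 0.05:", np.mean(np.array(null_pvalues) < 0.05))
```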

Continue reading ‘Misapplied statistics in the OXTR/Prosociality story’

