The recent announcement of a new journal sponsored by the Howard Hughes Medical Institute, the Max Planck Society, and the Wellcome Trust generated a bit of discussion about the issues in the scientific publishing process it is designed to address—arbitrary editorial decisions, slow and unhelpful peer review, and so on. Left unanswered, however, is a more fundamental question: why do we publish scientific articles in peer-reviewed journals to begin with? What value does the existence of these journals add? In this post, I will argue that cutting journals out of scientific publishing to a large extent would be unconditionally a good thing, and that the only thing keeping this from happening is the absence of a “killer app”.
Author Archive for Joe Pickrell
Recently, Luke reported that I am a carrier of the E4 allele at the gene APOE; this gives me approximately double the average risk for late-onset Alzheimer’s disease. I didn’t think too much about this–it’s only double the risk, and in any case I’m 28 years old. But I recently came across the below plot, by Nick Eriksson (I’ve re-plotted it). It shows the frequency of the APOE4 allele plotted against average age in 15 cohorts of “cognitively normal elders” (data from here).
If we assume that these 15 cohorts are all from relatively similar populations, the interpretation of this is that, between the ages of 70 and 85, people with my genotype go from being cognitively normal elders to not (due to Alzheimer’s, another form of dementia, or death) at a rate about twice that of people who don’t carry the E4 allele. This, of course, is exactly what I knew before (that E4 carriers have double the risk of Alzheimer’s), but seeing this visually is quite striking.
Could the drop in APOE4 allele frequency be mostly due to E4/E4 homozygotes (i.e. people not of my genotype)? If we assume an initial allele frequency of 20% and Hardy-Weinberg equilibrium, then a fifth of the APOE4 alleles are present in homozygotes. So even if all of these individuals developed Alzheimer’s, this would only drop the allele frequency from 20% to ~16%. The observed drop in allele frequency is much greater than that.
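For anyone who wants to check the homozygote arithmetic, here is a minimal sketch in Python. It assumes Hardy-Weinberg proportions and a starting E4 frequency of 20%, as in the text; the function names are mine and purely illustrative. Removing a fifth of the E4 alleles gives the rough 16% figure, and renormalizing by the smaller remaining population gives a value slightly above that, which does not change the conclusion.

```python
def fraction_of_alleles_in_homozygotes(p: float) -> float:
    """Fraction of all E4 alleles carried by E4/E4 homozygotes under
    Hardy-Weinberg equilibrium, where p is the E4 allele frequency."""
    homozygotes = p * p                  # genotype frequency of E4/E4
    heterozygotes = 2 * p * (1 - p)      # genotype frequency of E4/other
    # Homozygotes carry two E4 copies each, heterozygotes carry one.
    return (2 * homozygotes) / (2 * homozygotes + heterozygotes)


def frequency_after_removing_homozygotes(p: float) -> float:
    """E4 allele frequency among the people who remain if every E4/E4
    homozygote leaves the cohort (an extreme, hypothetical scenario)."""
    q = 1 - p
    heterozygotes = 2 * p * q
    non_carriers = q * q
    remaining_people = heterozygotes + non_carriers
    # Each remaining person has two alleles; only heterozygotes carry E4.
    return heterozygotes / (2 * remaining_people)


if __name__ == "__main__":
    p = 0.20  # assumed starting E4 allele frequency
    print(f"share of E4 alleles in homozygotes: {fraction_of_alleles_in_homozygotes(p):.3f}")
    print(f"E4 frequency after removing them:   {frequency_after_removing_homozygotes(p):.3f}")
```

The second function simplifies to p / (1 + p), so the result depends only on the assumed starting frequency.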
UPDATE 3/17/12: A more extensive analysis of the paper discussed in this post is here. Several groups have concluded that at least 90% of the sites identified are technical artifacts.
The “central dogma” of molecular biology holds that the information present in DNA is transferred to RNA and then to protein. In a paper published online at Science yesterday, Li and colleagues report a potentially extraordinary observation: they show evidence that, within any given individual, there are tens of thousands of places where transcribed RNA does not match the template DNA from which it is derived. This phenomenon, called RNA editing, is generally thought to be limited (in humans) to conversions of the base adenosine to the base inosine (which is read as guanine by DNA sequencers), and occasionally from cytosine to uracil. In contrast, these authors report that any type of base can be converted to any other type of base.
If these observations are correct, they represent a fundamental change in how we view the process of gene regulation. However, in this post I am going to point out a couple of technical issues that, if not properly taken into account, have the potential to cause a large number of false positives in this type of data. The main point can be summarized like this: RNA editing involves the production of two different RNA and/or protein sequences from a single DNA sequence. To infer RNA editing from the presence of two different RNA and/or protein sequences, then, one must be very sure that they derive from the same DNA sequence, rather than from two different copies of the DNA (due to, for example, paralogs or copy number variants). This issue has the potential to be a large source of false positives in a study like this; I will also discuss an additional technical problem that could result in false positives.
Over the last several years, the number of genetic variants unambiguously associated with disease risk has grown dramatically. However, interpreting these signals has been extremely difficult—most of the identified variants do not disrupt genes, and indeed many don’t fall anywhere near genes (this observation has even led some to discount these signals entirely). To an investigator interested in following up on these signals, this is somewhat depressing: how can we hope to explore how polymorphisms affect disease risk if they don’t seem to fall in any sort of genome annotation that we understand?
In this context, I thought I’d point to an important paper that, among many other things, gives the first systematic evidence that variants which influence disease are not just randomly scattered across the genome, but instead tend to fall in particular regions—in particular, enhancer elements (regions where DNA-binding proteins interact with DNA to influence gene expression).
The authors rely on the fact that, in the cell, DNA is wrapped around proteins called histones, which control how accessible the DNA is to things like transcription factors (see above figure). These proteins can be chemically modified, and it is now clear that particular patterns of modifications are predictive of the function of the DNA in the region—some modifications indicate transcribed genes, others regions of enhancer activity, others repressed regions, etc.
What the authors did in this study was generate genome-wide maps of several histone modifications in nine different cell types, and use this data to predict the function of each 200 base pair segment of the human genome in each cell type. There are a number of interesting analyses of these “maps” of genome function in the paper, but for our purposes here there’s one of particular interest: the authors took sets of SNPs associated with various diseases and simply asked, are these variants enriched in regions with any particular functional prediction? And indeed, for several phenotypes, there is a striking enrichment of association signals in enhancer elements in a relevant cell type. For example, SNPs which influence lipid levels are enriched in enhancers in a liver cancer cell line, and SNPs which influence the autoimmune disease lupus are enriched in enhancers in a lymphoblastoid cell line.
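To make the enrichment question concrete, here is a rough sketch of one common way to frame it: a one-sided hypergeometric test. This is not necessarily the procedure used in the paper, and every count below is hypothetical, chosen only to illustrate the calculation. It also treats SNPs as independent draws from equally sized genome segments, which ignores linkage disequilibrium and differences in SNP density.

```python
from math import comb


def hypergeom_upper_tail(k: int, n_draws: int, n_success: int, n_total: int) -> float:
    """P(X >= k) when drawing n_draws items without replacement from
    n_total items, of which n_success are 'successes'."""
    upper = min(n_draws, n_success)
    favourable = sum(
        comb(n_success, i) * comb(n_total - n_success, n_draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, n_draws)


def fold_enrichment(k: int, n_draws: int, n_success: int, n_total: int) -> float:
    """Observed fraction of draws that are successes, divided by the
    fraction expected if draws were spread uniformly."""
    return (k / n_draws) / (n_success / n_total)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only:
    n_segments = 10_000          # genome segments considered
    n_enhancer_segments = 500    # segments predicted to be enhancers in one cell type
    n_disease_snps = 50          # trait-associated SNPs
    n_snps_in_enhancers = 10     # of those, how many fall in enhancer segments

    args = (n_snps_in_enhancers, n_disease_snps, n_enhancer_segments, n_segments)
    print(f"fold enrichment: {fold_enrichment(*args):.2f}")
    print(f"one-sided p-value: {hypergeom_upper_tail(*args):.2e}")
```

In practice one would repeat this for each functional category and cell type and correct for the number of tests performed.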
As these types of functional maps are generated in more cell types, I imagine there will be more stories like this. The problem with interpreting disease association studies, it seems likely, is largely due to our lack of understanding of genome function.
Citation: Ernst et al. (2011) Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. doi:10.1038/nature09906
This week has seen another FDA meeting seeking guidance on how to regulate direct-to-consumer (DTC) genetic tests in the US. The meeting itself has been covered by GNZ bloggers Daniel at Genetic Future and Dan at Genomics Law Report, and its apparent outcome has sparked furious debate elsewhere. The discussion among the “independent” panel convened at the meeting appeared to converge on the proposal that all health-related genomic tests should be ordered and reported through physicians. However, the outcomes of the meeting in terms of FDA policy remain unclear, and one FDA official has indicated that decisions about the availability of genetic tests will be made on a test-by-test basis.
There is no doubt that the appropriate regulation of personal genomics tests is a complex issue, and there is a diversity of opinion about how best to achieve it within GNZ (as there is throughout the genomics community). However, there are several points we agree on:
- Individuals have a fundamental right to access information about themselves, including genetic information. While it is important to also consider the accuracy, interpretation, validity and utility of tests, this underlying principle should guide policy.
- There is currently no evidence that DTC genetic tests pose a danger to consumers. A recent study of over 2,000 participants in DTC testing concluded that “testing did not result in any measurable short-term changes in psychological health”. In the absence of any evidence of harm there is no justification for restricting individual autonomy.
- DNA does not have magical powers, and does not require special treatment simply by virtue of being DNA. Genetic exceptionalism – the idea that genetics must be treated as special under the law – is an inappropriate basis for policy-making. Tests should be regulated appropriately based on their predictive power, utility and potential for harm, all of which are related concepts.
- As DNA sequencing becomes cheaper, the line between medical and non-medical testing will continue to blur. Excessive regulation of health-related genetic tests could also unnecessarily hinder the ability of people to access their entire genome sequences for other purposes (such as genetic genealogy).
- Most clinicians do not have the appropriate knowledge to interpret genomic tests, particularly in healthy individuals. This point is almost universally agreed, even by the FDA, and has certainly been the experience of some of the GNZ members upon taking our genetic results to doctors. Physicians in general are therefore a strange choice for ‘guardians of the genome’.
- Most early adopters of DTC genetic tests are sufficiently well-informed to understand the implications of a genomic test and interpret the results correctly. Putting a general physician between these informed individuals and their own genomes is paternalistic and unnecessary.
While the outcome of the FDA’s deliberations remains uncertain, it is clear that there will be intensive lobbying against any attempt at excessive legislation. In the worst case scenario, the fledgling and innovative personal genomics market could be crushed by the FDA. However, there is still plenty of room for a measured approach that enforces test accuracy, punishes false claims and promotes informed choices by consumers, without reducing the ability of responsible companies to continue to operate and innovate.
We urge others in the genomics community to make their voices heard on these issues. Let the FDA – and, if you’re based in the USA, your political representatives – know that regulation of genetic testing should be based on evidence, not fear, and that any attempt to unreasonably restrict your access to your own genetic information is unacceptable.
I’ve been reading with interest Daniel’s coverage of the recent FDA hearings into DTC genetic testing. In this context, both he and Razib Khan are incensed by a video which seemingly shows an FDA official misleading Congress about the research done by 23andme:
You can think what you want about the value of the research done to date by 23andme, but in my mind, there’s one simple reason why the sorts of participant-driven research they’re doing can only be a good thing: all research is driven by curiosity, and the people most curious about a disease or trait are those who have it. While people may think of the academic research community as a machine with endless resources and limitless motivation, it’s not. People work on things they think are interesting; they sometimes follow “trendy” topics, or move into fields with more grant money, or get bored of a given problem and move on. So if the research in the trait you’re most interested in isn’t moving fast enough for you, well, tough luck.
Recall that one of the key players in the discovery of the gene for Huntington’s disease was a foundation started by a man whose wife had the disease (startlingly, the current president of the foundation apparently accused DTC companies of “raping” the human genome during the present FDA hearing). Recall also that James Lupski, curious about the cause of his Charcot-Marie-Tooth disease, simply sequenced his own genome to find it. These are simply well-connected and trained people driven to find a gene involved in a disease. Patient communities that currently exist are also curious and driven, but in many cases are dealing with complex diseases that are amenable to genetics only with large sample sizes and extensive organization; what these communities can now do is outsource, in a sense, their research to 23andme (see, e.g., 23andme’s Parkinson’s study). For scientific knowledge, this can only be a good thing.
 To date, the novel associations discovered by 23andme are in hair morphology, freckling, photic sneeze reflex, and “asparagus anosmia”. What these things have in common is that they’re biologically interesting, but not particularly medically interesting; it’s pretty much only curiosity that would drive you to map these traits. Medical researchers tend to scoff at this sort of thing; I think it’s actually pretty cool.
To celebrate the end of the blogging year here at Genomes Unzipped, we wanted to spend a bit of time reminiscing about the papers we enjoyed the most in 2010. Feel free to add your own suggestions in the comments!
Joe: Mice, men, and PRDM9. A key goal in evolutionary biology is to identify the mechanisms leading to speciation. One way to get at that goal is to identify genes that cause sterility or reduced fitness in hybrids between species or diverged populations. In mammals, exactly one such gene has been identified to date: the DNA-binding protein PRDM9. This year, three groups working on a seemingly different problem–deciphering the molecular mechanisms by which recombination shuffles genetic variation between generations–stumbled across an important gene in this process: PRDM9. Variation in this gene influences recombination patterns in both mice and humans, and is responsible for the dramatic differences in recombination patterns between humans and chimpanzees. Is it a simple coincidence that a gene which influences recombination also appears to have a role in speciation? Time will tell.
Parvanov et al. (2010) Prdm9 Controls Activation of Mammalian Recombination Hotspots. Science. DOI: 10.1126/science.1181495.
Baudat et al. (2010). PRDM9 Is a Major Determinant of Meiotic Recombination Hotspots in Humans and Mice. Science. DOI: 10.1126/science.1183439.
Myers et al. (2010). Drive Against Hotspot Motifs in Primates Implicates the PRDM9 Gene in Meiotic Recombination. Science. DOI: 10.1126/science.1182363.
Daniel: Whole-genome sequencing to develop personalised cancer assays. The area of medicine where the transforming power of new DNA sequencing technologies is moving the fastest is in cancer diagnostics and therapy. There were many studies relevant to this field in 2010 (with a fair proportion featuring on the excellent MassGenomics blog), but this paper was a simple, elegant example: the authors performed low-coverage whole-genome sequencing of four tumour samples, identified large genomic rearrangements present in the tumour cells but not in the patient’s healthy tissue, and then designed personalised, quantitative assays measuring the proportion of cells carrying these rearrangements in the patients’ blood. These assays allowed them to track, almost in real time, how the patients’ cancers responded to various therapies, like so:
Leary et al. (2010) Development of personalized tumor biomarkers using massively parallel sequencing. Science Translational Medicine. DOI: 10.1126/scitranslmed.3000702.
Though this site is largely dedicated to discussions of personal genomics, I’d like to use this post to discuss some of my recent work (done with Athma Pai, Yoav Gilad, and Jonathan Pritchard) on mRNA splicing. Our paper, in which we argue that splicing is a relatively error-prone and noisy process, has just been published in PLoS Genetics.
In my last post, I discussed how I used 23andMe data to test hypotheses about my ancestry. In particular, I was intrigued by Dienekes Pontikos’s result suggesting that I (and my colleague Vincent) might be partly Ashkenazi Jewish. Ultimately, however, I concluded that his algorithm was not properly modeling my southern European ancestry (inherited from one Italian grandparent), and that this was leading to a spurious result.