Genome interpretation costs will not spiral out of control

Mo' genomes, mo' money?

An article in Genetic Engineering & Biotechnology News argues that as the cost of genome sequencing decreases, the cost of analysing the resulting data will balloon to extraordinary levels. Here is the crux of the argument:

We predict that in the future a large sum of money will be invested in recruiting highly trained and skilled personnel for data handling and downstream analysis. Various physicians, bioinformaticians, biologists, statisticians, geneticists, and scientific researchers will be required for genomic interpretation due to the ever increasing data.

Hence, for cost estimation, it is assumed that at least one bioinformatician (at $75,000), physician (at $110,000), biologist ($72,000), statistician ($70,000), geneticist ($90,000), and a technician ($30,000) will be required for interpretation of one genome. The number of technicians required in the future will decrease as processes are predicted to be automated. Also the bioinformatics software costs will plummet due to the decrease in computing costs as per Moore’s law.

Thus, the cost in 2011 for data handling and downstream processing is $285,000 per genome as compared to $517,000 per genome in 2017. These costs are calculated by tallying salaries of each person involved as well as the software costs.

These numbers would be seriously bad news for the future of genomic medicine, if they were even remotely connected with reality. Fortunately this is not the case. In fact, this article (and other alarmist pieces on the “$1000 genome, $1M interpretation” theme) wildly overstates the economic challenges of genomic interpretation.
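Since the claimed figure is essentially just a stack of annual salaries, the arithmetic is worth spelling out. The snippet below is a back-of-the-envelope sketch in Python; the throughput figure is a purely hypothetical assumption, included only to show how quickly the per-genome labour cost collapses once a team handles more than one genome a year:

```python
# Back-of-the-envelope check on the GEN figures, using the salaries quoted above.
# The "$517K per genome" claim is roughly the sum of six annual salaries plus an
# unspecified software cost - i.e. it implicitly charges a full year of every
# specialist's time to a single genome. The throughput below is hypothetical.

salaries = {
    "bioinformatician": 75_000,
    "physician": 110_000,
    "biologist": 72_000,
    "statistician": 70_000,
    "geneticist": 90_000,
    "technician": 30_000,
}

total_annual = sum(salaries.values())
print(f"Combined annual salaries: ${total_annual:,}")   # $447,000

genomes_per_year = 100  # hypothetical throughput for one such team
print(f"Labour cost per genome at {genomes_per_year}/year: "
      f"${total_annual / genomes_per_year:,.0f}")        # $4,470
```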

Since this meme appears to be growing in popularity, it’s worth pointing out why genome analysis costs will go down rather than up over time:

Genome analysis will become increasingly automated

Right now, anyone who wants to provide a thorough clinical analysis of a genome sequence needs to prepare for some serious manual labour. After finding all of the possible sites of genetic variation, a clinical genomicist needs to identify those that are either known disease-causing variants or are found in a known disease gene, then check the published evidence supporting those associations, discuss their significance with clinical experts, perform experimental validation, and then generate a report explaining the findings to the patient and her doctor.

That’s all time-consuming stuff. But with every genome that we analyse, we get better at automating the easy steps, fix mistakes in our databases that might otherwise lead to wild goose chases, and obtain more unambiguous evidence about the clinical significance of each mutation.

The genome interpretation of 2017 won’t be a drawn-out process involving constant back-and-forth between highly-paid specialists. It will be a complex but thoroughly automated series of analysis steps, resulting in only a few potentially interesting results to be passed on to geneticists and clinicians for manual checking and signing off. Importantly, it will also be (at least for those who live in the right health systems, or have the right insurance) a dynamic process, where your sequence is constantly checked against new information without the need for complex human intervention.
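To make the idea concrete, here is a minimal sketch of what one automated triage step might look like. Every coordinate, database name and threshold in it is an illustrative assumption, not a description of any real pipeline or resource:

```python
# A minimal sketch of an automated variant triage step: annotate each candidate
# variant against a curated database of known pathogenic variants and a large
# population-frequency resource, and pass only a short list on for manual review.
# All data structures, coordinates and thresholds here are illustrative assumptions.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    population_freq: float        # frequency observed in healthy reference panels
    gene: Optional[str] = None    # gene symbol, if the variant falls in one

# Hypothetical curated resources (placeholder coordinates and gene names).
KNOWN_PATHOGENIC = {("chr7", 117_000_000, "CTT", "C")}
KNOWN_DISEASE_GENES = {"CFTR", "NF1", "BRCA1"}
MAX_PLAUSIBLE_FREQ = 0.001        # too common in healthy people to cause severe rare disease

def triage(variants):
    """Return the small subset of variants worth a specialist's attention."""
    shortlist = []
    for v in variants:
        if v.population_freq > MAX_PLAUSIBLE_FREQ:
            continue                                     # discarded automatically
        if (v.chrom, v.pos, v.ref, v.alt) in KNOWN_PATHOGENIC:
            shortlist.append(v)                          # exact match to a known pathogenic variant
        elif v.gene in KNOWN_DISEASE_GENES:
            shortlist.append(v)                          # novel variant in a known disease gene
    return shortlist

if __name__ == "__main__":
    candidates = [
        Variant("chr7", 117_000_000, "CTT", "C", 0.0002, "CFTR"),
        Variant("chr1", 123_456, "A", "G", 0.35),        # common variant, silently filtered out
    ]
    for v in triage(candidates):
        print(f"{v.chrom}:{v.pos} {v.ref}>{v.alt} ({v.gene}) -> manual review")
```

The point is not the specific rules, but that the bulk of the filtering is mechanical and only the shortlist ever reaches a human.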

That’s not to say that clinicians and other specialists will be replaced by the machines – genomicists and informaticians will be constantly at work refining the interpretation systems, but their work will be scaled up to the analysis of hundreds of thousands of genomes. Clinicians will provide the same point-of-care attentiveness (or lack thereof, in some cases) as in the current medical system, but they will do so using carefully processed, filtered and validated information from upstream analysis systems. The idea that each of these specialists will play a time-consuming role in interpreting each individual genome is completely unrealistic, and unnecessary.

Finding known mutations and interpreting novel ones will be easier

Right now, publicly available databases of known disease-causing mutations are shockingly noisy and incomplete – a situation I’ve described in the past as the greatest failure of modern human genetics. This is due to a combination of factors: researchers who published alleged mutations without performing the necessary checks for causality, academics and commercial entities who maintain private monopolies over crucial information from disease-specific studies, and occasional transcription errors by the curators of public databases, to name just a few.

But this will change – or rather, if it doesn’t change then we should be deeply ashamed of ourselves as a research community. Right now it’s unclear which of the many competing efforts to catalogue disease mutations will emerge as the single go-to source, but I’m optimistic that by 2017 both funding bodies and journals will have applied sufficient pressure to ensure that there is at least one fully open, comprehensive, well-annotated and accurate resource containing these variants.

The list of well-established human disease genes will grow massively over the next 18-24 months as genome-wide approaches like exome sequencing are applied to increasingly large numbers of rare disease families. We will also be able to unambiguously discard many of the mutations currently in resources like OMIM, as it becomes clear from large-scale sequencing studies that these variants are in fact common in healthy individuals.
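As a toy illustration of that clean-up step (every variant, annotation and frequency below is invented), flagging catalogued “disease mutations” that turn out to be common in reference panels is essentially a simple join-and-filter operation:

```python
# Toy sketch of the database clean-up described above: flag catalogued
# "disease-causing" variants that large-scale sequencing shows to be common in
# healthy individuals. Every entry and frequency below is invented.

catalogued_mutations = {
    "GENE1 c.123A>G": "reported pathogenic (single small family study, 1990s)",
    "GENE2 c.456del": "reported pathogenic (replicated, with functional evidence)",
}

# Allele frequencies for the same variants in a hypothetical large panel of
# healthy individuals.
panel_frequency = {
    "GENE1 c.123A>G": 0.04,       # 4% of healthy people carry it
    "GENE2 c.456del": 0.00001,
}

RARE_DISEASE_FREQ_CUTOFF = 0.001  # above this, a variant can't plausibly cause a severe rare disease

for variant, annotation in catalogued_mutations.items():
    freq = panel_frequency.get(variant, 0.0)
    status = "retain" if freq <= RARE_DISEASE_FREQ_CUTOFF else "flag for re-review"
    print(f"{variant}: panel frequency {freq:.5f} -> {status} ({annotation})")
```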

The end result will be an open-access database that any clinical genomicist can tap into when interpreting their patient data – meaning far less time wasted chasing false leads, and fewer true disease-causing variants missed during the interpretation process. That also means clinicians will be handed increasingly clear, intuitive results to deliver to their patients, rather than a long list of “maybe interesting” variants that they are completely unequipped to make sense of.

Genome sequencing technology will be more accurate

Finally, it’s worth emphasising that a lot of the time and expense in clinical genomics right now stems from imperfections in the underlying sequence data. Current short-read sequencing technologies have been phenomenally good at driving sequencing costs down, and across a large swathe of the genome they do a pretty good job of finding important mutations. However, they are still subject to a distressing level of error, and also can’t access approximately 10-15% of the human genome that is highly repetitive or poorly mapped.

That’s all changing fast. The reads generated by these instruments are getting longer and more accurate, meaning they can be used to peer into previously off-limits segments of the genome. New technologies such as Oxford Nanopore promise even more rapid improvements to these parameters – or, at the very least, promise to drive even greater competition among existing providers to up their game. We can confidently expect that the genome of 2017 will be dramatically more accurate and complete than the genome of 2012. Importantly, because the underlying reads are longer and more accurate, it will also be possible to store the raw data underlying a genome sequence in a far smaller volume of disk space than is the case currently.
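To see why, here is a purely illustrative back-of-the-envelope calculation; the coverage and bytes-per-base figures are assumptions, not measurements. Raw data volume scales roughly with coverage times the bytes stored per base, and more accurate reads allow both to come down:

```python
# Purely illustrative arithmetic for the storage point: raw sequence data scales
# roughly with (genome size x coverage x bytes stored per base). More accurate,
# longer reads let you get away with lower coverage and much more aggressive
# compression of per-base quality scores. All figures below are assumptions.

GENOME_SIZE_BASES = 3.2e9   # approximate haploid human genome

def raw_data_gb(coverage, bytes_per_base):
    return GENOME_SIZE_BASES * coverage * bytes_per_base / 1e9

# Hypothetical short-read run: 40x coverage, ~1 byte of sequence + quality per base.
print(f"Short, error-prone reads: ~{raw_data_gb(40, 1.0):.0f} GB of raw data per genome")

# Hypothetical long, accurate reads: 20x coverage, heavily compressed qualities.
print(f"Long, accurate reads:     ~{raw_data_gb(20, 0.3):.0f} GB of raw data per genome")
```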

Why the alarmism?

It’s worth bearing in mind that there are many people out there with strong incentives to make genome interpretation sound more challenging – and more lucrative – than it actually is. Right now there are dozens of companies launching in the genome interpretation space, and hundreds of venture capitalists who need to be convinced that the market size for genome interpretation is enormous. (I’m not claiming that the authors of the GEN piece have ulterior motives – perhaps they have simply been swayed by widespread exaggeration in the field.)

Let me be clear: in the next 5-10 years, millions of genomes will be sequenced in a clinical setting, and all of them will need some level of interpretation. We will need to build complex systems for securely managing large-scale data both from genomes and (more importantly) from many other high-throughput genomic assays, for accurately mining these data, and for returning results in a format that is easy for clinicians and patients to understand. Billions of dollars will be invested, and some people will get very rich developing companies to create these systems. But the idea that we will be looking at a $500K genome interpretation pipeline is completely absurd.

The annoying thing about this faux obstacle is that there are real challenges ahead. For instance: how can we integrate genome data with information from dynamic, real-time monitoring of patient health? How can we protect patient privacy and build rigorous systems without suppressing innovation? And how can we ensure that new technologies are used to actually improve health outcomes for everyone, rather than simply increasing healthcare costs? None of these questions has an easy answer, and we don’t have much time to figure them out – so let’s not waste our time building costly, imaginary genome interpretation pipelines in the air.


33 Responses to “Genome interpretation costs will not spiral out of control”


  • Michael Wosnick

    I honestly don’t know how the original authors came anywhere close to the numbers they arrived at.

    Unless I misunderstood completely, they appear to have added all the annual salary costs into EACH genome. Are they really assuming that, even in the absence of ANY of the efficiencies you talk about in your post, all of this sundry cast of characters will only be able to analyze a single genome in a year?

    I keep thinking I must have misread all of this since the alternative to me being a careless reader is that the original authors are incredibly naive and/or intentionally very misleading.

  • Rosie Redfield

    Michael, might they be assuming that each genome was of a new species, rather than of a new human individual?

  • Andras Pellionisz

    Just like the first human sequencing was, now the interpretation of the sequence(s) is a nascent “cottage industry” – done mostly by hand. For example, Watson’s genome was analyzed by a couple of full-time Baylor experts over a full year. Costs? Uncounted and uncountable. Daniel MacArthur is absolutely right that just as sequencing already became an automated industry, (its hitherto missing other half) genome analytics will also be automated, and costs will drop. There is a big ‘however’, though. The industry of sequencing had to figure out a TECHNOLOGY; however, Genome Analytics has to figure out the algorithmic (software enabling) theory of Recursive Genome Function (as for genome regulation, Craig Venter went on record to say “our concepts are frighteningly unsophisticated”). Even technology development can be, and usually is, disruptive (just remember the computer mainframe to home computer on a chip disruption). Scientific revolutions, like quantum mechanics to enable industrial use of nuclear energy, are so disruptive that they even changed Western philosophy (from determinism to Heisenberg’s uncertainty principle). Pioneers encounter incredible head-wind in the paradigm-shift from Junk/Genes and Central Dogma to “The Principle of Recursive Genome Function”, and once the old dogmas are surpassed, identify (what is essential in any System Theory approach) that the Genome/Epigenome is a fractal system. At this point, a decade after inception, there is independent experimental proof of concept that fractal defects are implicated in cancer (published last November in Nature by a Harvard/MIT/Dana Farber group). What is done right now is industrialization by HolGenTech of the paradigm-shift algorithmic approach. Price of genome-based choice of best cancer therapy in clinics? A function of the investment into the next boom of industry versus time.

  • Daniel MacArthur

    Michael,

    It’s completely unclear how the estimates were generated – but the total number ($517K) is certainly not far off the combined annual salary of the specialists listed ($447K), plus some hand-waving amount for “software costs”, which is transparently batshit insane.

    Wait – perhaps they’re actually envisioning that each genome will require the combined services of a bioinformatician, a physician, a biologist, a statistician, a geneticist and a technician for a full year, perhaps at some kind of spiritual retreat in which they commune with the patient’s DNA. Hint to VCs who receive a pitch for a business doing this: do not buy.

  • It’s been at least five years since it took an entire year to analyze a single human genome. And even back then, all it took was a couple of post-docs and a few graduate students. (Yes, I’m being tongue-in-cheek.)

    Seriously though, their estimation doesn’t make any sense.

    That said, I do think the constant clamoring about the “thousand dollar genome” is also problematic. Sure, the raw sequencing of a human genome may be a thousand dollars at some point in the next couple of years. But in no way will that translate as rapidly to interpretation, clinical delivery, et cetera. Nor should it – certainly the tipping point for genomics becoming clinically relevant isn’t a thousand dollars, considering that by itself it delivers tens of thousands of dollars’ worth of diagnostic markers were they done by other assaying means.

  • Daniel MacArthur

    Hi MJ,

    All very reasonable points.

    The key thing to remember is that the clinical utility of genomics isn’t going to come (at least initially) from sequencing healthy people and predicting their risk of illness. It will come from sequencing really, really sick people – rare genetic disease patients and cancer victims – and figuring out what’s wrong with them. It is already more cost-effective in many cases to simply sequence an exome rather than push a rare disease patient through the standard gene-by-gene diagnostic pathway, and interpretation of the data in these cases is comparatively straightforward (although still not turn-key, of course).

  • Peter Rogan

    Daniel,
    I suggest that you take a look at the Clinical sequencing program at Medical College of Wisconsin. They have developed a thoughtful multistep process that truly considers the impact of and effort required to interpret variants of uncertain significance. However, before any patient DNA is sequenced, multiple physicians are required to nominate the individual to a genomic sequencing review board. The board considers a variety of issues from heritability to whether there are likely to be benefits (mitigation, counseling) to the patient or their family before approving the project. See: http://www.genome.gov/Multimedia/Slides/GenomicMedicineII/GMII_HJacob_ClinicalSequencingMCW.pdf

  • Gholson Lyon

    I just want to go on the record as supporting Daniel’s points in full. Great post! Getting to what Daniel advocates will require a collaboration between industry and academia in the WGS space. Right now, the only two companies doing WGS with any sort of reliability are Illumina and Complete Genomics, with at least Illumina already having a CLIA-certified process for WGS. The critical thing will be for these companies to demonstrate high-throughput, along with perhaps adapting longer-read length technologies to augment the completeness and accuracy of their whole genomes. If they have any cash at hand, it might be worth acquiring one of these promising longer read nanopore technologies now, although I realize the risk.

  • Daniel MacArthur

    Hi Peter,

    Yes, I’ve seen the Wisconsin system and spoken to Howard about it. I think they’ve done a tremendous job of figuring out the major challenges of implementing genomic medicine. I also think (and Howard freely admits) that their approach cannot conceivably scale to thousands of disease patients – instead, the lessons learned from this extremely intensive program will (hopefully) be boiled down to a far simpler, faster and cheaper system.

  • Matthew Herper

    Related question, though: how low do the analysis costs have to get? Below $10,000? Below $5,000? Saying they won’t be $500,000 (I agree) doesn’t help if $50,000 puts the tech out of reach of most people.

  • Daniel MacArthur

    Hey Matt,

    I actually don’t think there will be any one number. For consumer genomics, I think you could build a perfectly reasonable interpretation engine with a per-unit cost in the low hundreds. If you want a clinical-grade genome in which every variant of potential significance is experimentally validated, passed on to domain specialists for detailed interrogation, and any remaining candidates are subjected to functional studies, you’re going to need some very good insurance indeed…

  • Andras Pellionisz

    Daniel MacArthur is right again – “I don’t think there will be any one number” (as the price of genome analytics). There isn’t any single price for “surgery”, either. We remember when the first heart transplant was so expensive that nobody paid for it (just as Dr. Watson did not pay a $M for analysis of his genome). Today, (multiple) organ transplants are fairly routine procedures. Genome-based choice of the most fitting cancer therapy will be rather expensive before industrialization (and automation) forces prices down. As for advice on who will profit from footing the bill of getting utterly serious about genome analytics, a “Hint to VCs who receive a pitch for a business doing this: do not buy” may be too late e.g. for Farzad Naimi. While he is not a “Sand Hill Road type VC”, he already launched a most impressive line-up from Stanford, for the genome analytics company “Bina Technologies”. Should I risk saying “a game changer in Silicon Valley with global implications”?

  • michael lerman

    To arrive at the cost of interpreting one fresh genome, the combined annual salary of the group should be divided by the number of genomes they can do in one year; this number is not known yet.
    Michael Lerman

  • Kevin Davies

    Far be it from me to criticize a commentary in a competing journal, but jeez, where does one start? As already pointed out, equating a clinical genome analysis to a handful of annual staff salaries makes no sense, as does the projection that the costs will actually increase (dramatically) despite Moore’s Law improvements in sequencing capacity, accuracy, etc.

    I’m as guilty as anyone of putting the “$1,000 genome/$1 million interpretation” meme out there, after interviewing Bruce Korf, past president of the American College of Medical Genetics, for my book ‘The $1,000 Genome.’ (I thought it had a nicer ring than Elaine Mardis’ “$100,000 interpretation” figure, also widely quoted.)

    Bruce’s quote was not intended to be taken literally, but to illustrate that in a clinical setting, the challenge of interpreting and communicating the significance of DNA variants to a patient’s family goes far beyond an accurate cataloging exercise in an era of ubiquitous, affordable whole-genome sequencing. The GEN commentary didn’t mention counseling, for example, which will (if it isn’t already) become a major headache for medical centers offering this technology.

    In my interview (August 2009), Korf talked about “an informatics overload trying to understand what’s clinically significant… You’re faced suddenly with information from 500,000 data points and umpteen conditions with very up and down increments of risk. It becomes a challenge – where do you start?… What it could lead us to is the $1,000 genome and the $1 million interpretation.”

    He went on to discuss the leap from genome sequence to what physicians will encounter clinically. Current newborn screening utilizes tandem mass spec, with about 30 results deemed clinically significant and actionable. But for a newborn NGS screening, “you could be picking up millions of variants. What do you do with them?” he said.

    As an example, Korf talked about his group’s experience studying unknown variants in the NF1 gene. “It’s hard to know exactly what they do. With a lot of effort, you can sort that out… functional assay, gene expression, aberrant splicing… computer modeling, sometimes evolutionary studies. There’s tremendous effort to characterize each and every one of the variants [in a single gene]. Now magnify [across the genome]… You’re going to see a ton of stuff you’re not going to know what it means… That’s a challenge for a generation of medical geneticists. No-one’s going to be able to read the sequence and say this is what you’re going to get, this is what you’re not going to get… It’s going to take at least a generation, possibly more, before we have a clear handle on that. We’re deciphering an extremely nuanced, complicated code.”

    If the moderator will permit a gratuitous plug, this issue will be a prime topic at the upcoming Clinical Genome Conference, an exciting new meeting we’re hosting next month in San Francisco:
    http://bit.ly/TCGC12
    Among the highlights, we’ll be hearing some of the first public presentations from a number of the software companies vying to provide that all-important medical genome interpretation, including Personalis, Cypher Genomics, Knome, and Dietrich Stephan’s new company SVBio.

  • Michael Wosnick

    One of the authors, seemingly in response to Twitter and comments, has all but retracted the comments: http://bit.ly/IVaOWs

    They now say:

    “I agree that there are too many errors to the calculations. It was difficult for us to assume the percentage of their salaries involved in analysing one genome. I definitely agree that it should not take an entire year to analyze one genome. Our estimates are higher than at least a couple of orders of magnitude. Twitter says that it is an ill-reviewed article. Well, it definitely is!
    In order to polish our estimates, I still think considering FTEs is an interesting way to calculate, but we could prorate the salaries according to the time taken to analyze one genome (few weeks or months). Also, the calculations are contingent upon the number of genomes analyzed at a time and this also depends on the research being conducted.”

    I say: astonishing that they could print this rubbish in the first place.

  • Forgive my ignorance in these matters – I work in plant genomics but keep an eye on human genetics, more to monitor methods and techniques than results.

    But in reality, haven’t companies like 23andMe actually set up some of the framework for this? I understand that the genotyping services they provide are not to clinical standards, but obviously they have set up an automated system for analyzing genome-scale data for risk prediction, etc.

  • I’m going to endow the Chris Cotsapas Institute, which will recruit physicians, bioinformaticians, statisticians, technicians (and the odd psychiatrist) to sequence my genome then spend the next ten years analyzing it. This will be a 10x better analysis than these guys are even dreaming about! Can’t cost more than $10M or so, right?

  • Richard Resnick

    As CEO of one of the dozens of companies in the genomic interpretation space (and the only one doing actual clinical interpretation), I can say for sure that MacArthur is correct.

    Diagnostic interpretative odysseys are and always will be very expensive because you’re bringing highly skilled and highly paid expertise to bear on what is essentially a medical discovery problem – attaching a diagnosis to a rare phenotype.

    But in actuality much of clinical genomics is routine. It’s processing large datasets to identify things for which you already know where to look. There, the problem is one of automation, validation, massive compute, storage, backup, security, repeatability, and regulatory compliance. Exomes are in the ~$400 range for high coverage clinical interpretation. Genomes maybe 10 times that. And NGS gene panels – well, we eat those for breakfast already. Over time these prices will fall but probably only by 50% in a five year window, as significant components of the price are earning Amazon Cloud-like rather than ORACLE-like margins; there isn’t a lot to squeeze out.

    So sure, if sequencing keeps dropping to, say $100-$500 per genome, interpretation prices will dominate but by less than an order of magnitude.

    Given that MacQuarie reports a $3.6B clinical sequencing market in 2017-ish, there is plenty of opportunity to raise capital and make money here without interpretation costs hitting $1m per genome. You don’t need for interpretation to be 1,000 times more costly than sequencing for there to be an enormous market.

    Richard Resnick
    CEO
    GenomeQuest, Inc.

  • Stuart Nicholls

    I think an important aspect of the more general discussion about the $1000 genome and $1 million interpretation is that the million dollar aspect is not literal; it is metaphoric. We often say “that’s the $1 million question” (i.e. it is the crux, or the most important thing) and I think that is what much of the discussion around interpretation is getting at. Yes, we may be able to sequence a genome for $1000, but the important question (i.e. the million dollar question) is actually what the utility of this information is and how we go about interpreting it. So whilst there is some use in trying to calculate costs of physician time etc., it is a too literal interpretation of what much of the writing is about.

    Stuart Nicholls
    University of Ottawa

  • Daniel MacArthur

    Hi Stuart,

    Right – Kevin also pointed out that Bruce Korf’s coining of this term was metaphorical rather than literal. I wouldn’t criticise Korf or anyone else for using this phrase to make a rhetorical point, although it is ripe for misinterpretation. However, when someone says that it will literally cost $500,000 to interpret a genome, we have a serious problem!

  • Geneticist from the East

    “Let me be clear: in the next 5-10 years, millions of genomes will be sequenced in a clinical setting” – I think you are too optimistic. The resistance from the medical community is bigger than you think. Also, clinical compliance will add tremendous cost to the price of sequencing as well as interpretation.

  • Andras Pellionisz

    Looking ahead ten years, I don’t think it is overly optimistic at all to envision (much) more than a million DNA sequences in clinical settings (worldwide). The reasons are simple. Certain genome testing (very partial DNA interrogation) is already in practice (e.g. in pharmaco-genomics). Soon, it will be less expensive to do a full sequencing than to do a number of partial interrogations. While resistance from the USA medical establishment, according to the bestseller by Eric Topol, is fierce (“neither side can afford to back down”), simple tests, e.g. pharmaco-genomic checking, are already gaining acceptance even in the USA. In addition, remember that the USA, with her 350 million people, maybe 500 million in ten years, is a very small fraction of the worldwide 7 billion now (surely over 10 billion in ten years). Globally, people will simply travel to those countries where complex medical procedures are combined with affordable full sequencing and as much analytics as is within their reach. If the USA does it, people will come here (as the Shah of Iran came to Stanford decades before sequencing). Americans used to have to travel to Canada or Mexico to undergo certain procedures – when such were not available in the USA. Presently, one of the most prestigious USA clinical systems (Mayo) has slated one billion dollars to build a hospital (for international patients…) at the new Hyderabad Airport (a speck in the middle of a triangle of Dubai, Malaysia and Japan/Korea/China). “As the future catches you”, we could say with Juan Enriquez.

  • Gholson Lyon

    It is quite cool to see that some older (than me!), wiser (than me!) and very intelligent people are taking notice of this website and leaving comments too. Awesome. I do agree that we are undergoing a revolution (finally) in medicine, along the lines of what is described in Eric Topol’s book. And, also see Michael Nielsen’s book on Reinventing Discovery with Networked Science. That being said, there are many people in the medical establishment who are unfortunately quite powerful and basically remain silent on blog posts and other news channels. Rather, these people work behind the scenes consciously or unconsciously to preserve the status quo, obstruct progress for patients/families, and basically enrich themselves and their organizations. It is really quite sad. But, I am hopeful that advocacy at the public level will make such behavior unacceptable and not tolerated, as we move toward a society of individualized medicine in which every human (at least in the industrialized world) controls their own data for their own body and medical health, rather than being controlled by the old, paternalistic system of “medical care” in which doctors “believe” that only “they” can interpret your own data. I spent the past 15 years working in the medical establishment, first with MD/PhD training, then 5 years of residency and then past three years practicing clinically, so I am quite aware of the many obstacles involved. But, I am hopeful for the first time, inspired by Eric Topol’s and Michael Nielsen’s books.

  • Matt Healy

    An area that I think is being underappreciated is infectious diseases. Already in rich countries it borders on malpractice to prescribe anti-HIV drugs without checking whether the patient has a resistant strain of the virus. Within five years I expect it will be considered malpractice in rich countries to prescribe antibiotics without first determining which bacterial strains the patient has. As for clinician resistance, that will change after a few lawsuits claiming “my client’s mother would still be alive if her physicians had done genetic testing for C. diff. when she first came to the ER.”

  • Dan Vorhaus

    Excellent post. Probably the one additional comment I’d offer – and it seems almost trivial, but I don’t see it raised quite enough – is that the genome interpretation bill, whatever the size, will not come due all at once nor be paid all by the same source.

    As we move forward into clinical genomic interpretation, even if the genomic sequencing happens only once, the interpretation, as Daniel points out, will be ongoing. The data will be reviewed in conjunction with phenotypic expression and in conjunction with both new research findings and new individual data (including environmental data and other non-genomic data specific to the individual).

    Too often, I think, the concept of genome interpretation is presented as something that happens at a single point in time, with the individual (or her insurer) handed a bill at the end for $10,000 or $100,000 or $500,000 or $1M or whatever. Particularly for individuals who are healthy when sequenced, it seems more likely that the interpretation costs, whatever they are, will be spread over a long period of time, leaving genomic interpretation more akin to raising a child over a period of decades than buying a fancy new sports car in a single afternoon.

  • Love the article. I have two points I would like to include; I would appreciate your response ☺

    1. Public versus Designer Databases. My concern is the cohesion and standardization of publicly available databases such as dbSNP, and for-profit maintained designer databases such as HGMD, for example. “One database to rule them all” would be a beautiful thing, but while “private” entities are at work the unification of public and private databases seems less likely to be on the horizon. If HGMD has a monopoly on certain disease-causing SNVs (which they do), why would they give up that information for free? Someone has to pay for it. It seems counterproductive that scientists are spending a portion of the taxpayer’s hard-earned money to pay $2,000 for an access code. Someone who makes decisions in DC needs to champion this cause ASAP.

    2. On data storage. Best-case scenario, a whole human genome only occupies 3 gigs of computational space. Let’s assume the genomics community continues to store this information on clouds (amazon.com + NHGRI style). This creates a whole industry for cloud security, which also costs money. And 3 gigs per genome for 10,000,000 individuals will eventually start to add up financially. The future I see is in what I like to call “boutique-sequencing platforms.” Following Occam’s razor, we should only be sequencing what we need. It’s all about context. The blood bank should only be concerned with genes that have to do with blood transfusions, for example (why sequence 3 billion bases when you can sequence 3 million?). This would be an efficient way to streamline bioinformatics pipelines, making data interpretation simpler for clinicians and technicians while saving billions in the process.
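    For scale, the arithmetic behind that second point (illustrative only, using decimal petabytes):

```python
# Illustrative arithmetic only: the storage footprint of 3 GB of sequence per
# genome across 10 million individuals, as described in the comment above.

per_genome_gb = 3
individuals = 10_000_000

total_gb = per_genome_gb * individuals
print(f"{total_gb:,} GB in total (~{total_gb / 1e6:.0f} petabytes)")
```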

  • Interesting post, Dan – and as an aside, I’m a regular reader of this blog and enjoy it!

    You say “Let me be clear: in the next 5-10 years, millions of genomes will be sequenced in a clinical setting”

    I agree, but, certainly for “socialised medicine” like the NHS in the UK, I would be very surprised if the whole genome gets analysed, rather than just parts of it. At the moment we have a newborn CF carrier screening programme, where neonates with raised immunoreactive trypsinogen (possibly indicative of CF) are screened for the most common mutations in the CF gene. NHS molecular diagnostics labs have a variety of other well-used and carefully tailored assays for other well-known disease genes, of course. All good stuff, but painstaking and expensive.

    Once a “clinical quality” genome is affordable, and I agree with the 5 year timescale, then I expect many newborns to have their genomes sequenced, but only part of their genome analysed, for example the CF gene and maybe a handful of other loci.

    Now, of course, the interesting question is what becomes of the rest of the unanalysed genome data? Is it kept by the NHS or given to the parents of the newborn on a USB stick? In 80 years’ time, we will know a lot more about genomic medicine than we do now, and that child will be getting old with a variety of chronic diseases. Then the data become really useful. (Let’s hope the computers of the day still read USB sticks!)

  • Andras Pellionisz

    Ed Hollox amplified on Dan Vorhaus’ point; “Genome Sequencing in a Repeat Customer Mode”. Since “our concepts are frighteningly unsophisticated” (said Venter about mathematical genome regulation theory; he never claimed it was his forte), one key hidden and false axiom is that a person’s genome is forever, as if A,C,T,G-s were etched in marble. Not so. Trivial, but now documented by Genentech/Complete Genomics that every three cigarettes smoked produces a “de novo” mutation. Less trivial, but within the last six months documented by 3 top-class Nature papers (Boston, Cambridge UK, Michigan, see coverage on hologenomics.com “news”), with advancement of the “genome meltdown” a.k.a. cancer, de novo CNV-s (large segments of copy number variations/alterations) appear in droves, along with scores of other “structural variants”, as well as hypo- or hyper-methylation of even otherwise unchanged sequence. In a clinical setting, everyone is a “repeat customer” for e.g. a blood sample (who would ever think that one is good for life???). Likewise, as occasionally full human DNA sequencing is already done in “repeat customer mode”, e.g. tracing advancement or hopefully remission of cancer calls for at least a regular check-up on “suspect” genomic sequences (and their de novo methylation status). This realization, as awareness rises, will help catapult the “demand-side” of industrialization of genomics, as people will increasingly try to establish an as-early-as-possible “personal reference genome”. Excellent genome sequencing firms have a “glut” of full DNA-s (predicted by the 2008 YouTube “Is IT ready for the Dreaded DNA Data Deluge”). True that some deliberately wait with “establishing their own personal reference-genome” till the price-tag drops further. Indeed, it may – but the drop in price may not be commensurate with missing the chance of having a relatively pristine back-up reference copy.

  • Giulio Genovese

    I agree with the post. It is as if, 20 years ago, someone had predicted that every company would be paying millions of dollars to search specialists to create large indexes of the data around the internet that could potentially be relevant to the company’s employees (or something like that). We all know what people use nowadays and how much it costs.

    To me it is crystal clear that, given the scale of the amount of information that needs to be parsed, this will be a business for those few companies that can afford a similar scale in the number of customers (i.e., people requiring genome interpretation).

    I believe that many physicians are misled into believing that people will in the future come to the hospital asking to have their genome scanned and interpreted (but I am open to hearing other views), while in my opinion it is more likely that people will send their DNA to be analyzed by a specialized company with a larger user base than a single hospital, and then, maybe, go to a hospital to seek further help with the interpretation provided. And that for a simple reason: the interpretation will eventually cost less than a visit to the hospital when scaled.

  • Matt Healy

    Perhaps it will be like stockmarket data, for which at present there are various options at various costs. I can get a few decades of daily closing prices, and tools for basic analysis and technical charting (Aroon oscillators, Bollinger bands, various flavors of moving average, etc.), online without having to pay any fee. For somewhat more money I can subscribe to the Wall Street Journal, The Economist, or the Financial Times and use the subscriber-only parts of their websites. But if I want to perform sophisticated multivariate analysis of high-frequency trading data then I will have to pay rather more money.

    Similarly, there may in future be free or low-cost services that do basic analysis of my genome, or I can pay a modest fee to get a deeper analysis, or I can pay a substantial fee to have a human expert in Beijing or Bangalore take a closer look at my genome.

    I am somewhat concerned about the privacy implications of one plausible business model: a website where I can send my genome for analysis without any direct cost to me or my insurers, but I consent to letting advertisers show me targeted ads based on my genome! So if my genome indicates I am at elevated risk for some medical condition then I will see ads for products that treat that condition.

    An under-appreciated aspect of personal genomics is that I expect personal genome analysis will be of much greater potential value for a younger individual than for an older one. If somebody has lived 50 or 60 or 70 years without developing any signs of heart disease, do they really care whether they have some polymorphisms in genes associated with a high risk of heart disease? Finding polymorphisms associated with elevated risk of heart disease at age 20 might have greater utility.

    It’s not yet clear to me what the value of personal genomics, as opposed to using genomes for scientific research, might be in the absence of a particular medical situation. If I’m diagnosed with cancer, I’d probably have my DNA analyzed in case I learn something useful for deciding which drugs to use. But I’m not convinced looking for cancer-related polymorphisms will be of much value for me personally right now. Nor is it clear to me whether looking for polymorphisms in my DNA associated with risk of heart disease will tell me much more than what I know right now from my Framingham Risk Score.

  • Geneticist from the East

    Hi Matt, for older people the use of personal genomics will be more on the personalized medicine side, because older people are more likely to be getting treatments or taking drugs.


