The first steps towards a modern system of scientific publication

About a year ago on this site, I discussed a model for addressing some of the major problems in scientific publishing. The main idea was simple: replace the current system of pre-publication peer review with one in which all research is immediately published and only afterwards sorted according to quality and community interest. This post generated a lot of discussion; in conversations since, however, I’ve learned that almost anyone who has thought seriously about the role of the internet in scientific communication has had similar ideas.

The question, then, is not whether dramatic improvements in the system of scientific publication are possible, but rather how to implement them. There is now a growing trickle of papers posted to pre-print servers ahead of formal publication. I am hopeful that this is bringing us close to dispensing with one of the major obstacles on the path towards a modern system of scientific communication: the lack of rapid and wide distribution of results.*

Solving the distribution problem

It’s worth restating why one might say that the system of pre-publication peer review has a “distribution problem”. What I’m referring to is the severe lag between the time a scientific result is prepared for publication and the time it is distributed. In my experience, this lag is on average about six months, with a non-trivial long tail of papers that take much longer. To put this in context with some back-of-the-envelope calculations, let’s define a unit of time called a Scientific Career (SC), and let 1 SC equal 30 years. If there are 50,000 papers published in biology per year (this number is somewhat arbitrary, but probably within an order of magnitude given that about 500k papers are added to PubMed per year), and on average each paper takes 6 months to go through the review process, then each year ~800 Scientific Careers are spent bringing papers from initial submission to formal publication. It would be laughable to argue that 800 SCs of research value have been added to the papers during this process (let’s be honest–for most of that time the papers are just sitting on someone’s desk waiting to be read). The system of pre-publication peer review thus dramatically retards scientific progress.
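
For anyone who wants to check the arithmetic, here is a minimal sketch of that back-of-the-envelope calculation (the paper count and lag time are the rough assumptions above, not measured values):

    # Back-of-the-envelope estimate of review lag, measured in Scientific Careers (SC).
    # All inputs are the rough assumptions from the text above, not measured values.
    PAPERS_PER_YEAR = 50_000   # assumed biology papers published per year
    LAG_YEARS = 0.5            # assumed average submission-to-publication lag (6 months)
    SC_YEARS = 30              # one Scientific Career, defined here as 30 years

    lag_in_sc = PAPERS_PER_YEAR * LAG_YEARS / SC_YEARS
    print(f"~{lag_in_sc:.0f} Scientific Careers spent in the publication pipeline per year")
    # prints: ~833 Scientific Careers spent in the publication pipeline per year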

The solution to this problem relies on a simple observation–in my field, I am completely indifferent to whether a paper has been “peer-reviewed” for the basic reason that I consider myself a “peer”. I do not think it extremely hubristic to say that I am reasonably capable of evaluating whether a paper in my field is worth reading, and then if so, of judging its merits. The opinions of other people in the field are of course important, but in no way does the fact that two or three nameless people thought a paper worth publishing influence my opinion of it. This immediately suggests a system in which papers are posted online as soon as the authors think they are ready (on so-called pre-print servers). This system is the default in many physics, math, and economics communities, among others, and as far as I can tell it’s been quite successful.

In genetics, a handful of people, most notably Graham Coop, Titus Brown, and Leonid Kruglyak, have been making the case for posting research to pre-print servers (mostly arXiv) ahead of formal publication. A recent news piece in Nature even went so far as to call this a “trend” in the field. Should this develop into a real trend and become standard, one can imagine a system in which rapid dissemination of pre-prints occurs alongside the current peer-review system for judging the importance and technical quality of papers. Most journals I’m familiar with seem amenable to this sort of approach (with the frankly bizarre exception of Genome Research), and it would be a dramatic improvement over the status quo. There is no reasonable objection to this: if you think the current system of pre-publication peer review is just grand, surely the same system with the added feature that you get to see all the work in your field ahead of formal publication and judge it for yourself (if you so desire) is even better.

Looking forward

Of course, the problem of rapid dissemination of results is only one of the issues with the current system of peer review. Most importantly, for pre-prints not directly in my area of expertise, I have only a limited ability to evaluate their quality, and I do not completely trust three nameless reviewers to evaluate it for me. There are a number of potential solutions here, some of which were discussed in my previous post. But in terms of a first step, the hopefully rapid adoption of pre-print servers in genetics can only be a good thing.

*UPDATE 8/18/12 Slight changes to language for clarity

30 Responses to “The first steps towards a modern system of scientific publication”


  • I brought up this topic with a colleague, and he pointed out this nice comment from Nature Biotech.
    http://65.199.186.23/nbt/journal/v19/n12/full/nbt1201-1087a.html

  • Joe Pickrell

    Ha, I have no intention of calling a press conference to announce my preprints!

  • There is still the issue of where to send pre-prints of non-quantitative bio. I have been posting on my website, but there will hopefully be a better solution soon, with PeerJ coming.

  • Also figshare is an option.

  • I thought most high-profile journals (e.g. Nature, Science, Cell, PNAS and their specialized journals) require an embargo and wouldn’t let you publish your paper if you put it on the arXiv first – is this wrong, and does only Genome Research have this policy? Or did you just not ask the other journals explicitly?

  • Joe Pickrell

    Here’s Nature:

    “Nature never wishes to stand in the way of communication between researchers… Communication between researchers includes not only conferences but also preprint servers. As first stated in an editorial in 1997, and since then in our Guide to Authors, if scientists wish to display drafts of their research papers on an established preprint server before or during submission to Nature or any Nature journal, that’s fine by us.”

    Here’s PNAS:

    “Preprints have a long and notable history in science, and it has been PNAS policy that they do not constitute prior publication. This is true whether an author hands copies of a manuscript to a few trusted colleagues or puts it on a publicly accessible web site for everyone to read”.

    And Science:

    “We do not consider manuscripts that have been previously published elsewhere. Posting of a paper on the Internet may be considered prior publication that could compromise the originality of the Science submission, although we do allow posting on not-for-profit preprint servers in many cases.”

    It does seem that Cell doesn’t allow preprints.

  • Joe Pickrell

    For what it’s worth, the publishing groups that explicitly allow posting of preprints are Nature Publishing Group, PLoS, Elsevier, and Oxford Press, as far as I’m aware. The one that does not is Cell Press. Obviously anyone wondering about a given journal should check the relevant website, but the vast majority of journals, even high-profile journals, are ok with it.

    This makes sense–journals are aware that the reason they exist is the selection/improvement of the right papers for their audience, not the actual distribution of papers (which is of course trivial nowadays).

  • Toilet scientist

    Hi Joe,

    While humorous, I think your “800 careers per year are spent in peer review” comment is misleading. It’s often possible to compare a big number aggregated across a large group (time spent reviewing all biology papers) to a huge amount of time for a single individual (a research career) to make it seem impressive. In truth, though, that reviewing time is spread across a lot of scientists, and so isn’t so crippling. Put another way, assume there are 1,000,000 research biologists on earth (probably right within an order of magnitude?) and each one spends 15 minutes a day in the toilet. That means a whole career (~30y) is spent in the loo every day!
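
    A quick sketch of that arithmetic, using the same assumed numbers, for anyone who wants to check it:

        # Checking the toilet analogy: aggregate loo time per day across all biologists,
        # using the rough assumptions above (one million biologists, 15 minutes a day).
        BIOLOGISTS = 1_000_000     # assumed number of research biologists worldwide
        MINUTES_PER_DAY = 15       # assumed minutes per scientist per day
        SC_YEARS = 30              # one Scientific Career, as defined in the post

        years_per_day = BIOLOGISTS * MINUTES_PER_DAY / (60 * 24) / 365.25
        print(f"~{years_per_day:.0f} person-years (~{years_per_day / SC_YEARS:.1f} SC) per day")
        # prints: ~29 person-years (~1.0 SC) per day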

  • @Toilet scientist,

    While my tone is somewhat tongue-in-cheek, I’m actually dead serious. Do you think the 15 minutes each scientist spends shitting per day is an essential part of the scientific process? :)

    My main point is this: “if you think the current system of pre-publication peer review is just grand, surely the same system with the added feature that you get to see all the work in your field ahead of formal publication and judge it for yourself (if you so desire) is even better”

  • nextgenseek

    Nice post. Just hope the current small preprint friendly groups grow fast.

    Here is Cell’s policy on prepublication; it does not look kindly on preprint servers (http://www.cell.com/authors#permissions):
    “Work intended for submission to Cell, currently under consideration at Cell, or in press at Cell may not be discussed with the media before publication. Providing preprints, granting interviews, discussing data with members of the media, or participating in press conferences in advance of publication without prior approval from the Cell editorial office may be grounds for rejection.”

    We are trying to collate different journals’ policies in a post on our blog here: http://nextgenseek.com/2012/08/preprint-server-arxiv-friendly-journals/
    It is a small list now, and we hope to add more.

  • To follow up more seriously, the current system where research articles are invisible to most of the world until official publication is a major inefficiency. One can argue whether it’s just a major inefficiency or a horrible, system-choking inefficiency, but in any case it’s worth attention. The proposal here is simple: allow scientists to communicate with each other as they please. This seems as uncontroversial a proposal as can possibly exist, almost as radical as proposing that scientists meet in groups (let’s call them “conferences”) where they exchange ideas without editorial oversight.

  • This is absolutely right as far as it goes, and Joe is to be applauded for this post, but the problems with scientific publication run even deeper.

    Summary

    – higher ed is about to go bankrupt
    – most academic research outside of a few top universities is never cited, non-reproducible, and should never have been funded
    – citation analysis shows only a tiny group of scientists actually moves any given field forward
    – science will increasingly go back to the future, becoming gentleman science
    – the angel investor is the new professor, and the entrepreneur is the new grad student
    – one model: make a few million in the startup sector and then do science unencumbered for the rest of your life
    – another model: find a wealthy patron willing to fund you via vehicles like Thiel Fellowships (just like Soros or Hughes Fellowships)
    – a third model: bring down the costs of doing research with things like openpcr.com and hack on stuff in your garage after your day job
    – the new model for pure research is github, citizen science, open source, and reproducible research, not universities

    Bugs and patches, not retractions
    Take a look at the issue list for d3, a popular open-source data visualization tool, on github. It is accepted that even a shipping piece of production code will have many serious bugs. There is a process of constant improvement for things as mission-critical as the Linux kernel. While this should go without saying for the computer scientists in the audience, Linux sure does require in-depth knowledge of algorithms, and it is by no means just bookkeeping or theoretically uninteresting code.

    Contrast this situation to academia. While many give lip service to the concept of science as a continually correcting enterprise, in practice a correction (let alone a retraction) can be career damaging or career ending, especially for those pre-tenure. With the death penalty for failure, academics have every incentive to stonewall requests for materials, data, or source code. This is not limited to the biological sciences (e.g. “Even if WMO agrees, I will still not pass on the data. We have 25 or so years invested in the work. Why should I make the data available to you, when your aim is to try and find something wrong with it.” ).

    The solution: adopt the culture of open source, where source is assumed to be fragile and bug reports are met with patches. Reject the culture of academia, where “peer reviewed” papers are assumed to be correct, while corrections and retractions carry a career penalty.

    Academia is funded by a tuition bubble that is about to pop
    Academics rightly bemoan Elsevier’s extortionary journal costs. What they don’t realize, however, is that Elsevier is a billion dollar remora feeding off the trillion dollar academic bubble. Publishing will be reinvented as soon as academia goes bust. With the advent of Coursera, Udacity, and Khan Academy, which offer elite higher ed content online (plus certification) for free, that will happen within the next ten years, probably within the next five.

    Because there is just no way anyone will pay $250,000 and four years of their life in a down economy when they can get the same education and now a job by doing these online certifications in machine learning. The decline will be as rapid and irreversible as the fall of print media. The people reading GNZ tend to be ahead of the curve and will get more involved with non-traditional funding sources early. But others in academe will likely be in denial till the ship hits the iceberg.

    Solution: replace the unsustainable cross-subsidy which uses undergraduate tuition to fund research with new business models. Put basic certification online via vehicles like Coursera, and encourage the smartest in society via institutions like YCombinator to actually ship products that work, rather than non-reproducible papers.

    Most papers are not reproducible
    Most academic papers are flatly non-reproducible. See here and here. Many Nature and Science papers are highly conditional studies on cell lines that don’t replicate outside of the publishing laboratory (to be charitable) or do not replicate at all. When you try to commercialize a study, that’s when you find out how much stuff doesn’t really work:

    This problem is accepted to the point that the most successful venture capitalists have learned to reproduce results by independent observers before they commit to early-stage funding.
    Outside of orthopedics, the pharmaceutical company Bayer recently reported an example of this problem. In September 2011, Bayer published an evaluation of 67 published studies in which they failed to duplicate two-thirds of the results with their in-house experiments.

    As an aside, this observation is interesting as it neatly inverts the standard academic presumption that good/pure science is done within academia and bad/conflicted science is done in industry. In point of fact, to ship a product the science must be absolutely unimpeachable (or else it is obvious that the product is nonfunctional), whereas to ship a paper one must only pass the filters of 2-3 people and avoid the limelight. Just look at most of the papers in JBC or any mass-production journal for examples of the latter approach.

    Solution: reject any in-depth study that cannot be regenerated via `make` or the equivalent from source code, templates, and data hosted on a public server. Encourage the automation and thus the reproducibility of basic laboratory processes, and if infeasible to automate, provide video documentation of experimental protocols at the standard of jove.com using inexpensive video-editing software.
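
    To make that concrete, here is a minimal, purely hypothetical sketch of a one-command rebuild script in the spirit of `make`; the file names, columns, and analysis steps are placeholders, not a real project:

        #!/usr/bin/env python3
        # Hypothetical one-command rebuild: regenerate every figure and table in a paper
        # from raw data checked into a public repository. All paths and column names
        # are placeholders, not a real project.
        import os

        import pandas as pd
        import matplotlib.pyplot as plt


        def main():
            os.makedirs("results", exist_ok=True)
            data = pd.read_csv("data/measurements.csv")           # version-controlled raw data
            summary = data.groupby("condition")["value"].mean()   # the paper's summary statistic
            summary.to_csv("results/table1.csv")                  # regenerate "Table 1"
            summary.plot(kind="bar")
            plt.savefig("results/figure1.png")                    # regenerate "Figure 1"


        if __name__ == "__main__":
            main()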

    Conclusion

    Academia is rapidly headed for a reckoning. Scientific publication will be “solved” as a consequence, with the future looking like reproducible research hosted on github and cooperatives like biocurious.org. Most work will be done open-source, by citizen scientists, with larger projects funded by independently wealthy technologists and/or investors who see the possibility of turning a pure research finding into a scalable product.

  • Ernst Hafen

    Publication delay is one problem, but there are two others. I think pre-publication peer review is of value. Even if many of our papers have not been published in the journals we wanted, the peer feedback was usually helpful and made the papers better. Pre-publication peer review is therefore a service that I am ready to pay for – just as I pay for chips and RNA-seq services. On the other hand, reviewing papers and grants (and Twitter, blogs) are essential contributions to the scientific discourse that receive no credit today. Every student should receive a science author-ID with her BSc/MSc degree, which she would use for all her subsequent scientific activities (papers, reviews, blogs, etc.). All these contributions could be rated by the community post-publication, and we would get a more balanced view of a person’s standing in the scientific community. Here is a possible scenario:

    – An author is ready to publish her results and wants to obtain a peer feedback. She submits to a new open access journal and pays for the publication right (including peer review and editorial assistance).
    – Upon receipt of the reviews she decides how to respond (more experiments, modifications, etc.) and then publishes the results signed with her author-ID.
    – The paper will be published together with the reviews, which are signed with the author-IDs (scrambled?) of the reviewers.
    – Since there is post-publication rating and commenting (e.g. hypothes.is) with author-IDs, an author’s work and that of the reviewers will be rated by the community.
    – Ultimately, one could also envision a science Wikipedia that is curated by leading scientists in the field and to which one submits only results and conclusions, which will, upon peer review, be inserted. Wikigenes was a first attempt at this.
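
    A rough, purely illustrative sketch of the kind of record such an author-ID system might store (all names, fields, and identifiers are invented, not taken from any existing service):

        from dataclasses import dataclass, field
        from typing import List, Optional

        # Illustrative sketch of the scenario above: every contribution (paper, review,
        # comment) is tied to a persistent author-ID and can be rated post-publication.
        @dataclass
        class Contribution:
            author_id: str                      # persistent science author-ID
            kind: str                           # "paper", "review", "comment", ...
            target_doi: Optional[str] = None    # what a review or comment refers to, if anything
            community_ratings: List[int] = field(default_factory=list)

            def mean_rating(self) -> float:
                if not self.community_ratings:
                    return 0.0
                return sum(self.community_ratings) / len(self.community_ratings)

        # A (possibly scrambled) author-ID-signed review attached to a published paper:
        review = Contribution(author_id="AID-0042", kind="review", target_doi="10.1000/example")
        review.community_ratings.extend([4, 5, 3])
        print(review.mean_rating())   # 4.0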

  • @Ernst Hafen

    If you post papers to preprint servers, the reviews you get after submitting to a journal are no longer really “pre-publication”. Otherwise I agree–posting to preprint servers only solves the problem of publication delay. How to sort and judge papers and authors is a completely different problem, and the ideas you suggest are reasonable approaches to it.

  • Joe Pickrell

    @asdf

    I agree with a lot of what you write. This post was not meant to look 10 years in the future, but rather more like 1. It’s quite astonishing to me that there’s resistance in the scientific community even to a system for sharing preprints, which is a simple and obvious improvement to the system of scientific communication. I’d like to start with that before we can move on to more fundamental issues.

  • I will say up front that I think preprint servers are great and that transparent publication of all scientific results, both positive and negative, is very important.

    I do wonder, though: if this hasn’t changed the nature of scientific publication for physics (and I think physics uses the same model of peer review as everyone else, doesn’t it?), why will its uptake change things for genetics, or more widely for biology?

  • @Laura perhaps because the time is ripe for the model of publishing to change (although I’m sure folks thought that 20 years ago too). Certainly journals like PLoS One are starting to blur publishing boundaries.

    I think the main difficulty with changing the larger system is how universities/grant agencies will assess the quality of the people they employ/fund. Obviously departments should be carefully reading people’s papers when deciding on hires/promotions. But major promotions/hires, pay increases, etc., are often reviewed at an additional college/university level. These committees often have no expertise in a particular field, and so are forced to decide based on perceived proxies for success (e.g. journal status).

    Perhaps if pre-prints really took off, then early paper-based citation indexes could be used instead (if they could be collated across the preprint and the main publication), as all we really need is some measure of what particular communities see as important steps forward.

  • Following up on my last point: in fact, Google already seems to include preprint citations in Google Scholar profiles, so a Google Scholar-based h-index already includes them.
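
    For readers who haven’t met it, here is a quick sketch of the h-index calculation; the citation counts are invented, and the only point is that the metric operates on per-paper totals, so collating preprint and journal-version citations per paper is all that’s needed:

        # The h-index: the largest h such that at least h papers have at least h citations each.
        def h_index(citations):
            counts = sorted(citations, reverse=True)
            h = 0
            for rank, count in enumerate(counts, start=1):
                if count >= rank:
                    h = rank
                else:
                    break
            return h

        # Hypothetical per-paper totals after merging preprint and journal-version citations:
        papers = [12 + 3, 7 + 2, 4 + 1, 1 + 0]   # -> [15, 9, 5, 1]
        print(h_index(papers))                    # prints 3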

  • Joe Pickrell

    @Matt,

    As Graham suggested, figshare seems like a legit option. I’d never really looked at it, but I have been browsing a bit today and it seems like it can be used perfectly well as a preprint server. It doesn’t look like many people use it that way, but there’s no reason not to.

  • this asdf character should get a blog :-)

    here is a comment on my blog re: arXiv:
    There are good things to prepublication, but in my experience publishing in economics (with a strong prepublication tradition) and evolutionary biology, the cons outweigh the pros. So-called working papers are in limbo, oftentimes inflating publication lists artificially, sometimes lingering around for ages without ever making it into a proper journal. Sometimes they are cited and it is difficult to find them, or to track the zillion versions available somewhere online. They also mess up publication statistics, and I have heard at least once about a scoop. It’s a mess, and it promotes plagiarism (check the “admin note: text overlap” here: http://arxiv.org/abs/1206.6700).

    When I ask most economists why they stick to the wrong equilibrium, most say that they do it in order to signal that they are “doing something”. This is necessary just because editorial processes in econ journals take ages. It’s true we need to figure out this publication thing anew, but arXiv is a sloppy temporary solution. I wouldn’t go there unless strictly necessary.

    this is one of the most negative coherent comments i’ve seen in relation to arXiv. and i don’t think that it’s too negative. better these problems than what biology has right now.

  • Pubmed Central would seem like a preferred venue for preprint publishing in biology.

    However, at the present time their acceptance criteria are seemingly limited to peer-reviewed publications: http://publicaccess.nih.gov/submit_process.htm

    NIH sponsorship would go a long way towards helping adoption, and would be a natural extension of their current manuscript-deposit requirements for grantees.

  • @razib

    That comment from your site is absurd. How is it possible that “sometimes they are cited and it is difficult to find them, or to track the zillion versions available somewhere online”? If you have an arXiv citation, you know the exact url, and the different versions are conveniently labeled “v1”, “v2”, “v3”, etc. If it’s a citation to the journal version, just find the journal version, no?

    And for publication lists being “inflated”, who cares? That happens now; what you need is a better metric for measuring the quality of a researcher than just counting numbers of publications.

  • @razib

    better these problems than what biology has right now.

    This is exactly right.

  • @Razib Seems to me that the plagiarism/text-overlap detection on the arXiv is a good thing, so I’m not quite sure what the commenter’s point was. Clearly this can also happen with print journals, and as far as I know none are running plagiarism-detection software (in part perhaps because some journals are not open to text mining, etc.). There’s some discussion here: http://network.nature.com/groups/precedings/forum/topics/955. Obviously authors could plagiarize from the arXiv/preprint server, and then not submit there, but that can happen with any journal.

  • I agree with many of your points, and am in favour of a transparent publication process, but wanted to make one point about the value of peer-review. You describe peer review as “two or three nameless people thought a paper worth publishing”. In fact, peer review goes way beyond this – in many cases the peer reviewers check the methods, the stats, ensure the conclusions actually reflect the results, and so forth. Publishing a non-peer-reviewed manuscript is a great way to get your results out there quickly and allow others to judge it on its own merits. Over time, post-publication review will take place, the cream will rise to the surface, and people who don’t know the field will be able to feel confident in the conclusions.

    However, shouldn’t there be some process to stop someone inexperienced or from outside the field reading a flawed article soon after publication, and an incorrect message being cited, used in grant applications, etc., before the social or collaborative post-publication peer review has had a chance to work? People will review the articles they find most interesting, so how long will it take for the less groundbreaking but still solid and useful articles to get enough reviews for people to have confidence in them?

    I am not saying that post-pub review is a bad idea, but that there do need to be some checks and safeguards.

  • Joe Pickrell

    However, shouldn’t there be some process to stop someone inexperienced or from outside the field reading a flawed article soon after publication and an incorrect message being cited, used in grant applications, etc.,

    As you’re likely aware, incorrect peer-reviewed papers are cited, used in grant applications, etc. Actually having pre-prints might be better: others can see that it is “bleeding edge”, unreviewed work and thus treat it with caution/increased skepticism if they so desire.

    In general though, this post is not about post-publication review. I think a perfectly fine short-term situation is one where traditional peer-reviewed journals give a stamp of approval to papers. Deciding whether to give this stamp of approval takes time; during that time, I think papers should be available. Treat preprints the way you’d treat information from any other website–read it carefully, double-check it, and make sure you believe it. For papers in my field, I’m capable of doing that.

  • The way I view it, the goal of peer review is to minimize the risk of bad papers catching the attention of a large audience. A correct argument should then take into account how many SCs are saved by sparing busy scientists from having to go through papers of limited value. You could say that peer review is like democracy: it is not perfect, but it does keep the really bad stuff from making it through (most of the time).

  • I should reiterate something, because this point (though not particularly subtle) seems to keep getting lost: posting preprints is not a replacement for peer review. It is generally done at the same time a paper is submitted for formal peer review, so that results can be rapidly distributed to colleagues. It is thus complementary to peer review. If you’re worried you won’t be able to judge a preprint (and will thus waste your time going through useless papers), wait for the peer-reviewed version.

  • Interesting ideas. Another one to think about:

    What if, instead of a not-for-profit preprint server and peer-reviewed journals operating in parallel, the papers on the preprint server were peer reviewed? The authors would pay 3 or 4 peer reviewers (instead of publication fees) to review their paper. The reviews would be posted in full online, with the reviewers’ names released after peer review, along with an associated score reflecting, informally, the kind of journal the paper would be suitable for (10 for Nature Genetics, 8 for AJHG or Genome Res, 2 for Human Heredity, etc.).

    This would have several advantages. The fee from authors to peer reviewers could either be a source of income for a freelance scientist, or could be paid directly to the university. Either way, the value of an individual’s time in doing this would be measured. Universities would value it: at the moment there is no incentive for me to review a paper, let alone do a good review (I am, of course, meticulous and eager to review all manuscripts!). If reviewing papers were rewarded, either financially or in career considerations, so much the better.

    You could even have a personal impact factor based on the papers you review and how they are subsequently cited, reflecting esteem in the community rather than the one big splash paper.
