The peer review system for research

The disclaimer

In the last few years I have often read about the crisis in scholarly (mostly scientific) peer review.
I share the belief that the current system is surely suboptimal and must be changed. Much of what I say below is not original: I have read so many posts and books by Tim Gowers, Bjoern Brembs, Mike Taylor, Michael Nielsen, Michael Eisen, Noam Nisan, and many other people, that I can’t remember them all. I have a year’s experience as managing editor of an open access no-fee journal, two years’ experience as an editor (= referee) for a Hindawi journal, and many years’ experience as a journal referee. My research area is mathematics and various applications, so there may be some discipline-specific assumptions that don’t work for other fields. And it is not possible to cover every issue in a blog post, so I don’t claim to be comprehensive.

The latest online furore was occasioned by a so-called “sting” operation published in Science (unlike most articles in that magazine, this one is freely readable). I don’t think it worth commenting in detail on that specific article. It tells us little we didn’t know already, and misses a big opportunity to do a more comprehensive study. It does show by example that pre-publication peer review can fail spectacularly. Some other (often amusing) instances from the last few years involve computer-generated papers of much lower quality than the one submitted by Science, presumably accepted by computer-generated editors (even mathematics is not immune, and some journals have done this more than once).

Some people have claimed that these weaknesses in peer review are exacerbated by the pay-to-publish model (they are certainly not exclusive to such journals, as the examples above, some published by Elsevier in toll access journals, show). This model certainly does lead to clear incentives for “Gold OA” journals to publish very weak papers. However, since authors have to pay, there are countervailing incentives for authors. If the reward system is poorly organized (as it seems to be in China, for example), then authors may still choose these predatory journals. But since papers in them are unlikely to be read or cited much, it seems unlikely to create a large problem. Journal reputation counts too – most predatory journals receive few submissions, for good reason. The existence of many low quality outlets (which predates the rise of open access journals) is a nuisance and possible trap for inexperienced researchers, and reduces the signal/noise ratio, but is not the main problem.

The main problem is: the currently dominant model of pre-publication peer review by a small number of people who don’t receive any proper payment for their time, either in money or reputation, is unlikely to achieve the desired quality control, no matter how expert these reviewers are. Furthermore, our post-publication system of review to ensure reliability is rudimentary, and corrections and retractions are not well integrated into the literature.

Both deliberate fraud (still quite rare, given the reputational risks, but apparently much more common than I would have thought) and works that are “not even wrong” and thus can’t be checked (poorly designed experiments, mathematical gibberish, etc) slip through far too often. It is bad enough that there are too many interesting papers to read, and then a lot of solid but uninteresting ones. Having to waste time with, or be fooled by, papers that are unreliable is inefficient for readers, and allowing this to go on creates wrong incentives for unscrupulous authors.

It seems now that “publication” doesn’t mean much, since the barrier is so low. A research paper now has no more status than a seminar talk (perhaps less in many cases). Self-publication on the internet is simple. There are so many journals that almost anything can be published eventually. How can we find the interesting and reliable research?

The only good solution that I can see involves the following steps. Clear proposals along these lines have been made by Tim Gowers and by Yann LeCun.

  • adopt the open research model

    This means more than just making the polished research article freely available. It includes circulation of preliminary results and data. Certainly a paper that doesn’t allow readers to make their own conclusions from the data should be considered anecdotal and not even wrong. Imagine a mathematics paper that doesn’t give any proofs.

  • decouple “peer review” from publication

    There can be two kinds of services: assistance (writing tips, pointers to literature, spotting errors) with the paper before it is ready for “publication”, and comment and rating services (which can give more refined grades on quality, not just the current yes/no score).

    Journal peer review focuses on the second type, but only gives yes/no scores (sometimes, a recommendation to submit to another journal). Computer science conferences are good for the first type of review, in my experience, but bad at the second. The first type of service is offered by Rubriq, Peerage of Science, and Science Open Reviewed. The second type is currently offered by Publons, SelectedPapers.net (no ratings yet), and PubPeer.

    This allows people with more time to specialize in reviewing, rather than writing. And they should get credit for it! A colleague in our mathematics department told me in June that he had just received his 40th referee request for the year. He is too busy writing good papers to do anything like that amount of work. Yet PhD students and postdocs, or retired researchers, or those with good training whose job description does not include intensive research (such as staff at teaching colleges) could do this job well. To keep this post from getting even longer, I will not discuss anonymity in reviewing, but it is an important topic.

    Other advantages are that post-publication review boards could bid for papers, so the best ones are reviewed quickly by the best reviewers, multiple review boards could review papers, and reviews are not wasted by being hidden in a particular journal’s editorial process.

  • decouple “significance” from inherent quality measures

    Journals also routinely reject on grounds of their own idea of “significance”, which is inefficient (especially when they publish “important” work that is “not even wrong”). The real determination of how important and interesting a paper is can only be done after publication and takes a long time. In some fields, replication must be attempted before importance can be determined. PLoS does this kind of filtering and seems to be successful. Pre-registration of experimental trials which will lead to publication whatever the result, and registered replication reports, are other ways to reduce the bias toward “glamour mag science”.

  • if you want attention for your work, you may have to pay for it

    There ought to be a barrier to consuming expert time. It is limited, and refereeing junk papers for free is a big waste of it. I would like to see a situation where it costs authors something (money, reputation points, in-kind work) to command attention from someone else (if the work is exciting enough that people will do it for free, then so much the better). This doesn’t preclude authors making drafts available and seeking freely given feedback. However, more detailed pre-publication attention might be obtained by various means: give seminar talks and present at conferences, pay via money or formalized reciprocal arrangement. Post-publication attention is another matter.

  • complete the feedback loop

    No system can work well unless information on performance and opinions is allowed to flow freely. Reviewers must themselves be able to be reviewed and compared. Strong ethical guidelines for reviewers should be set, and enforced. The current system allows anonymous referees to do a poor job or an excellent one, and only their editor knows both who they are and their performance level.