Open Access Week

Open Access Week was low-key at UoA this year. I couldn’t attend a talk by Matt McGregor from Creative Commons Aotearoa – here are the slides. I was on a panel discussion concerning measures of research impact, which included Jason Priem of ImpactStory and Andrew Preston of Publons. Thanks to Fabiana Kubke and Siouxsie Wiles for organizing.

The peer review system for research

The disclaimer

In the last few years I have often read about the crisis in scholarly (mostly scientific) peer review.
I share the belief that the current system is surely suboptimal and must be changed. Much of what I say below is not original: I have read so many posts and books by Tim Gowers, Bjoern Brembs, Mike Taylor, Michael Nielsen, Michael Eisen, Noam Nisan, and many other people, that I can’t remember them all. I have a year’s experience as managing editor of an open access no-fee journal, two years’ experience as an editor (= referee) for a Hindawi journal, and many years’ experience as a journal referee. My research area is mathematics and various applications, so there may be some discipline-specific assumptions that don’t work for other fields. And it is not possible to cover every issue in a blog post, so I don’t claim to be comprehensive.

The latest online furore was occasioned by a so-called “sting” operation published in Science (unlike most articles in that magazine, this one is freely readable). I don’t think it worth commenting in detail on that specific article. It tells us little we didn’t know already, and missed a big opportunity to do a more comprehensive study. It does show by example that pre-publication peer review can fail spectacularly. Some other (often amusing) instances from the last few years involve computer-generated papers that are of much lower quality than the one submitted by Science, presumably accepted by computer-generated editors (even mathematics is not immune and some journals have done this more than once).

Some people have claimed that these weaknesses in peer review are exacerbated by the pay-to-publish model (they are certainly not exclusive to such journals, as the examples above, some published by Elsevier in toll access journals, show). This model certainly does lead to clear incentives for “Gold OA” journals to publish very weak papers. However, since authors have to pay, there are countervailing incentives for authors. If the reward system is poorly organized (as it seems to be in China, for example), then authors may still choose these predatory journals. But since papers in them are unlikely to be read or cited much, it seems unlikely to create a large problem. Journal reputation counts too – most predatory journals receive few submissions, for good reason. The existence of many low quality outlets (which predates the rise of open access journals) is a nuisance and possible trap for inexperienced researchers, and reduces the signal/noise ratio, but is not the main problem.

The main problem is: the currently dominant model of pre-publication peer review by a small number of people who don’t receive any proper payment for their time, either in money or reputation, is unlikely to achieve the desired quality control, no matter how expert these reviewers are. Furthermore, our post-publication system of review to ensure reliability is rudimentary, and corrections and retractions are not well integrated into the literature.

Both deliberate fraud (still quite rare, given the reputational risks, but apparently much more common than I would have thought) and works that are “not even wrong” and thus can’t be checked (poorly designed experiments, mathematical gibberish, etc) slip through far too often. It is bad enough that there are too many interesting papers to read, and then a lot of solid but uninteresting ones. Having to waste time with, or be fooled by, papers that are unreliable is inefficient for readers, and allowing this to go on creates wrong incentives for unscrupulous authors.

It seems now that “publication” doesn’t mean much, since the barrier is so low. A research paper now has no more status than a seminar talk (perhaps less in many cases). Self-publication on the internet is simple. There are so many journals that almost anything can be published eventually. How can we find the interesting and reliable research?

The only good solution that I can see involves the following steps. Clear proposals along these lines have been made by Tim Gowers and by Yann LeCun.

  • adopt the open research model

    This means more than just making the polished research article freely available. It includes circulation of preliminary results and data. Certainly a paper that doesn’t allow readers to make their own conclusions from the data should be considered anecdotal and not even wrong. Imagine a mathematics paper that doesn’t give any proofs.

  • decouple “peer review” from publication

    There can be two kinds of services: assistance (writing tips, pointers to literature, spotting errors) with the paper before it is ready for “publication”, and comment and rating services (which can give more refined grades on quality, not just the current yes/no score.)

    Journal peer review focuses on the second type, but only gives yes/no scores (sometimes, a recommendation to submit to another journal). Computer science conferences are good for the first type of review, in my experience, but bad at the second. The first type of service is offered by Rubriq, Peerage of Science, Science Open Reviewed. The second type is currently offered by Publons, SelectedPapers.net (no ratings yet), PubPeer.

    This allows people with more time to specialize in reviewing, rather than writing. And they should get credit for it! A colleague in our mathematics department told me in June that he had just received his 40th referee request for the year. He is too busy writing good papers to do anything like that amount of work. Yet PhD students and postdocs, or retired researchers, or those with good training whose job description does not include intensive research (such as staff at teaching colleges) could do this job well. To keep this post from getting even longer, I will not discuss anonymity in reviewing, but it is an important topic.

    Other advantages are that post-publication review boards could bid for papers, so the best ones are reviewed quickly by the best reviewers, multiple review boards could review papers, and reviews are not wasted by being hidden in a particular journal’s editorial process.

  • decouple “significance” from inherent quality measures

    Journals also routinely reject on grounds of their own idea of “significance”, which is inefficient (especially when they publish “important” work that is “not even wrong”). The real determination of how important and interesting a paper is can only be done after publication and takes a long time. In some fields, replication must be attempted before importance can be determined. PLoS does this kind of filtering and seems to be successful. Pre-registration of experimental trials which will lead to publication whatever the result, and registered replication reports, are other ways to reduce the bias toward “glamour mag science”.

  • if you want attention for your work, you may have to pay for it

    There ought to be a barrier to consuming expert time. It is limited, and refereeing junk papers for free is a big waste of it. I would like to see a situation where it costs authors something (money, reputation points, in-kind work) to command attention from someone else (if the work is exciting enough that people will do it for free, then so much the better). This doesn’t preclude authors making drafts available and seeking freely given feedback. However, more detailed pre-publication attention might be obtained by various means: give seminar talks and present at conferences, pay via money or formalized reciprocal arrangement. Post-publication attention is another matter.

  • complete the feedback loop

    No system can work well unless information on performance and opinions is allowed to flow freely. Reviewers must themselves be able to be reviewed and compared. Strong ethical guidelines for reviewers should be set, and enforced. The current system allows anonymous referees to do a poor job or an excellent one, and only their editor knows both who they are and their performance level.

Paradoxes of runoff voting

The New Zealand Labour party will soon have an election for leader of its Parliamentary caucus. The voting system is a weighted form of instant runoff using the single seat version of Hare’s method (instant runoff/IRV/alternative vote). IRV works as follows. Each voter submits a full preference order of the candidates (I am not sure what happens if a voter doesn’t rank all candidates but presumably the method can still work). In each round, the candidate with the smallest number of first preferences (the plurality loser) is eliminated and removed from the preference orders, keeping the order of the other candidates the same. If there is a tie for the plurality loser in a round, this must be broken somehow. Rounds continue until only one candidate, the winner, remains.

The NZLP variant differs from the above only in that not all voters have the same weight. In fact, the caucus (34 members) has a total weight of 40%, the party members (tens of thousands, presumably) have total weight 40%, and the 6 affiliated trade unions have total weight 20%, the weight being proportional to their size. It is not completely clear to me how the unions vote, but it seems that most of them will give all their weight to a single preference order, decided by union leaders with some level of consultation with members. Thus in effect there are 34 voters each with weight 20/17, 6 with total weight 20, and the rest of the weight (total 40) is distributed equally among tens of thousands of voters. Note that the total weight of the unions is half the total weight of the caucus, which equals the total weight of the individual members.

IRV is known to be susceptible to several paradoxes. Of course essentially all voting rules are, but the particular ones for IRV include the participation paradoxes which have always seemed to me to be particularly bad. It is possible, for example, for a candidate to win when some of his supporters fail to vote, but lose when they come out to vote for him, without any change in other voters’ behaviour (Positive Participation Paradox). This can’t happen with three candidates, which is the situation we are interested in (we denote the candidates C, J, R). But the Negative Participation Paradox can occur: a losing candidate becomes a winner when new voters ranking him last turn out to vote.

The particular election is interesting because there is no clear front-runner and the three groups of voters apparently have quite different opinions. Recent polling suggests that the unions mostly will vote CJR. In the caucus, more than half have R as first choice, and many apparently have C as last. Less information is available about the party members but it seems likely that C has most first preferences, followed by J and R.

The following scenario on preference orders is consistent with this data: RCJ 25%, RJC 7%, CRJ 10%, CJR 30%, JRC 20%, JCR 8%. In this case, J is eliminated in the first round and R wins over C in the final round by 52% to 48%. Suppose now that instead of abstaining, enough previously unmotivated voters decide to vote JRC (perhaps because of positive media coverage for J and a deep dislike of C). Here “enough” means “more than 4% of the total turnout before they changed their minds, but less than 30%”. Then R is eliminated in the first round, and C wins over J in the final round. So by trying to support J and signal displeasure with C, these extra voters help to achieve a worse outcome than if they had stayed at home.
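The scenario can be checked with a few lines of Python. This is only a minimal sketch: the function names `irv_winner` and `with_extra_jrc` are my own, weights are the percentages quoted above (so the NZLP weighting is assumed to be already folded into them), every ballot is assumed to rank all three candidates, and ties are broken arbitrarily, so the inputs are chosen to avoid them.

```python
def irv_winner(profile):
    """Return the IRV winner.

    profile maps a full ranking such as "RCJ" (best first) to its weight.
    """
    remaining = set("".join(profile))
    while True:
        # Count each ballot for its highest-ranked surviving candidate.
        tallies = {c: 0 for c in remaining}
        for ranking, weight in profile.items():
            top = next((c for c in ranking if c in remaining), None)
            if top is not None:
                tallies[top] += weight
        total = sum(tallies.values())
        leader = max(tallies, key=tallies.get)
        if len(remaining) == 1 or 2 * tallies[leader] > total:
            return leader
        # Eliminate the plurality loser (ties broken arbitrarily here).
        remaining.discard(min(tallies, key=tallies.get))


def with_extra_jrc(profile, extra):
    """Copy of profile with `extra` additional weight voting JRC."""
    new_profile = dict(profile)
    new_profile["JRC"] = new_profile.get("JRC", 0) + extra
    return new_profile


# Percentages from the scenario above.
base = {"RCJ": 25, "RJC": 7, "CRJ": 10, "CJR": 30, "JRC": 20, "JCR": 8}

for extra in (0, 3, 5, 29, 31):
    print(extra, irv_winner(with_extra_jrc(base, extra)))
```

With 0 or 3 extra JRC voters (as a percentage of the original turnout) R still wins; with 5 or 29 the winner is C; with 31 the extra voters are numerous enough that J wins outright in the final round.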

The result of the election will be announced within a week, and I may perhaps say more then.

Terry Tao visit

We were privileged to have a visit from Terry Tao today. He gave a nice talk on work with Ben Green on the orchard planting problem. For me even the Sylvester-Gallai theorem was news. His public lecture on the “Cosmic distance ladder” drew a huge crowd. I very much enjoyed the historical discussion of how astronomical distances were measured. It is interesting to think about why Aristarchus’s ideas were rejected by peer review. I have a renewed respect for Kepler and learned that Io orbits Jupiter every 42.5 hours.

He is touring the country as the 2013 Maclaurin Lecturer for the next several days. Definitely worth seeing!

NZ Mathematical Society Newsletter

I have accepted the job of Editor of the New Zealand Mathematical Society Newsletter. This publication has been going since 1974, and has fallen on hard times lately. It is time for a shakeup, but this will start slowly.

Some of the old issues make very interesting reading. The very first one shows that although times have certainly changed, many features of the NZ mathematical scene remain the same. At least now we don’t have to produce the newsletter with a typewriter and cyclostyle machine!

Freedom and security

I have followed the PRISM revelations with dismay. Despite attempts to downplay its significance by those who assert that “privacy is old-fashioned” or “if you have done nothing wrong then you have nothing to fear”, a line has been crossed that ought not to have been without major public discussion. There has been a presumption of privacy for hundreds of years, and totalitarianism is not unthinkable in our so-called “free” societies.

In New Zealand the increasingly unimpressive-looking government has put forward legislation in this area that seems ill-conceived and is at the very least far too rushed.

There is a public meeting tonight, and a national protest planned for Saturday. It is true that some people attend far too many protests, but it seems to me that if you are ever going to protest anything, it should be this. Selling off state assets seems potentially much less serious. I really wonder what Richard Nixon, or even Robert Muldoon, would have done with the proposed spying powers.

AofA 2013

The invitation-only conference was held in Menorca 27-31 May 2013. I gave a talk there on diagonal asymptotics of combinatorial classes (paper available from my research outputs page). After missing 5 of these meetings in a row, it was good to return. The name of Philippe Flajolet was mentioned many times, and it is clear that this research community still misses him very much.

There were many very interesting talks including the longer invited ones, although the schedule was gruelling with too much time sitting down listening. Highlights for me, in no particular rank order, were:

  • Bob Sedgewick’s talk about his MOOC experiences. He urged us all to give it a try, both as producer and consumer of content.
  • Basile Morcrette showing that generating function methods can work for studying even unbalanced urn models, a nice tribute to the vision of Flajolet.
  • The survey talk of Mihyun Kang on phase transition results in random graphs.
  • Philippe Jacquet on green leader election algorithms (standard methods use too much energy in wireless networks).
  • Michael Drmota on singularity analysis of positive algebraic functions.
  • Konstantinos Panagiotou’s survey of random k-SAT including his recent results with Coja-Oghlan.
  • The excellent organization of Conrado Martinez.

Lowlights: the hotel was isolated and although it had some good features, not completely suited to the conference. It was filled with English tourists many of whom, unfortunately, didn’t really mix well with the intellectual nature of the conference and didn’t understand how to use sunscreen. The weather was cool and the beach under attack from jellyfish who stung at least two conference participants. The talks were held in the piano bar, which had really good seats, but poor acoustics and visibility. The travel to and from Menorca was really arduous, even though I only came from San Francisco.

From the mathematical point of view, there were some interesting topics. The “Algorithms” part of AofA seemed to be even less prominent than in previous years, and this may be a problem in future. A talk by Markus Nebel on Yaroslavskiy’s dual pivot quicksort showed that the old models used since the time of Knuth are not very good at predicting actual performance, and some hard work is desperately needed there. The notion of a tradeoff between accuracy and other performance characteristics versus energy use was mentioned at least twice, and seems a promising approach.

Many community activities are planned. In particular, AofA2014 will be in Paris 16-20 June, with Donald Knuth as the Flajolet memorial lecturer.

Coincidences

Although we know that the probability of an unspecified “unexpected” event is rather high, given how many possible events can occur, it is still interesting to note them. There is probably an evolutionary reason why probability is so unintuitive to us. Perhaps being curious about coincidences had substantial survival value in the past.

Here are two that have occurred to me recently.

1) Just over a week ago I wandered into the Mechanics Institute Chess Club (apparently the oldest in the US) in downtown San Francisco. There were just a few players sitting around casually on a Sunday afternoon, with no events scheduled. Then in walked someone I haven’t had any contact with for 25 years, whom I knew from playing against him in the schools tournaments in Christchurch and at the Canterbury Chess Club, and who later became NZ champion. Now that is a coincidence – he works for Google in Sydney and was in SF for a conference. Neither of us has played competitively for many times longer than our competitive chess careers lasted. Playing a few games with the newfangled clock (3 minutes per player, plus 2 second increment per move) was a fun way to spend an hour.

2) This is close enough to true, and doesn’t change the essential mathematics. I know someone who has been married twice, each time to someone with the same birthday. How likely is that? This is different from the famous “birthday problem”. Suppose that everything is uniformly randomly chosen, and I have $latex k$ acquaintances who are in a position to marry and whom I know well enough that I would hear about such a coincidence. Let $latex n$ be the number of days in a year. The probability that one of these acquaintances fails to have such a coincidence is $latex 1-1/n$, so the probability that some succeeds is $latex 1 - (1-1/n)^k$. Let $latex c$ be a number between $latex 0$ and $latex 1$ that represents our threshold for incredulity – if an event has probability less than $latex c$, I will be surprised to see it, and otherwise not. Thus we should be surprised if $latex (1-1/n)^k > 1-c$. Reasonable values for $latex n,k,c$ are $latex 365, 100, 0.01$, but since $latex (364/365)^{100} \approx 0.76$, we should not be surprised. Analytically, the inequality $latex (1-1/n)^k > 1 - c$ can be approximately solved via the approximation $latex (1-1/n)^k = \exp(k \log(1-1/n)) \approx \exp(-k/n) \approx 1 - k/n$. Thus I expect to need $latex k < cn$ in order to be surprised: the number of acquaintances should scale linearly with $latex n$, which clearly doesn’t have to measure days. So if I only care about which month the person is born in, I will never be surprised, but if I care about the hour and have fewer than about 80 acquaintances, very likely I will be surprised. The original case of a day shows that if I have more than 4 acquaintances, I should not be surprised.

Suppose everyone in the world is my acquaintance. How precise could we be about the birthday without being surprised? Now $latex k$ is of the order of 5 billion, so with the same value of $latex c$, $latex n$ should be about 500 billion. That means we can slice up a year into milliseconds and such a coincidence event would not be at all strange.
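The numbers above are easy to check numerically. The following is a rough sketch under the same uniformity assumptions; the function names `prob_some_coincidence` and `surprised` are mine, and the year is taken to have 365 days throughout.

```python
import math


def prob_some_coincidence(n, k):
    """Probability 1 - (1 - 1/n)^k that at least one of k acquaintances
    has the coincidence, when a year is cut into n equally likely slots."""
    # expm1/log1p keep precision when 1/n is tiny (e.g. millisecond slots).
    return -math.expm1(k * math.log1p(-1.0 / n))


def surprised(n, k, c=0.01):
    """True if the event is rarer than the incredulity threshold c."""
    return prob_some_coincidence(n, k) < c


HOURS_PER_YEAR = 365 * 24
MS_PER_YEAR = 365 * 24 * 3600 * 1000

print(1 - prob_some_coincidence(365, 100))   # (364/365)^100
print(surprised(365, 100))                   # days, 100 acquaintances
print(surprised(12, 1))                      # months, a single acquaintance
print(surprised(HOURS_PER_YEAR, 80))         # hours, 80 acquaintances
print(surprised(MS_PER_YEAR, 5_000_000_000)) # milliseconds, whole world
```

The exact threshold is slightly different from the linear estimate $latex k < cn$ because of the two approximations, but for $latex c = 0.01$ the difference is small: with days, 3 acquaintances give a surprising event and 4 do not.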