Theoretical computer science from the outside

I have recently made some critical comments on blogs in the “theoretical computer science” area. One feature of blogs is that they encourage rapid dissemination. Unfortunately it is easy to write inaccurately, and I don’t want to offend unnecessarily. So here is an attempt to set out the argument better. It is an opinion, based on a few years of observation, but I don’t claim to be able to convince anyone by proof. I spent a relatively long time to write it, and now need to get back to doing research instead of talking about it!

Background: I come from a pure math (abstract algebra) background, moved into combinatorics (mostly generating functions) and also like to work on social choice and analysis of algorithms. I work in a computer science department and teach courses in “theoretical computer science”, mostly algorithms. I have enjoyed learning what little I know now about TCS, but I still consider myself primarily a mathematician.

I have no problem with considering “theoretical computer science” as a branch of mathematics, judged by mathematical standards. However it seems that the TCS community overall doesn’t like to do this, preferring to be considered part of computer science (whatever the latter is, but no time to debate that now). It isn’t clear to me why there even needs to be a separate branch (I don’t really understand “theoretical physics” either). A better classification would be based on the problems studied, not the methodology.

However, given this classification I see some strange features of TCS practice. These observations are not unique to me. Michael Mitzenmacher has discussed some of them in detail in his blog.

  • Too much publication of too few results (an argument strongly made in recent years by Neal Koblitz among others). TCS researchers seem to publish more than mathematicians, and it is very noticeable that they publish many more conference papers and relatively few in journals (broadly speaking, the latter take longer to produce and involve more stringent refereeing, and tend to have fewer errors and omissions). This may not be bad for science overall (as argued often by Michael Mitzenmacher) but it surely makes it difficult when comparing the performance of researchers. Why not publish in a journal with tougher standards of peer review? Blogs, the arXiv, etc all exist, so dissemination of ideas is fairly easy these days – is it necessary to claim that conference presentations really count as publications? I find it particularly annoying that people claim that a journal paper that is a minor variation on a conference paper should also be counted as another publication. Of course, they never admit the “minor” part.
  • Not much link between theory and practice. Theoretical computer scientists prove a lot of theorems, but the mathematical depth of a lot of the papers is not particularly high, in my opinion, compared to what mathematicians publish (and less checking goes on in conference refereeing than for journals). That would be absolutely fine if the papers were also subjected to empirical testing. But how many researchers who devise new algorithms actually have them implemented (if not themselves, by a student, for example)? (This has also been well discussed at My Biased Coin). I find the term “theoretical computer science” a little strange. It should be balanced by “experimental computer science”. Despite the major article in CACM on this (Nov 2007), it is far from clear that such a subject is thriving. Some areas, such as computational complexity, are very hard to test empirically – a result saying that a certain problem is NP-hard doesn’t generate many testable hypotheses, at least not in my brain.
  • Very little attempt at popularization. David Harel’s book Computers Ltd is one of small handful of popular books about big ideas in computer science. Contrast this with the huge number of popular books on (often quite difficult) mathematics. In fact the whole idea that “Computer Science” is a science is not really understood by the public – many conflate it with “I.T.” and the engineering aspects. Given the lack of scientific method discussed above, perhaps this is a fair reflection of reality – but it shouldn’t be.
  • Not much linkage with other areas of CS. I have too many colleagues who don’t know the difference between big-O and big-Omega. Few of them, if any, care about TCS enough to attend a seminar or even a public talk on algorithms given by a great speaker like Bernard Chazelle. And yet TCS ideas are essential for progress in many areas of CS, I have no doubt. This is of course not simply the fault of people in TCS, but in CS generally.
  • Not very unified for such a young field. It looks to me as though “theoretical computer science” roughly encompasses computational complexity (classifying the difficulty of problems), algorithms (finding solutions to problems), and logic/semantics/programming languages. It seems that the last topic has relatively little coverage in the USA (see the comments at Computational Complexity); the first has advanced to a degree that is mostly very far removed from any sort of application (the models it uses are also suspect), and probably should be judged by the standards of mathematicians (in which case it might not do as well as it has to date); the second is mostly split between people who devise complicated (never implemented) algorithms, with basic big-O analysis only,and obsession over the worst case (predominant in the USA), those who analyze to death rather simple algorithms using heavier machinery and “average case” (always the uniform distribution, and more common outside the USA), and those who do experimental analysis (a seemingly small and not well regarded part of the overall community). There is little interaction between these groups. Division of labour is fine – the same person doesn’t have to prove lower bounds, then devise a nice algorithm, then analyse it in Knuth style, then implement it. But these people should be talking more, and it is time to ensure that more progress is made, more efficiently.