Why do programming competition contestants use C++ and Java? [closed]

After competing in and following this year's Google Code Jam competition, I couldn't help but notice the incredible number of [successful] contestants that used C/C++ and Java. The distribution of languages used throughout the competition can be seen here.

After programming in C/C++ for several years, I recently fell in love with Python for its readable/straightforward nature. More recently, I learned functional languages like OCaml, Scheme, and even logic languages like Prolog. These languages certainly have their merits and, in my opinion, can be applied more easily than C++ and Java for certain situations. For example, Scheme's use of call/cc simplifies backtracking (a tool required to answer several problems) and Prolog's logic specification, although inefficient due to its brute-force nature, can drastically simplify (and even automatically solve) certain problems that are difficult to wrap one's brain around.

It is clear that a competition contestant should use the tools that are best suited for the challenge. Even x86 assembly is Turing complete - that doesn't justify solving problems with it. In this case, why are the contestants that use less common languages like Scheme/Lisp, Prolog, and even Python significantly less successful than contestants that use C/C++ and Java? Worded differently, why don't successful contestants use languages that, although may be less mainstream, are arguably better tools for the job?

There are several motivations for my question. Most importantly, I would like to become a better programmer - both in the practical aspect and the competition aspect. After being introduced to such beautiful paradigms like functional and logic programming, it is discouraging to see so many people discard them in favor of C/C++ and Java. It even makes me question my admiration for said paradigms, worrying that I cannot be successful as a Lisp/Scheme/Prolog programmer in a programming competition.


Solution 1:

Great question! As someone who has dabbled in programming contests a bit myself, I may have something to say.

[Let's get the standard disclaimer out of the way: contest programming is only loosely related to "programming in the real world", and while it tests algorithmic and problem-solving skills and the ability to come up with fast bug-free working code under time pressure, it does not necessarily correlate with being able to build large software projects, write maintainable code, etc (beyond the fact that well-structured programs are easier to debug).]

Now for some answers:

  • C++/Java are more common than other languages in the real world as well, so you'd expect to see a higher proportion anywhere. (But it's even higher in the contest population.)

  • Many of these participants are students, or got into contests as students, and C++/Java are more common "first languages" that students learn. (Undergrad students these days may start with Scheme, Haskell, Python, etc., but high-schoolers (often self-taught) less often.) In fact, many of the Eastern European participants still use Pascal, and are more amazing with it than the rest of us will ever be with any language.

  • The school- and college-level contests usually use these languages. The International Olympiad in Informatics (IOI) allows only C, C++ and Pascal (or maybe it allows Java now; I haven't kept up), and the ACM Intercollegiate Programming Contest (ACM ICPC) allows only C, C++ and Java. TopCoder allows C++, Java, C# and VB (really :p); and recently, Python. So you could say the "contest ecosystem" has more C++/Java programmers in it. Google Code Jam and IPSC are among the few contests that allow code in any language, actually.

  • Now the question is, in GCJ where the contestants are free to choose a language, why wouldn't they choose Python or Scheme? The most relevant factor is that these languages are slow. Sure, for most real-world programming they are easily fast enough, but for the tight loops that are often involved in getting a program to run under the n-second limit for all test cases, these languages don't cut it for any of the algorithmically more involved problems. (A problem designed to accept O(n log n) solutions but not Θ(n2) solutions for C/C++ frequently rules out even optimal O(n log n) solutions in slower languages. Even Java used to be given a handicap at USACO; I'm not sure this is still the case.)

  • Another factor is the libraries: C++ and Java have better libraries for frequently useful algorithms and data structures (e.g. red-black trees, C++'s next_permutation), while Python's libraries (good enough for the real world) are less useful here, and Prolog and Scheme... I don't know about their libraries. This is a relatively minor factor, because these programmers can write their own code when necessary. :-)

  • General-purpose multi-paradigm languages are more useful for just getting things done within the time constraints of the contest, than languages that force a philosophy or way of doing things on you. This is why Prolog will always remain unpopular, for instance. (General philosophy: some languages are "enabling" languages that let you do anything including shooting yourself in the foot, some are "directing" that force you to do things the right way.) This is also why C++ is three times more popular than Java in the general contest participants, and much more popular among the top contestants. Since code doesn't have to be read by anyone else, it's ok and even useful to have loop macros like FOR(i,n) (less code to type, and more importantly less chance of making a bug when in a hurry). Nothing against Java, there are a few top programmers who use Java too. :-)

  • Finally, although many of these top programmers may have C++/Java/Pascal as their "first language", they are not good because of their language, so you don't have to despair about that. Many of these same programmers have won contests like the ICFP contest even with intentionally using crazy languages like shell scripts, m4 (used in autoconf), and assembly (the team named "You Can't Spell Awesome Without ASM").

Solution 2:

I liked Jerry Coffin's idea of plotting contestants of the Google AI contest, so I took all of the results and plotted them (calculated mean, standard deviation, and then graphed the normal distribution curves in Excel).

With Lua and JS, got this:

Without (there were few contestants, so maybe the results are skewed):

It looks like Java participants did markedly worse than the rest, while Go, Common Lisp, and C are on the better end.

Solution 3:

Why we all speak English and not Esperanto? Well, it just happened so. Even though English is inconsistent and bloated and Esperanto is intentionally designed as 'better tool'.

Thus, one reason is a tradition. In most schools programming is still taught in C/C++, Java, Pascal or even Basic. And participate in those contests mostly students, which choose language they know better.
Also, you can notice that most algorithmic books feature psedudocode in style of Pascal or Ada, and very very rarely - Lisp. I don't know why, perhaps also a tradition. Or perhaps it's just not so good for the algorithms.

Another reason would be speed. Although it's not a problem for Google Code Jam, in almost all contests 2x speed gap is a difference between 'Accepted' and 'Time Limit' verdicts.
In other words, if optimal algorithm in C++ runs 10 times faster than in Ruby, it may mean that sub-optimal algorithm in C++ will still be faster than a good one in Ruby. And contest authors usually don't want to allow O(n^2) submissions, if O(n*logn) can be achieved.

Solution 4:

First, I'd question your premise [edit: or what I take to be a premise -- that contestants using C++ and Java fare about equally well]. For example, here's what languages were used for the entries that came in the first 100 places and the last 100 places in Google's recent AI contest:

alt text

Contestants using C++ and Java did not seem to be anywhere close to equally successful in that contest. Contestants using Python didn't seem to fare particularly well either, though there were considerably fewer of them, weakening any conclusion in that regard.

Second, of course, an awful lot of the explanation (as others have pointed out) is undoubtedly just the number of people who are familiar with each language. There are probably more people taking a course in Java right now than the total number of people who've ever written any Lisp, Scheme or Prolog.

Edit: I think a third possibility is simply versatility. To pick an extreme example, Prolog is very well suited to a few problems, but equally poorly suited to many others. Few people can (or at least do) learn more than one or two languages well enough to use them in a contest, so most people who are interested in such things are likely to choose languages that can work reasonably well for almost anything, rather than attempting to learn a specialized language for every problem that might be chosen.