Why are mathematical proofs that rely on computers controversial?

There are many theorems in mathematics that have been proved with the assistance of computers; take the famous four color theorem, for example. Such proofs are often controversial among some mathematicians. Why is this so?

In my opinion, shifting from manual proofs to computer-assisted proofs is a giant leap forward for mathematics. Other fields of science rely heavily on computation. Physics experiments are simulated on computers. Chemical reactions are simulated on supercomputers. Even evolution can be simulated on a sufficiently advanced computer. All of this helps us understand these phenomena better.

But why are mathematicians so reluctant?


What is mathematics? One answer is that mathematics is a collection of definitions, theorems, and proofs of them. But the more realistic answer is that mathematics is what mathematicians do. (And partly, that's a social activity.) Progress in mathematics consists of advancing human understanding of mathematics.

What is a proof for? Often we pretend that the reason for a proof is so that we can be sure that the result is true. But actually what mathematicians are looking for is understanding.

I encourage everyone to read the article On Proof and Progress in Mathematics by the Fields Medalist William Thurston. He says (on page 2):

The rapid advance of computers has helped dramatize this point, because computers and people are very different. For instance, when Appel and Haken completed a proof of the 4-color map theorem using a massive automatic computation, it evoked much controversy. I interpret the controversy as having little to do with doubt people had as to the veracity of the theorem or the correctness of the proof. Rather, it reflected a continuing desire for human understanding of a proof, in addition to knowledge that the theorem is true.

On a more everyday level, it is common for people first starting to grapple with computers to make large-scale computations of things they might have done on a smaller scale by hand. They might print out a table of the first 10,000 primes, only to find that their printout isn’t something they really wanted after all. They discover by this kind of experience that what they really want is usually not some collection of “answers”—what they want is understanding.

Some people may claim that there is doubt about a proof when it has been produced by a computer, but I think human proofs leave more room for error. The real issue is that long computer proofs (as opposed to something simple, like checking a numerical value on a calculator) are hard to keep in your head.

Compare these quotes from Gian-Carlo Rota's Indiscrete Thoughts, where he describes the mathematicians' quest for understanding:

“eventually every mathematical problem is proved trivial. The quest for ultimate triviality is characteristic of the mathematical enterprise.” (p.93)

“Every mathematical theorem is eventually proved trivial. The mathematician’s ideal of truth is triviality, and the community of mathematicians will not cease its beaver-like work on a newly discovered result until it has shown to everyone’s satisfaction that all difficulties in the early proofs were spurious, and only an analytic triviality is to be found at the end of the road.” (p. 118, in The Phenomenology of Mathematical Truth)

Are there definitive proofs? It is an article of faith among mathematicians that after a new theorem is discovered, other simpler proofs of it will be given until a definitive one is found. A cursory inspection of the history of mathematics seems to confirm the mathematician’s faith. The first proof of a great many theorems is needlessly complicated. “Nobody blames a mathematician if the first proof of a new theorem is clumsy”, said Paul Erdős. It takes a long time, from a few decades to centuries, before the facts that are hidden in the first proof are understood, as mathematicians informally say. This gradual bringing out of the significance of a new discovery takes the appearance of a succession of proofs, each one simpler than the preceding. New and simpler versions of a theorem will stop appearing when the facts are finally understood. (p.146, in The Phenomenology of Mathematical Proof).

In my opinion, there is nothing wrong with, or doubtful about, a proof that relies on a computer. However, such a proof is in the intermediate stage described above: it has not yet been rendered trivial enough to be held in a mathematician's head, and so the theorem it proves should still be considered work in progress.


This answer began as a sequence of comments to a great article linked by Joseph Geipel in his answer. Eventually I decided to leave it as an answer in its own right.

It is a great article: I found it to be very balanced and accurate, and they did a good job of finding the right people to interview: thoughtful, veteran mathematicians at several points on the computers/no-computers spectrum. (Well, to be fair, the "no computers guy" was rather gentle in his expression of his views. If they had looked hard enough, they could probably have found some eminent mathematician to take a more extreme "computers are gross" stance. I didn't miss hearing that viewpoint!) But I didn't see anything in the article defending the position that computer-assisted proofs are inherently controversial. I think that is in fact a backward position that very few contemporary mathematicians take and that the profession as a whole has moved past.

My former colleague Jon Hanke makes a great point in the article about not blindly trusting computer calculations. But (as Jon well knows) that point can and should be made much more widely: it applies to any long, difficult calculation or intricate casewise argument for which the full details have not been provided or with which the community of referees chooses not to fully engage.

Let me respond briefly to some of the arguments expressed here (and certainly elsewhere) against computer assisted proofs.

The goal of mathematical proof is increased insight and understanding, and computer-assisted proofs just give us the answer.

There is something to this sentiment, but it is aiming at the wrong target. Mathematics has a venerable tradition of long, intricate computational arguments. Would it be better to have shorter, more conceptual, more insightful proofs instead? Of course it would! Everyone wants that. But proving theorems is really, really hard, and the enterprise of mathematics should be viewed as an accretion of knowledge over time rather than a sequence of unrelated solutions to problems.

"Pioneer work is clumsy" someone famous said: the first proof of an important result is very often not the best one in any way: it is more computational, or longer, or harder, or uses extraneous ideas,...,than the proofs that the community eventually finds. But finding the first proof -- even a very clumsy one indeed -- is clearly one of the most important steps in the accretion process. Gauss's first proof of quadratic reciprocity was by induction! But he proved it, and in so doing it opened the door to later and better proofs (including another six or so of his own). Gauss's Disquisitiones Arithmeticae is full of what to contemporary eyes looks like long, unmotivated calculations: nowadays we can give proofs which are shorter and less computational (most of the time: Gauss was such a blazing pioneer that he found many results that he had "no right to find", and his work retains a distinctly nontrivial aura to this day).

People who argue against computer-assisted proof for this reason seem to have the wrong idea about the culture of mathematics. One might naively worry that the Appel-Haken long computational proof of the Four Color Theorem would have stopped people from thinking about this problem, and that forever after we would be stuck with this long, uninsightful (apparently; I am not an expert here) argument. But of course that's not what happened at all: their work increased interest in the Four Color Theorem, and by now we have a still-computer-assisted but much simpler proof by Robertson, Sanders, Seymour, and Thomas, as well as a fully formalized proof by Gonthier. (One of the main differences between their proof and the original -- which I did not fully appreciate until I got some helpful comments on this answer -- is that the Appel-Haken proof also relied on extensive and poorly documented hand computations. This then becomes an instance of "doing one's computational business properly", as discussed elsewhere in my answer.)

A key point is that you don't need a computer to give a long, uninsightful, calculational proof: lots of mathematicians, including some of the very best, have offered such proofs using hand calculation. Because computers are so much better at calculations than human beings, we can use computers to calculate in different and deeper ways than is possible by hand.

Doron Zeilberger likes to exhibit single screens of code that prove more general versions of theorems that appear in published papers, often with lengthy (and certainly computational) proofs. Sometimes, though, Zeilberger (with whom I have corresponded, and whom I hold to be a truly great mathematician; he has, however, decided to take the extreme stance in this particular debate, which I feel is not always to the credit of his cause) can be disingenuous in describing the origin of this code: he looks forward to the day when his computer will itself read and respond to the work of these human colleagues and figure out for itself what calculations to do. It is reasonable to expect that something like this will occur at some point in the future. But he sometimes likes to facetiously pretend that that's what's happening in the present. Of course it isn't: Zeilberger is brilliantly insightful about how to find exactly the right routine calculation which, if it works, will essentially immediately prove a big theorem. What he's doing is actually much more impressive than he makes it look: he's a little like Harry Potter using real magic to do what appears to you to be a card trick. But he's making a very important point: what you can do with computers is absolutely confluent with the goals of proof and progress in mathematics, especially increased conceptual understanding.
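To give a concrete sense of what he means, here is a minimal sketch (my own hypothetical illustration, not Zeilberger's actual code) using SymPy's implementation of Gosper's algorithm, the precursor to Zeilberger's: a few lines of routine calculation that discover and prove the identity sum_{k=0}^n k*k! = (n+1)! - 1.

```python
# A hypothetical illustration (mine, not Zeilberger's actual code):
# SymPy's implementation of Gosper's algorithm -- the precursor to
# Zeilberger's -- finds a closed form for a hypergeometric sum,
# discovering and proving the identity sum_{k=0}^n k*k! = (n+1)! - 1.
from sympy import factorial, simplify, symbols
from sympy.concrete.gosper import gosper_sum

n, k = symbols("n k", integer=True, nonnegative=True)

closed_form = gosper_sum(k * factorial(k), (k, 0, n))
print(closed_form)  # an expression equal to factorial(n + 1) - 1

# Check that the computed closed form agrees with (n + 1)! - 1.
print(simplify(closed_form - (factorial(n + 1) - 1)))  # expected: 0
```

The point is that the algorithm does not merely confirm the identity for many values of n: it produces a certificate that proves it for all n, which is exactly the sense in which such calculations are confluent with mathematical proof.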

Computer proofs are less trustworthy than human proofs, because we cannot check what the computers are doing.

This argument really is twaddle, it seems to me. First of all, computers are only as good as the people who program them, and there are good and bad ways to do one's programming business (again, this is Jon Hanke's point). What is frustrating (to Jon, and also to me) is that the mathematical community has been slow to regulate this programming business, to the extent that it is possible to publish a paper saying "Proof: I got my computer to do this calculation. (You can write to me if you want to see the details.)" Is that good practice? Of course not. But it's not particular to computers. It is indicative of a larger culture of carelessness about, and disdain for, calculations as opposed to (easier, in many cases) big ideas. Many of us, when refereeing a paper -- even an important one -- are too lazy or too snobbish to actually roll up our sleeves and check a calculation. (I have been a referee for a paper which turned out to have a significant error, coming from a calculation that I was too snobbish to check. At first I was actually relieved to hear that the mistake was "only in a calculation": in some sense it hadn't really gotten by me. And then I realized how thoroughly stupid that was, and that the error "hadn't really gotten by me" in the same sense that a ball doesn't really get by the shortstop who has decided that he won't even make a play for it: in some narrow technical sense it may not be an error on my part, but the outcome is the same as if it were. Defensive indifference should not be a proud mathematical position.)

If you do your computer proof well, it becomes far more reliable than a long hand calculation that probably no human being will even seriously claim to have checked. Just ask Thomas Hales, who proved the Kepler Conjecture and published it in the Annals... but the referees admitted that they couldn't check all the details, so eventually they gave up and published it anyway. He has since spent a lot of time working on a proof that can be checked by a computer proof assistant in the technical sense. Of course this type of proof is more reliable than a messy hand calculation: isn't it just silly to think otherwise?

The flip side of this coin is that many results which are held up as pillars of conceptual mathematics have proofs that are not perfectly reliable. In mathematics we have a tradition of not speaking badly of each other's work in a subjective way. I can't publish a paper saying that I find your paper fishy and encouraging others not to accept it; I need to actually say "Theorem X.Y is faulty". That gentility has negative consequences when you combine it with other bad practices; nevertheless, overall I really like it. But, for instance, when in the 1980s people talked about the grand completion of the classification of finite simple groups, they were being very gentle indeed. There was a key part of the proof that was never published but appeared only privately in an incomplete manuscript of about 200 pages. In an interview about ten years later (I believe), Serre made the point -- nicely, of course -- that a 3000-page proof in which 200 pages are missing [and missing because there may be a snag somewhere!] is really not a complete proof, and he speculated rather wryly that the question was not whether there was a gap in the proof -- of course there was -- but whether the gap was large enough to allow a further finite simple group to pass through it! Things were later patched up better in the "second generation proof", and there is a better -- shorter, more conceptual, more powerful -- "third generation proof" coming. Pioneer work is clumsy.

I honestly think that if, circa 1982, you believed more in the classification of finite simple groups than in the four color theorem, then you were simply showing religious prejudice. There is no rational justification.

Again, the key point here is that the computers / no-computers dichotomy is a false one: if we do our business correctly, we can use computers to make many (I won't say "most" or "all", but who knows what the future will bring) of our arguments more reliable.

Added: Thanks to some comments from Will Orrick, I looked (for the first time) at the RSST paper. Their introduction raises some issues that I was unaware of. In particular, they say that they were motivated by doubts about the validity of the Appel-Haken proof, for which they cite two reasons (I may as well quote):

(i) Part of the A&H proof uses a computer and cannot be verified by hand, and (ii) even the part of the proof that is supposed to be checked be hand is extraordinarily complicated and tedious, and as far as we know, no one has made a complete independent check of it. Reason (i) may be a necessary evil, but reason (ii) is more disturbing...

I find it remarkable how closely these comments parallel the points I made in my answer. As for (i): sure, yes, we would like a pure-thought proof. As yet we don't have one, and a computationally intensive proof is much better than no proof at all. As for (ii): that is exactly the point that Jon Hanke is still trying to make today! In fact, the real issue with the Appel-Haken proof is that, given how highly computational it is, they did not use computers as much as they should have. (I hope it is clear that I am not really criticizing Appel and Haken. As I said several times, pioneer work is clumsy, and their work was pioneer work in a very strong sense.) The later computer-assisted proofs really do let the computers do the computation, which, especially in the case of Gonthier's proof, is a big improvement.


Here's a great article on why.

There are several reasons. One of the biggest is that how one solves a problem is frequently more interesting and useful than the actual result. If that "how" is "we did an exhaustive search of all the possibilities", you don't really get much beyond the result, as the sketch below illustrates. There's also the risk that relying on computers removes the need to improvise with novel ideas.
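As a toy illustration (my own hypothetical example, not from the article): a few lines of Python can exhaustively verify Goldbach's conjecture for every even number up to 10,000. The check succeeds, but the successful run offers no insight into why the statement holds.

```python
# A hypothetical toy example: exhaustively verify Goldbach's conjecture
# (every even number >= 4 is a sum of two primes) up to a small bound.
# The check succeeds, but the result carries no insight into *why*.
def primes_up_to(limit):
    """Sieve of Eratosthenes: return the set of primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return {i for i, is_prime in enumerate(sieve) if is_prime}

N = 10_000
primes = primes_up_to(N)
assert all(any(m - p in primes for p in primes) for m in range(4, N + 1, 2))
print(f"Verified: every even number in [4, {N}] is a sum of two primes.")
```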

Another reason is that bugs and faults in the computation are sometimes considered too great a risk to mathematical certainty.


See also this article:

Ugly Mathematics: Why Do Mathematicians Dislike Computer-Assisted Proofs?
The Mathematical Intelligencer, December 2012, Volume 34, Issue 4, pp 21-28

Here is an abstract:

The author discusses an analogy between narratives and mathematical proofs that tries to account in a simple manner for the perceived ugliness of computer-assisted proofs. He suggests that this ugliness is not essentially associated with methodological or epistemic problems with the evidence; rather, a non-beautiful proof may simply be uninspiring, something toward which mathematicians show indifference.

For another summary, see this ZMATH review.


I think the big issue here is not whether computer assistance is used, but whether the resulting proof is human-comprehensible: it is quite unsatisfying to have a proof that you cannot actually wrap your head around, and it can certainly leave one wondering whether there aren't bugs in the software.

I don't think anybody really minds the use of a proof assistant where it can generate reasonably comprehensible proofs and is just used out of laziness and/or to prevent stupid mistakes.

(And of course a proof which cannot itself be comprehended by humans really shouldn't be trusted at all unless the code used to generate it is available, so that the output can be checked using various proof-checking tools, like the Coq kernel.)
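For a taste of what that kind of kernel-checking looks like, here is a toy machine-checked proof. (A hypothetical sketch written in Lean rather than Coq, purely for illustration; the principle is the same: the kernel verifies every inference, so nothing rests on an unchecked step.)

```lean
-- A toy machine-checked proof (illustrative sketch in Lean; the
-- four-colour work discussed here uses Coq, but the principle is the
-- same). The kernel verifies every inference step of the induction,
-- so nothing rests on an unchecked claim.
theorem zero_add_nat (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ n ih => rw [Nat.add_succ, ih]
```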

Note that the 4-colour theorem has now been proven using Coq, not just with that implausibly large "hand-checked" computer output.