Conjectures (or intuitions) that turned out wrong in an interesting or useful way
Solution 1:
From coding theory --
If you want to transmit a message, say a 0 or a 1, through a noisy channel, then the natural strategy is to send it with redundancy. (A noisy channel carries your message to its destination, but with some chance of scrambling some of its letters along the way.)
For example, suppose I wanted to transmit to you $0$ or $1$ -- $0$ for upvote and $1$ for downvote, say, though this narrative choice is mathematically irrelevant.
Suppose our channel is such that the chance of a single transmitted zero changing to a one is $.1$. If I send you the message $00000$, then as long as at least 3 of the zeros remain zeros, you will be able to decode my message by majority vote, and this succeeds with a greater probability than if I just sent you a single $0$.
(Simlrly to hw you cn rd ths?)
The collection $\{00000, 11111\}$ is what is called a code, and $00000$ is a codeword.
By continuing this way (for example, by sending a million zeros to transmit the message zero), you can make the probability of a miscommunication as small as you want; the cost, however, is an ever longer transmission time to send the same one-symbol message (zero or one).
If you define the rate as the ratio (the number of message symbols you want to send) / (the number of symbols you actually transmit), then the naive coding method described above lets you make the error small, but only at the cost of driving the rate to zero.
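To make the trade-off concrete, here is a minimal simulation sketch (my illustration, not part of the original answer, using the $.1$ flip probability from the example above): it repeats a single $0$ an odd number of times, pushes it through the noisy channel, and decodes by majority vote. The decoding error shrinks as the number of repetitions grows, but the rate $1/n$ shrinks along with it.

```python
import random

def send_through_bsc(bits, flip_prob=0.1):
    """Simulate a binary symmetric channel: each bit flips independently with probability flip_prob."""
    return [bit ^ 1 if random.random() < flip_prob else bit for bit in bits]

def repetition_error_rate(n_repeats, flip_prob=0.1, trials=100_000):
    """Estimate the probability that majority voting decodes a repeated 0 incorrectly."""
    errors = 0
    for _ in range(trials):
        received = send_through_bsc([0] * n_repeats, flip_prob)
        if sum(received) > n_repeats // 2:  # more than half the bits flipped to 1
            errors += 1
    return errors / trials

for n in (1, 5, 21, 101):  # odd lengths, so a majority vote never ties
    print(f"send {n:>3} zeros: rate = {1/n:.3f}, decoding error ≈ {repetition_error_rate(n):.5f}")
```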
According to sources I've read (though I cannot vouch for their historical accuracy), practitioners in the field of communications (e.g. telegraph engineers) thought that this trade-off was absolute: if you want to reduce errors, you must drive the rate to zero.
This is a plausible thing to conjecture after seeing the above example; at any rate, you could probably get people (students) to nod along if you showed them this example and then stated the false conclusion confidently. (Conducting such an experiment would be unethical, however.)
However, in 1948 Shannon proved a remarkable theorem (the noisy channel coding theorem) saying that this trade-off is not so absolute. For any channel there is a number called the "capacity" (which in the discrete case can be computed from the probabilistic properties of the channel by solving an optimization problem over probability distributions on a finite set), and as long as you do not try to send at a rate above this capacity, you can find ways to encode your messages so that the error becomes arbitrarily small. He also proved that this capacity bound is the best possible.
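For the binary symmetric channel of the running example, that optimization has a well-known closed form: the capacity is $C = 1 - H(p)$, where $H$ is the binary entropy function and $p$ is the flip probability. A quick sketch (again my illustration, not part of the original answer):

```python
from math import log2

def binary_entropy(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), the entropy of a coin with bias p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(flip_prob):
    """Capacity of the binary symmetric channel, in bits per transmitted symbol."""
    return 1.0 - binary_entropy(flip_prob)

print(bsc_capacity(0.1))  # ≈ 0.531
```

With a flip probability of $0.1$ this comes out to roughly $0.53$ bits per transmitted symbol, so rates up to about one half are achievable with arbitrarily small error, far better than the vanishing rate of the repetition scheme.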
One catch is that you need to have a lot of messages to send; if you want to send one of two messages over your channel (and then close the channel for business), then you really cannot do better than the naive method above. Shannon's proof is asymptotic, asserting that as the size of the set of possible messages goes to infinity, you can (in principle) find encoding methods that make the error small. Morally, you can keep the codewords far from each other without using too many extra symbols to pad the transmissions against errors, because there is a lot of extra room in higher dimensions.
His proof opened up a huge amount of research into explicit constructions of codes that approach his bound, which has connections to many other topics in mathematics and computer science.
https://en.wikipedia.org/wiki/Noisy-channel_coding_theorem
If you want to learn the precise statement, the book *Elements of Information Theory* by T. M. Cover and J. A. Thomas is excellent.
Solution 2:
The first thing that came to mind was the proof that no general formula in radicals exists for the roots of a quintic (or higher-degree) polynomial (Abel-Ruffini, and later Galois). People in previous centuries had thought such a formula might exist and searched for it. The area of mathematics that Galois's work sparked was revolutionary.
Solution 3:
Kurt Gödel's incompleteness theorems in mathematical logic came as a shock to many mathematicians at a time when formalist optimism ruled:
The theorems state that any sufficiently powerful (consistent) theory must contain statements that can be expressed in its language but can neither be proved nor disproved, making mathematics a forever unfinishable "game" or puzzle, since any such undecidable statement can be added as a new axiom.
Solution 4:
Maybe not the answer you expected on a mathematical site, but it feels fitting.
In the 19th century, scientists were convinced that all matter obeys Newtonian laws. The only small problem was a pesky anomaly: all equations predicted that black bodies should glow bright blue, but in reality they don't.
Max Planck was one of the people who tried to solve it. He tried using the equation
$E=nhf$
with $E$ for the energy, $n$ for a non-negative integer, $f$ for the oscillator's frequency, and $h$ for an arbitrary constant.
Planck's assumption was not justified by any physical reasoning, but was merely a trick to make the math easier to handle. Later in his calculations Planck planned to remove this restriction by letting the constant $h$ go to zero. [...] Planck discovered that he got the same blue glow as everybody else when $h$ went to zero. However, much to his surprise, if he set $h$ to one particular value, his calculation matched the experiment exactly (and vindicated the experience of ironworkers everywhere). --- Nick Herbert, "Quantum Reality"
That's how Planck discovered Planck's constant, and as a side effect, quantum physics.
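The $h \to 0$ claim in the quote can be checked numerically. Here is a rough sketch (my addition, using the standard Planck radiance formula $B(f,T) = \frac{2hf^3}{c^2}\,\frac{1}{e^{hf/kT}-1}$, which the answer above does not spell out): shrinking $h$ toward zero reproduces the diverging classical prediction, while Planck's actual value keeps the high-frequency output finite.

```python
from math import exp

C = 2.998e8           # speed of light, m/s
K_B = 1.381e-23       # Boltzmann constant, J/K
H_PLANCK = 6.626e-34  # the "one particular value" that matched experiment, J*s

def planck_radiance(freq, temp, h):
    """Black-body spectral radiance, with h treated as a free parameter (as Planck did)."""
    return (2 * h * freq**3 / C**2) / (exp(h * freq / (K_B * temp)) - 1)

def rayleigh_jeans(freq, temp):
    """The classical prediction, i.e. the h -> 0 limit; it grows without bound with frequency."""
    return 2 * freq**2 * K_B * temp / C**2

temp = 5000.0                    # kelvin
for freq in (1e13, 1e14, 1e15):  # infrared -> visible -> ultraviolet
    print(f"f = {freq:.0e} Hz: "
          f"Planck's h: {planck_radiance(freq, temp, H_PLANCK):.3e}, "
          f"h / 1000: {planck_radiance(freq, temp, H_PLANCK / 1000):.3e}, "
          f"classical: {rayleigh_jeans(freq, temp):.3e}")
```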
Solution 5:
The idea that the motion of every celestial body must be a perfect circle, and that, when it wasn’t, the way to “save the appearances” was to add epicycles around the cycles, turned out not to be a useful theory.
However! It turns out that calculating epicycles (in a slightly different way than the Ptolemaic astronomers historically did) is equivalent to finding the Fourier transform of a planet's motion, converting it from the time domain to the frequency domain.
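Here is a small sketch of that equivalence (my illustration, with made-up radii and speeds): a path built from two uniformly rotating circles, a deferent plus one epicycle, is literally a two-term Fourier series, and the FFT recovers each circle (its speed, radius, and phase) from samples of the motion.

```python
import numpy as np

# A toy "planet" path in the complex plane: one big circle (the deferent)
# plus one smaller, faster circle (the epicycle). Radii and speeds are made up.
N = 1024
t = np.linspace(0.0, 1.0, N, endpoint=False)
orbit = 5.0 * np.exp(2j * np.pi * 1 * t) + 1.0 * np.exp(2j * np.pi * 7 * t)

# The discrete Fourier transform recovers each circle from the sampled motion:
# the bin index is how fast that circle turns, the coefficient its radius and phase.
coeffs = np.fft.fft(orbit) / N
freqs = np.fft.fftfreq(N, d=1.0 / N)   # integer revolutions per period
keep = np.abs(coeffs) > 1e-9
for f, c in zip(freqs[keep], coeffs[keep]):
    print(f"circle turning {f:+.0f} times per period, radius {abs(c):.2f}")

# Summing the recovered circles reproduces the motion: stacking epicycles on a
# deferent is the same thing as truncating a Fourier series.
rebuilt = sum(c * np.exp(2j * np.pi * f * t) for f, c in zip(freqs[keep], coeffs[keep]))
print("max reconstruction error:", np.max(np.abs(rebuilt - orbit)))
```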