Why can ALL quadratic equations be solved by the quadratic formula?

In algebra, all quadratic problems can be solved by using the quadratic formula. I have read a couple of books, and they tell me only HOW and WHEN to use this formula, but they don't tell me WHY I can use it. I have tried to figure it out by proving these two equations are equal, but I can't.

Why can I use $x = \dfrac{-b\pm \sqrt{b^{2} - 4 ac}}{2a}$ to solve all quadratic equations?


I would like to prove the Quadratic Formula in a cleaner way. Perhaps if teachers see this approach they will be less reluctant to prove the Quadratic Formula.

Added: I have recently learned from the book Sources in the Development of Mathematics: Series and Products from the Fifteenth to the Twenty-first Century (Ranjan Roy) that the method described below was used by the ninth century mathematician Sridhara. (I highly recommend Roy's book, which is much broader in its coverage than the title would suggest.)

We want to solve the equation $$ax^2+bx+c=0,$$ where $a \ne 0$. The usual argument starts by dividing by $a$. That is a strategic error: division is ugly and produces formulas that are unpleasant to typeset.

Instead, multiply both sides by $4a$. We obtain the equivalent equation $$4a^2x^2 +4abx+4ac=0.\tag{1}$$ Note that $4a^2x^2+4abx$ is almost the square of $2ax+b$. More precisely, $$4a^2x^2+4abx=(2ax+b)^2-b^2.$$ So our equation can be rewritten as $$(2ax+b)^2 -b^2+4ac=0 \tag{2}$$ or equivalently $$(2ax+b)^2=b^2-4ac. \tag{3}$$ Now it's all over. We find that $$2ax+b=\pm\sqrt{b^2-4ac} \tag{4}$$ and therefore $$x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}. \tag{5}$$
No fractions until the very end!
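
Steps (3) to (5) can also be followed numerically. The following is only a minimal Python sketch, not part of the original argument: the function name `solve_by_4a` is made up for illustration, and `cmath.sqrt` is an assumption chosen so that a negative $b^2-4ac$ yields complex roots instead of an error.

```python
import cmath

def solve_by_4a(a, b, c):
    """Solve a*x^2 + b*x + c = 0 following steps (3)-(5); hypothetical helper."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    # Step (3): (2ax + b)^2 = b^2 - 4ac
    rhs = b * b - 4 * a * c
    # Step (4): 2ax + b = +/- sqrt(b^2 - 4ac)
    s = cmath.sqrt(rhs)
    # Step (5): the only division, by 2a, happens at the very end
    return ((-b + s) / (2 * a), (-b - s) / (2 * a))

# Example: 2x^2 - 4x - 6 = 2(x - 3)(x + 1)
roots = solve_by_4a(2, -4, -6)
```

The results come back as complex numbers even when the roots are real, which is a consequence of using `cmath` rather than `math`.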

Added: I have tried to show that initial division by $a$, when followed by a completing the square procedure, is not the simplest strategy. One might remark additionally that if we first divide by $a$, we end up needing a couple of additional "algebra" steps to partly undo the division in order to give the solutions their traditional form.

Division by $a$ is definitely a right beginning if it is followed by an argument that develops the connection between the coefficients and the sum and product of the roots. Ideally, each type of proof should be presented, since each connects to an important family of ideas. And a twice proved theorem is twice as true.


Here is a slightly less ad-hoc approach to deriving the formula.

You look at the polynomial $ax^2+bx+c$ and you think of it as being composed of two kinds of indeterminates: coefficients $a$, $b$, $c$, and variable $x$. What you wish to do is this: writing $ax^2+bx+c=a(x-r_1)(x-r_2)$, you want to find expressions for $r_1$ and $r_2$ in terms of $a,b,c$ involving only the operations $+,-,\times,\div$ and $\sqrt[n]{}$.

But how are $r_1$ and $r_2$ related to $a,b$ and $c$? If you look at the expression $ax^2+bx+c=a(x-r_1)(x-r_2)$, it is easy to compute that $b=-a(r_1+r_2)$ and $c=ar_1r_2$.

Intuitively, because you know that $(r_1+r_2)=-\frac ba$, determining $r_1$ and $r_2$ is the same as determining $(r_1-r_2)$. Let $E=(r_1-r_2)$ and note that $2r_1=(r_1+r_2)+(r_1-r_2)=-\frac ba+E$ and $2r_2=(r_1+r_2)-(r_1-r_2)=-\frac ba-E$, so we already have most of our quadratic formula: $$r_1,r_2=\frac{-b}{2a}\pm\frac{E}2$$

All we need to do, then, is express $E=(r_1-r_2)$ using $+,-,\times,\div,\sqrt[n]{}$ in terms of $a,b,c$. In order to do this, we need to take a small detour to see what expressions in $+,-,\times,\div$ and $a,b,c$ could possibly be.

Note that the coefficients $b=-a(r_1+r_2)$ and $c=ar_1r_2$ are symmetric functions in $r_1$ and $r_2$ in the sense that if you exchange $r_1$ and $r_2$, the values of $b$ and $c$ do not change. Furthermore, $b$ and $c$ are in fact scalar multiples of the so-called elementary symmetric functions, which have the property that any symmetric function (in $2$ variables) can be expressed uniquely as a polynomial (quotient of polynomials for our purposes) in them.

In particular, we can "symmetrize" the quantity $E=(r_1-r_2)$ to obtain the discriminant $D=(r_1-r_2)^2$ which is in some sense "the smallest" symmetric function of $r_1$ and $r_2$ that becomes 0 if $r_1=r_2$. Technically, though, the above is the discriminant only when $a=1$ because our coefficients $b$ and $c$ are elementary symmetric functions scaled by $a$, so we define the general discriminant to be $D=a^2(r_1-r_2)^2$. Because $D$ is symmetric and $b$ and $c$ are (up to a multiplicative factor) elementary symmetric, we should be able to express $D$ as a polynomial in $b$ and $c$.

We do so in a somewhat ad-hoc manner (though there are algorithms that will do this procedurally): $$D=a^2(r_1-r_2)^2$$ so $$D=a^2(r_1^2-2r_1r_2+r_2^2)$$ hence $$D=a^2(r_1^2+2r_1r_2+r_2^2-4r_1r_2)$$ and finally $$D=a^2(r_1+r_2)^2-4a^2r_1r_2.$$ Since $a(r_1+r_2)=-b$ and $ar_1r_2=c$, this gives us $$D=b^2-4ac$$

Evidently, now we have that $\sqrt{D}=a(r_1-r_2)=aE$ and so $E=\frac{\sqrt{D}}a$. This allows us to rewrite our formula so far to get from $$r_1,r_2=\frac{-b}{2a}\pm\frac{E}{2}$$ to $$r_1,r_2=\frac{-b}{2a}\pm\frac{\sqrt{D}}{2a}$$ and finally $$r_1,r_2=\frac{-b\pm\sqrt{b^2-4ac}}{2a}$$
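
As a sanity check on the symmetric-function argument, one can start from chosen roots, build the coefficients, and confirm that the two expressions for $D$ agree. This is only an illustrative Python sketch using exact rational arithmetic; the helper name `coefficients_from_roots` and the sample values are assumptions, not part of the derivation.

```python
from fractions import Fraction

def coefficients_from_roots(a, r1, r2):
    """Expand a(x - r1)(x - r2) and return (a, b, c); hypothetical helper."""
    b = -a * (r1 + r2)
    c = a * r1 * r2
    return a, b, c

# Arbitrary sample values, chosen only for illustration
a, r1, r2 = Fraction(3), Fraction(5, 2), Fraction(-1, 3)
a, b, c = coefficients_from_roots(a, r1, r2)

# The discriminant written in the roots and in the coefficients
D_roots = a**2 * (r1 - r2) ** 2
D_coeffs = b**2 - 4 * a * c
assert D_roots == D_coeffs

# The sum -b/a together with E = r1 - r2 recovers both roots
E = r1 - r2
recovered = ((-b / a + E) / 2, (-b / a - E) / 2)
assert recovered == (r1, r2)
```

A single numeric check like this does not prove the identity, but it catches sign and scaling mistakes in the algebra.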


The only strange question is: why did we only have to take one square root in order to get the formula, i.e. why did the quantity $E=(r_1-r_2)$ turn out to be a square root of a nice polynomial in $a,b,c$? That is where modern Galois theory comes in.

What's really happening is this: the first four operations ($+,-,\times,\div$) suggest that you think of the coefficients as living in the field $F$ (a set of expressions such that adding, subtracting, multiplying, or dividing any two of them gives another expression in the set) consisting of $\{\dfrac {p(a,b,c)}{q(a,b,c)}\}$ where $p$ and $q$ are polynomials in three variables (with rational coefficients). Then $r_1$ and $r_2$ will generate an extension field $E$ of $F$, that is, the smallest field $E$ that contains $F$ and also $r_1$ and $r_2$. Galois theory says that this extension field $E$ will be a $2!=2$-dimensional vector space over $F$ and hence a single square root will be sufficient to generate $E$. Thus we need an expression in the coefficients (a symmetric expression in the roots) whose square root is an expression in the roots but is not symmetric. A natural choice is the most elementary anti-symmetric function, known as the Vandermonde determinant, which is precisely $(r_1-r_2)$ in this case. (Anti-symmetric means that swapping two variables flips the sign; the square of an anti-symmetric function is therefore a symmetric function.)

For general polynomials, the extension field will be of higher dimension, and so you will need to take possibly several roots of different orders. Galois theory allows us to compute what these roots ought to be and in what order (giving us the cubic and quartic formulas in a way that is not ad-hoc at all), and also shows that the general polynomial of degree $5$ or higher does not have a formula involving only $+,-,\times,\div,\sqrt[n]{}$. (Some people feel frightened by this, because taking roots should invert the raising of powers, but this is not the case because the order of operations matters...) Now, if the coefficients of the higher degree polynomial satisfy some additional relations (i.e. are not completely independent from each other), then Galois theory also gives procedures for computing formulas for those cases and also for determining what such relations ought to be.


Proof without words.

[Figure: completing the square]

This one shows that $$ax^2+bx+c=a\left(x+\dfrac b{2a}\right)^2+c-\dfrac{b^2}{4a}$$ from which the quadratic formula can be easily derived.

Credits to LucasVB.

I hope this helps.
Best wishes, $\mathcal H$akim.


Probably the easiest way to understand where the quadratic formula comes from is by 'completing the square': solving equations of the form '$x^2$=whatever' is easy, so let's see if we can put our quadratic equation ($ax^2+bx+c=0$) in that form.

The first thing to do is divide by $a$; of course this doesn't work if $a=0$, but if that's the case then our equation wasn't quadratic in the first place! This gives us $x^2+{b\over a}x+{c\over a}=0$. Now, that $b\over a$ term keeps us from having a clean square - but if we remember how to square a sum of two numbers - $(m+n)^2=m^2+2mn+n^2$ - then by substituting $x$ for $m$, we can see that our $n$ should be half of the coefficient of the linear term: $(x+{b\over 2a})^2 = x^2+{b\over a}x + {b^2\over 4a^2}$. But now the constant term isn't right; we have to adjust it to make it $c\over a$. A correction of $({c\over a}-{b^2\over 4a^2})$ will do this; we get $(x+{b\over 2a})^2+({c\over a}-{b^2\over 4a^2}) = 0$.

But this is exactly what we wanted; we can move that second term over to the right and get $(x+{b\over 2a})^2 = {b^2\over 4a^2}-{c\over a}$. Getting the right-hand side cleaned up a little bit makes it ${b^2-4ac\over 4a^2}$ - just multiply the numerator and denominator of $c\over a$ by $4a$ and combine terms. Now, we can go ahead and take the square root of both sides: $x+{b\over 2a} = \pm \sqrt{b^2-4ac\over 4a^2} = \pm {\sqrt{b^2-4ac}\over\sqrt{4a^2}} = {\pm\sqrt{b^2-4ac}\over 2a}$. (Strictly, $\sqrt{4a^2}=2|a|$, but the sign of $a$ is absorbed by the $\pm$.) The last step is to subtract $b\over 2a$ from both sides, finally giving the familiar: $x = {-b\pm\sqrt{b^2-4ac}\over 2a}$
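
To see the completing-the-square steps on a concrete equation, here is a small Python sketch that computes the two quantities in $(x+h)^2=k$ exactly and then takes the square root. The function name `complete_the_square`, the symbols `h` and `k`, and the example $2x^2+3x-2=0$ are all chosen for illustration and are not from the text above.

```python
from fractions import Fraction
import math

def complete_the_square(a, b, c):
    """Return (h, k) such that a*x^2 + b*x + c = 0 is equivalent to (x + h)^2 = k."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    h = b / (2 * a)                  # half of the linear coefficient b/a
    k = b**2 / (4 * a**2) - c / a    # right-hand side after moving the correction over
    return h, k

h, k = complete_the_square(2, 3, -2)  # 2x^2 + 3x - 2 = 0

# k agrees with the cleaned-up form (b^2 - 4ac) / (4a^2)
assert k == Fraction(3**2 - 4 * 2 * (-2), 4 * 2**2)

# This example has k >= 0, so a real square root is enough
root = math.sqrt(k)
solutions = (-float(h) + root, -float(h) - root)
```

For an equation with $k<0$, `math.sqrt` would raise an error, so a complex square root would be needed instead.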


The other answers tell you where the formula "comes from" (namely, from completing the square). If you are just happy checking that the formula gives the correct solutions whatever $a$, $b$ and $c$, you may verify that the identity $$ aX^2+bX+c=a\left(X-\frac{-b+\sqrt{b^2-4ac}}{2a}\right)\left(X-\frac{-b-\sqrt{b^2-4ac}}{2a}\right) $$ holds for every $a$, $b$ and $c$.
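
A numerical spot check of this identity (not a proof) can be sketched in Python by expanding the right-hand side and comparing coefficients. The function name `expand_factored_form` and the sample coefficient triples are assumptions made for illustration; `cmath.sqrt` is used so that negative discriminants are handled too.

```python
import cmath

def expand_factored_form(a, b, c):
    """Expand a(X - r_plus)(X - r_minus) and return its three coefficients."""
    s = cmath.sqrt(b * b - 4 * a * c)
    r_plus = (-b + s) / (2 * a)
    r_minus = (-b - s) / (2 * a)
    # a(X - r_plus)(X - r_minus) = a X^2 - a(r_plus + r_minus) X + a r_plus r_minus
    return a, -a * (r_plus + r_minus), a * r_plus * r_minus

# Sample triples with positive and negative discriminants
for coeffs in [(1, -3, 2), (2, 4, 5), (-1, 0, 7)]:
    expanded = expand_factored_form(*coeffs)
    # Compare with a tolerance, since floating-point square roots are inexact
    assert all(abs(e - k) < 1e-12 for e, k in zip(expanded, coeffs))
```

The general verification is the same computation done symbolically: the sum of the two roots is $-\frac ba$ and their product is $\frac{b^2-(b^2-4ac)}{4a^2}=\frac ca$.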