Why is one proof of the Cauchy-Schwarz inequality easy, while proving it directly is hard?

I thought I never did it directly, but now that I have found the solution below (quite quickly), I begin to suspect that I must have done something similar years ago.

Anyway, as a first step, let's square everything. Then we need to prove this: $$ \left(\sum_i x_i y_i\right)^2 \leq \left(\sum_i x_i^2\right)\left(\sum_j y_j^2\right). $$ Let's subtract the left from the right and open all the parentheses: $$ \begin{align*} \left(\sum_i x_i^2\right)\left(\sum_j y_j^2\right) - \left(\sum_i x_i y_i\right)^2 & = \sum_{i,j}x_i^2 y_j^2 - \sum_{i, j}x_i y_i x_j y_j \\ & = \sum_{i \neq j} x_i^2 y_j^2 - \sum_{i \neq j} x_i y_i x_j y_j \\ & = \sum_{i < j} (x_i^2 y_j^2 + x_j^2y_i^2 - 2 x_i y_i x_j y_j) \\ & = \sum_{i < j} (x_i y_j - x_j y_i)^2 \end{align*} $$ Here indices $i$ and $j$ always iterate from $1$ to $n$; in the second step the diagonal terms with $i = j$ cancel, and in the third step the terms for $(i, j)$ and $(j, i)$ are paired up. We see that this is a sum of squares, so it is nonnegative, proving the original inequality.
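Just to illustrate (this is of course not part of the proof): here is a minimal numerical sanity check of the final identity in Python, with randomly chosen vectors.

```python
# Numerical sanity check of the sum-of-squares identity above
# (a sketch, not a proof): compute both sides for random vectors.
import random

n = 6
x = [random.uniform(-5, 5) for _ in range(n)]
y = [random.uniform(-5, 5) for _ in range(n)]

# (sum_i x_i^2)(sum_j y_j^2) - (sum_i x_i y_i)^2
lhs = sum(a * a for a in x) * sum(b * b for b in y) \
      - sum(a * b for a, b in zip(x, y)) ** 2

# sum over i < j of (x_i y_j - x_j y_i)^2
rhs = sum((x[i] * y[j] - x[j] * y[i]) ** 2
          for i in range(n) for j in range(i + 1, n))

assert abs(lhs - rhs) < 1e-9 * max(1.0, abs(lhs))  # equal up to rounding
```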

UPDATE: the very same thing can be done for complex numbers. We want to prove this: $$ \left|\sum_i x_i \overline{y_i}\right| \leq \sqrt{\sum_i x_i \overline{x_i}} \cdot \sqrt{\sum_j y_j \overline{y_j}}. $$ Let us square everything, keeping in mind that $|z|^2 = z\overline{z}$: $$ \left( \sum_i x_i \overline{y_i} \right) \left(\sum_j \overline{x_j} y_j\right) \leq \left( \sum_i x_i \overline{x_i} \right) \left(\sum_j y_j \overline{y_j}\right) $$ As before, we subtract the left from the right: $$ \begin{align*} & \left( \sum_i x_i \overline{x_i} \right) \left(\sum_j y_j \overline{y_j}\right) - \left( \sum_i x_i \overline{y_i} \right) \left(\sum_j \overline{x_j} y_j\right) \\ & = \sum_{i,j}x_i\overline{x_i}y_j\overline{y_j} - \sum_{i, j}x_i \overline{y_i} \overline{x_j} y_j \\ & = \sum_{i \neq j} x_i\overline{x_i}y_j\overline{y_j} - \sum_{i \neq j} x_i \overline{y_i} \overline{x_j} y_j \\ & = \sum_{i < j} (x_i \overline{x_i} y_j \overline{y_j} + x_j \overline{x_j} y_i \overline{y_i} - x_i \overline{y_i} \overline{x_j} y_j - x_j \overline{y_j} \overline{x_i} y_i) \\ & = \sum_{i < j} |x_i y_j - x_j y_i|^2 \end{align*} $$ Again the diagonal terms $i = j$ cancel, and we are left with a sum of squared absolute values, which is real and nonnegative. Done.
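The complex identity can be sanity-checked the same way (again only a numerical sketch, not a proof):

```python
# Numerical check of the complex identity above with random vectors.
import random

n = 5
def rnd():
    return complex(random.uniform(-3, 3), random.uniform(-3, 3))

x = [rnd() for _ in range(n)]
y = [rnd() for _ in range(n)]

sx = sum(a * a.conjugate() for a in x)                # sum_i x_i conj(x_i)
sy = sum(b * b.conjugate() for b in y)                # sum_j y_j conj(y_j)
cross = sum(a * b.conjugate() for a, b in zip(x, y))  # sum_i x_i conj(y_i)

lhs = (sx * sy - cross * cross.conjugate()).real
rhs = sum(abs(x[i] * y[j] - x[j] * y[i]) ** 2
          for i in range(n) for j in range(i + 1, n))

assert abs(lhs - rhs) < 1e-9 * max(1.0, abs(lhs))
```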


It's not hard. In fact, the first proof that I encountered in high school did not use inner products. It goes as follows:

Consider the quadratic in $x$: $$\sum_{i=1}^{n} (a_ix+b_i)^2 = \left(\sum_{i=1}^{n}a_i^2\right)x^2 + 2 \left(\sum_{i=1}^{n}a_ib_i\right)x + \sum_{i=1}^{n}b_i^2$$

Since the quadratic expression is always nonnegative, its discriminant must be $\leq 0$ [1] (if all $a_i$ are zero the inequality is trivial, so we may assume the leading coefficient $\sum_i a_i^2$ is positive), i.e. $$\left(\sum_{i=1}^{n}a_ib_i\right)^2 - \sum_{i=1}^{n}a_i^2\sum_{i=1}^{n}b_i^2 \leq 0$$

from which the inequality follows. Equality occurs iff the quadratic has a root, i.e. iff there is a single $x$ with $$x = -\frac{b_i}{a_i} \quad \text{for all } i,$$ in other words, iff the sequences $(a_i)$ and $(b_i)$ are proportional.

[1] This follows because its non-negativity implies that it never crosses the x-axis (which means that it has either no real roots, in which case the discriminant is negative, or it has a double root, in which case the discriminant is 0).
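To see the discriminant argument in action, here is a small numerical sketch (random coefficients; the tolerance is only there for floating-point rounding):

```python
# The quadratic q(t) = sum_i (a_i t + b_i)^2 is nonnegative everywhere,
# so its quarter-discriminant (sum a_i b_i)^2 - (sum a_i^2)(sum b_i^2)
# must be <= 0. A numerical sketch, not a proof.
import random

n = 7
a = [random.uniform(-4, 4) for _ in range(n)]
b = [random.uniform(-4, 4) for _ in range(n)]

def q(t):
    return sum((ai * t + bi) ** 2 for ai, bi in zip(a, b))

disc = sum(ai * bi for ai, bi in zip(a, b)) ** 2 \
       - sum(ai * ai for ai in a) * sum(bi * bi for bi in b)

assert all(q(random.uniform(-100, 100)) >= 0 for _ in range(1000))
assert disc <= 1e-12  # nonpositive, up to floating-point rounding
```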


It seems weird that it should be easier to define a lot of new terms just to prove an inequality.

In general you can find many examples of problems (even inequalities) that are solved more easily with the use of some mathematical machinery.

Dan Shved gave an excellent answer to your question. Let me just add that you could prove the inequality for $n=2$ and then use induction on $n$.
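For instance, the induction step can be filled in as follows (a sketch of one possible route, not spelled out above): assuming the inequality for some $n$ and for the case $n=2$, $$ \sum_{i=1}^{n+1} |x_i y_i| \leq \sqrt{\sum_{i=1}^{n} x_i^2}\sqrt{\sum_{i=1}^{n} y_i^2} + |x_{n+1}||y_{n+1}| \leq \sqrt{\sum_{i=1}^{n+1} x_i^2}\sqrt{\sum_{i=1}^{n+1} y_i^2}, $$ where the second step is exactly the $n=2$ case applied to the pairs $\left(\sqrt{\sum_{i=1}^{n} x_i^2},\ |x_{n+1}|\right)$ and $\left(\sqrt{\sum_{i=1}^{n} y_i^2},\ |y_{n+1}|\right)$, and $\left|\sum_i x_i y_i\right| \leq \sum_i |x_i y_i|$ gives the signed version.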

The trick in Sandeep Thilakan's answer can be used to prove the inequality in any inner product space.

A very nice didactic book about inequalities is *The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities* by J. Michael Steele. Its first chapter is very relevant to your question.


Essentially, Dan's proof doesn't avoid inner products, since he proved that $\|x\|^2\|y\|^2-\langle x,y\rangle^2$ equals the Gram determinant of $(x,y)$. So what qualifies my statement as an answer? Even if you don't use the notation of an inner product, it's inherently present all the same.


Here is another variant, which nicely illustrates how it is sometimes sufficient to prove a seemingly weaker inequality by exploiting its symmetries:

By Young's inequality (which is a simple consequence of $(\lvert x_k\rvert-\lvert y_k\rvert)^2\geq 0$ in this case), we have $\lvert x_k y_k\rvert\leq \frac 1 2x_k^2+\frac 1 2 y_k^2$. This implies $$ \left\lvert\sum_k x_k y_k\right\rvert\leq \frac 1 2 \sum_k x_k^2+\frac 1 2 \sum_k y_k^2. $$ Of course, the left side stays unchanged if we replace $x_k$ by $\lambda^{1/2} x_k$ and $y_k$ by $\lambda^{-1/2} y_k$ for $\lambda>0$. Thus $$ \left\lvert\sum_k x_k y_k\right\rvert\leq \frac {\lambda} 2 \sum_k x_k^2+\frac 1{2\lambda} \sum_k y_k^2. $$ In particular, if $\lambda=\left(\sum_k x_k^2\right)^{-1/2}\left(\sum_k y_k^2\right)^{1/2}$ [see remark below], then $$ \left\lvert\sum_k x_k y_k\right\rvert\leq \left(\sum_k x_k^2\right)^{1/2}\left(\sum_k y_k^2\right)^{1/2}. $$ The same argument works in arbitrary inner product spaces, except that one uses the Young-type inequality $\lvert\langle x,y\rangle\rvert\leq \frac 1 2 \lVert x\rVert^2+\frac 1 2 \lVert y\rVert^2$, which is an easy consequence of the semi-definiteness of the inner product.
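In code, the scaling trick looks like this (a numerical sketch only; `lam_star` is the choice of $\lambda$ from the display above, which requires $x \neq 0$):

```python
# The lambda-dependent Young bound majorizes the left side for every
# lambda > 0; the stated choice of lambda makes it equal the
# Cauchy-Schwarz bound. A sketch with random vectors.
import math, random

n = 6
x = [random.uniform(-2, 2) for _ in range(n)]
y = [random.uniform(-2, 2) for _ in range(n)]

lhs = abs(sum(a * b for a, b in zip(x, y)))
sx = sum(a * a for a in x)
sy = sum(b * b for b in y)

def young_bound(lam):
    # right side of the lambda-dependent inequality
    return lam / 2 * sx + sy / (2 * lam)

lam_star = math.sqrt(sy / sx)  # (sum x^2)^(-1/2) (sum y^2)^(1/2)
best = young_bound(lam_star)

assert abs(best - math.sqrt(sx * sy)) < 1e-9   # the Cauchy-Schwarz bound
assert lhs <= best + 1e-12
# every other lambda gives a weaker (larger) bound:
assert all(young_bound(lam_star * t) >= best - 1e-12 for t in (0.5, 2.0, 10.0))
```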

Using Young's inequality $\lvert x_k y_k\rvert\leq \frac 1 p \lvert x_k\rvert^p+\frac 1 q \lvert y_k\rvert^q$ with dual exponents $p$ and $q$, one can also obtain Hölder's inequality along the same lines.
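The same numerical sketch adapts to the Hölder case, with the minimizing $\lambda$ computed from the dual exponents (again purely illustrative, and assuming $x \neq 0$):

```python
# Young with dual exponents p, q gives a lambda-family of bounds;
# minimizing over lambda yields Hölder's bound. A numerical sketch.
import random

p, q = 3.0, 1.5                 # dual exponents, 1/p + 1/q = 1
n = 6
x = [random.uniform(-2, 2) for _ in range(n)]
y = [random.uniform(-2, 2) for _ in range(n)]

lhs = abs(sum(a * b for a, b in zip(x, y)))
sp = sum(abs(a) ** p for a in x)   # sum |x_k|^p (assumed nonzero here)
sq = sum(abs(b) ** q for b in y)   # sum |y_k|^q

def bound(lam):
    # Young applied termwise to (lam * x_k) * (y_k / lam)
    return lam ** p / p * sp + sq / (q * lam ** q)

holder = sp ** (1 / p) * sq ** (1 / q)   # Hölder's bound
lam_star = (sq / sp) ** (1 / (p + q))    # minimizer of bound(lam)

assert abs(bound(lam_star) - holder) < 1e-9
assert lhs <= holder + 1e-12
```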

Remark: Of course $x$ should not be zero here. The case $x=0$ can either be treated separately or one can take $\lambda=\left(\sum_k x_k^2+\epsilon\right)^{-1/2}\left(\sum_k y_k^2\right)^{1/2}$ and let $\epsilon\searrow 0$ in the end. Also, this choice of $\lambda$ is not arbitrary: it is the value you get if you minimize the right side in $\lambda$. In this sense Cauchy-Schwarz is the optimal form of Young's inequality with respect to the dilation symmetry $x\mapsto \lambda x$, $y\mapsto \lambda^{-1}y$ of the left side.