CSB inequality: is $\|x\|^2\|y\|^2 - \langle x,y \rangle^2$ a square in any obvious way?

Suppose $x=(x_1,x_2),y = (y_1,y_2) \in \mathbb{R}^2$. I noticed that \begin{align*} \|x\|^2 \|y\|^2 - \langle x,y \rangle^2 &= x_1^2y_1^2 + x_1^2 y_2^2 + x_2^2 y_1^2 + x_2^2 y_2 ^2 - (x_1^2 y_1^2 + 2 x_1 y_1 x_2 y_2 + x_2^2 y_2^2) \\ &=(x_1 y_2)^2 - 2x_1 y_2 x_2 y_1 + (x_2 y_1)^2 \\ &=(x_1 y_2 - x_2 y_1)^2 \end{align*} which proves the CSB inequality in dimension two. This raises the question:

If $x = (x_1,\ldots,x_n),y=(y_1,\ldots,y_n) \in \mathbb{R}^n$, is there a polynomial $p \in \mathbb{R}[x_1,\ldots,x_n;y_1,\ldots,y_n]$ such that $ \|x\|^2 \|y\|^2 - \langle x,y \rangle^2 = p^2$?


No: for $n \ge 3$ it is not the square of a polynomial. The right generalization, which still proves that it is nonnegative, is that it is a sum of squares.

For instance, for $n = 3$, $$ \begin{align*} \|x\|^2 \|y\|^2 - \langle x,y \rangle^2 &= (x_1y_2 - x_2y_1)^2 + (x_2y_3-x_3y_2)^2 + (x_3y_1 - x_1y_3)^2, \end{align*} $$

and in general $$ \begin{align*} \|x\|^2 \|y\|^2 - \langle x,y \rangle^2 &= \sum_{i < j}(x_iy_{j} - x_{j}y_{i})^2 \end{align*} $$


This is easy to prove algebraically: the left hand side is

$$ \begin{align*} \|x\|^2 \|y\|^2 - \langle x,y \rangle^2 &= (\sum{x_i^2}\sum{y_j^2}) - (\sum{x_iy_i})^2 \\ &= \sum_{i=j}{x_i^2 y_j^2} + \sum_{i\neq j}{x_i^2y_j^2} - \sum_{i=j}{x_iy_ix_iy_i} - \sum_{i\neq j}{x_iy_ix_jy_j} \\ &= \sum_{i<j}{(x_i^2 y_j^2 + x_j^2 y_i^2)} - \sum_{i<j}{2x_iy_ix_jy_j} \\ &= \sum_{i<j}{(x_i^2 y_j^2 - 2x_iy_jx_jy_i + x_j^2y_i^2)} \\ &= \sum_{i < j}(x_iy_{j} - x_{j}y_{i})^2 \end{align*} $$

This identity is known as Lagrange's identity.
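The identity is easy to spot-check with exact integer arithmetic. The following is a minimal sketch in Python; the function names `lhs` and `rhs` are illustrative choices, not part of any library.

```python
from itertools import combinations
import random


def lhs(x, y):
    """Return ||x||^2 ||y||^2 - <x, y>^2."""
    dot = sum(a * b for a, b in zip(x, y))
    return sum(a * a for a in x) * sum(b * b for b in y) - dot * dot


def rhs(x, y):
    """Return the sum over i < j of (x_i y_j - x_j y_i)^2."""
    return sum((x[i] * y[j] - x[j] * y[i]) ** 2
               for i, j in combinations(range(len(x)), 2))


# Compare both sides on random integer vectors in dimensions 1 to 6.
random.seed(0)
for n in range(1, 7):
    x = [random.randint(-9, 9) for _ in range(n)]
    y = [random.randint(-9, 9) for _ in range(n)]
    assert lhs(x, y) == rhs(x, y)
```

A random check like this is of course not a proof, but it catches sign or index slips in the derivation.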


This also shows that the left-hand side is zero exactly when $x_iy_j - x_jy_i = 0$ for all pairs $(i,j)$, i.e., $$\frac{y_i}{x_i} = \frac{y_j}{x_j}$$ (let's assume the $x_j$s are nonzero, for now), which is another way of saying that one vector is a multiple of the other. In other words, equality holds in the inequality exactly when the two vectors are parallel.
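As a small illustration of the equality case, the sketch below lists the quantities $x_iy_j - x_jy_i$ for a parallel pair and a non-parallel pair. The helper name `minors` and the sample vectors are made up for this example; note that working with $x_iy_j - x_jy_i$ directly avoids the nonzero assumption needed for the ratios.

```python
from itertools import combinations


def minors(x, y):
    """Return the list of x_i*y_j - x_j*y_i over all pairs i < j."""
    return [x[i] * y[j] - x[j] * y[i]
            for i, j in combinations(range(len(x)), 2)]


x = [1, -2, 3, 0]           # has a zero entry, so y_i / x_i is not always defined
y = [3 * a for a in x]      # parallel to x
z = [1, -2, 3, 1]           # not parallel to x

assert all(m == 0 for m in minors(x, y))   # equality case: every term vanishes
assert any(m != 0 for m in minors(x, z))   # strict inequality: some term survives
```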


For the former (showing that it is not the square of a polynomial), consider for instance $n=3$. If $\|x\|^2 \|y\|^2 - \langle x,y \rangle^2$ is the square of a polynomial $p(x_1, x_2, x_3, y_1, y_2, y_3)$, then, since the expression has degree $2$ in $x_1$, $p$ has degree at most $1$ in $x_1$ and we can write it as $qx_1 + r$, where $q = q(x_2, x_3, y_1, y_2, y_3)$ and $r = r(x_2, x_3, y_1, y_2, y_3)$ are polynomials that don't depend on $x_1$. As $(qx_1+r)^2 = q^2x_1^2 + 2qrx_1 + r^2$, the coefficient of $x_1^2$ should be a square, but the coefficient is $y_1^2 + y_2^2 + y_3^2 - y_1^2 = y_2^2 + y_3^2$ (or in general, $\sum_{i=2}^{n}y_i^2$), which is not the square of a polynomial. (Proved similarly: if $y_2^2 + y_3^2$ is the square of $qy_2 + r$, with $q$ and $r$ not depending on $y_2$, then comparing coefficients of $y_2^2$ gives $q \equiv \pm 1$, and comparing coefficients of $y_2$ gives $r \equiv 0$, which is not consistent with the remaining term $r^2 = y_3^2$.)
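The claim about the coefficient of $x_1^2$ can be checked numerically. The sketch below is an illustration, not part of the argument: for a quadratic $f(t) = at^2 + bt + c$, the second difference $f(1) - 2f(0) + f(-1)$ equals $2a$, so it recovers the coefficient of $x_1^2$ with the other entries held fixed. The names `gram_det` and `x1_squared_coeff` are hypothetical helpers.

```python
def gram_det(x, y):
    """Return ||x||^2 ||y||^2 - <x, y>^2."""
    dot = sum(a * b for a, b in zip(x, y))
    return sum(a * a for a in x) * sum(b * b for b in y) - dot * dot


def x1_squared_coeff(x_rest, y):
    """Coefficient of x_1^2, with x = [x_1] + x_rest and y held fixed.

    Uses the second difference f(1) - 2 f(0) + f(-1) = 2a for a quadratic f.
    """
    def f(t):
        return gram_det([t] + x_rest, y)
    return (f(1) - 2 * f(0) + f(-1)) // 2


y = [2, 3, 4]
# The coefficient should be y_2^2 + y_3^2, independent of x_2 and x_3.
assert x1_squared_coeff([5, -7], y) == y[1] ** 2 + y[2] ** 2
assert x1_squared_coeff([0, 1], y) == y[1] ** 2 + y[2] ** 2
```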


As Erick Wong points out in the comments, this is related to (the solution of) Hilbert's seventeenth problem, which says that any polynomial that takes only nonnegative values can be written as a sum of squares of rational functions. If we only care about representations as sums of squares of polynomials, any nonnegative polynomial can be approximated as closely as desired with a sum of squares of polynomials. See e.g. the book Positive Polynomials and Sums of Squares.


$\sum x_i^2\sum y_i^2-(\sum x_iy_i)^2$

$=\sum x_i^2y_i^2+\sum_{i\neq j} x_i^2y_j^2-\sum x_i^2y_i^2-\sum_{i\neq j} x_iy_ix_jy_j$

$=\sum_{i<j} (x_iy_j-x_jy_i)^2$


When $n\ge3$, $\|x\|^2\|y\|^2 - \langle x,y\rangle^2$ is not the square of any polynomial $p(x,y)$. Keep all entries other than $x_1$ fixed and let \begin{align*} q(x_1) &= \|x\|^2\|y\|^2 - \langle x,y\rangle^2,\\ \Rightarrow q\,'(x_1) &= 2(x_1 \|y\|^2 - y_1\langle x,y\rangle). \end{align*} If $q$ is the square of a real polynomial $p$ in $x_1$, then $q\,' = 2pp\,'$; in particular, when $q$ is quadratic, $p$ is linear and has a real root, so some zero of $q\,'$ must be a zero of $q$. However, when $x=(x_1,1,0,0,\ldots,0)$ and $y=(1,0,1,0,0,\ldots,0)$, we have $q\,'(x_1)=2x_1$ and $q(x_1)=2(x_1^2+1)-x_1^2 = x_1^2+2$. So, the only zero of $q\,'$ is $x_1=0$, but $q(0)=2\neq0$. Therefore $q$ is not a squared polynomial.
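The specialization used here can be verified directly. The following minimal sketch (the function name `q` mirrors the notation above, and the parameter `n` is an assumption added so the check covers several dimensions) confirms that $q(x_1) = x_1^2 + 2$ for these choices of $x$ and $y$, so $q$ has no real zero.

```python
def q(x1, n=3):
    """Evaluate ||x||^2 ||y||^2 - <x, y>^2 at x = (x1, 1, 0, ..., 0), y = (1, 0, 1, 0, ..., 0)."""
    x = [x1, 1] + [0] * (n - 2)
    y = [1, 0, 1] + [0] * (n - 3)
    dot = sum(a * b for a, b in zip(x, y))
    return sum(a * a for a in x) * sum(b * b for b in y) - dot * dot


# q(x1) should equal x1^2 + 2 in every dimension n >= 3.
for n in range(3, 7):
    for t in range(-5, 6):
        assert q(t, n) == t * t + 2

assert q(0) == 2  # the only zero of q' is x1 = 0, and q does not vanish there
```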