Optimization problem (in linear algebra course!)

Let $a_1, a_2, \ldots, a_n$ be real numbers such that $a_1 + \cdots + a_n = 0$ and $a_1^2 + \cdots +a_n^2 = 1$. What is the maximum value of $a_1a_2 + a_2a_3 + \cdots + a_{n - 1}a_n + a_na_1$?

I'd like to emphasize that this was found in a random linear algebra exercise sheet so one might expect that there exists a clever solution based on matrix manipulation...

Personally I tried to approach the problem geometrically and analytically. Notably, $n=1$ has no meaning, $n=2$ gives $-1$, and $n=3$ gives $-1/2$. This did not reveal much about the general case, except that the Lagrangian approach (the system of partial derivatives and all that jazz) seems to imply that ANY combination satisfying the constraints gives the same value (the value I'm trying to maximize) - but this needs further checking.
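A quick numerical probe of the small cases (pure Python; `cyclic_sum` and `random_feasible` are my own helper names) confirms the pattern, and also shows that the "same value everywhere" phenomenon stops at $n=3$:

```python
import math
import random

def cyclic_sum(a):
    """The objective a1*a2 + a2*a3 + ... + a_{n-1}*a_n + a_n*a_1."""
    n = len(a)
    return sum(a[i] * a[(i + 1) % n] for i in range(n))

def random_feasible(n):
    """A random point with sum(a_i) = 0 and sum(a_i^2) = 1:
    subtract the mean from a Gaussian vector, then normalize."""
    v = [random.gauss(0, 1) for _ in range(n)]
    m = sum(v) / n
    v = [x - m for x in v]
    r = math.sqrt(sum(x * x for x in v))
    return [x / r for x in v]

random.seed(1)
# n = 2 and n = 3: the objective is constant on the whole feasible set.
assert all(abs(cyclic_sum(random_feasible(2)) + 1.0) < 1e-9 for _ in range(200))
assert all(abs(cyclic_sum(random_feasible(3)) + 0.5) < 1e-9 for _ in range(200))
# n = 5: it is NOT constant, and random samples stay below cos(2*pi/5).
vals = [cyclic_sum(random_feasible(5)) for _ in range(2000)]
assert max(vals) - min(vals) > 0.1
assert max(vals) <= math.cos(2 * math.pi / 5) + 1e-9
```

So the "any feasible combination gives the same value" observation really is special to $n\le3$; from $n=4$ on the Lagrangian conditions alone no longer pin the value down.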

Back to the linear algebra: I see traces of matrices, but I would need some reasonably simple way to express $A$ and $B$ (see below) in terms of one another before anything useful can be done...

$$A = \operatorname{diag}(a_1,a_2,\ldots,a_{n-1},a_n);$$ $$B = \operatorname{diag}(a_2,a_3,\ldots,a_{n-1},a_n,a_1);$$ With these, the target expression is exactly $\operatorname{tr}(AB)$.

P.S. The problem was originally taken from this very forum, but it's quite an old post and I don't seem to be able to leave a comment there.


EDIT: So here we go with a complete rewrite in the language of linear algebra.

In $V=\mathbb C^n$ with standard scalar product $\langle e_j,e_k\rangle=\delta_{jk}$ consider the linear map $$\begin{matrix}\phi\colon &V&\longrightarrow &V\\ &(x_1,x_2,\ldots,x_{n-1},x_n)&\longmapsto&(x_2,x_3,\ldots,x_n,x_1), \end{matrix}$$ that is $e_1\mapsto e_n$ and $e_k\mapsto e_{k-1}$ for $1<k\le n$. The problem statement asks us to maximize $\langle \phi a,a\rangle$ subject to the conditions

$$\begin{align}\tag{c1}\langle a,v_n\rangle &= 0,\\ \tag{c2}\langle a,a\rangle&=1,&\text{and}\\ \tag{c3}a&\in \mathbb R^n.\end{align}$$

Let $\zeta\in\mathbb C$ be a primitive $n$th root of unity, for example $\zeta=e^{\frac{2\pi i}n}=\cos\frac{2\pi}{n}+i\sin\frac{2\pi}{n}$. For $k\in\mathbb Z$ let $$v_k=\sum_{\nu=1}^n \zeta^{\nu k}e_\nu=(\zeta^k,\zeta^{2k},\ldots ,\zeta^{nk}),$$ where the index may be read mod $n$, so that in particular $v_n=v_0=(1,1,\ldots,1)$. Then $$\langle v_k,v_k\rangle = n,\qquad\langle v_k,v_j\rangle=0\quad\text{for }j\not\equiv k\pmod n,\qquad \phi(v_k)=\zeta^kv_k,$$ i.e. the $v_k$ are an orthogonal eigenvector basis of $V$. Thus if $$\tag1 a=\sum_{k=1}^n c_kv_k$$ with $c_k\in\mathbb C$, then $$\langle a,a\rangle = n\sum_{k=1}^n|c_k|^2\qquad\text{and}\qquad\langle \phi a,a\rangle = n\sum_{k=1}^n \zeta^k|c_k|^2.$$
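These identities are easy to confirm numerically; a small pure-Python sketch for $n=7$ (helper names mine):

```python
import cmath

n = 7
zeta = cmath.exp(2j * cmath.pi / n)   # primitive n-th root of unity

def v(k):
    # v_k = (zeta^k, zeta^{2k}, ..., zeta^{nk})
    return [zeta ** (nu * k) for nu in range(1, n + 1)]

def shift(x):
    # phi: (x_1, ..., x_n) -> (x_2, ..., x_n, x_1)
    return x[1:] + x[:1]

def inner(x, y):
    # standard Hermitian inner product <x, y> = sum_j x_j * conj(y_j)
    return sum(p * q.conjugate() for p, q in zip(x, y))

for k in range(n):
    # <v_k, v_k> = n
    assert abs(inner(v(k), v(k)) - n) < 1e-9
    # phi(v_k) = zeta^k * v_k
    assert all(abs(s - zeta ** k * c) < 1e-9
               for s, c in zip(shift(v(k)), v(k)))
    for j in range(k):
        # <v_k, v_j> = 0 for j != k
        assert abs(inner(v(k), v(j))) < 1e-9
```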

Condition (c$3$) implies in particular that $\langle \phi a,a\rangle \in\mathbb R$, and condition (c$1$) says simply that $c_n=0$ in $(1)$. From this we obtain the bound $$\begin{align}\langle \phi a,a\rangle& =n\sum_{k=1}^{n} \zeta^k|c_k|^2\\& =n\sum_{k=1}^{n-1} \zeta^k|c_k|^2\\&=n\Re\left(\sum_{k=1}^{n-1} \zeta^k|c_k|^2\right)\\&=n\sum_{k=1}^{n-1} \Re(\zeta^k)|c_k|^2\\\tag2&\le \max\bigl\{\Re(\xi)\mid \xi^n=1,\xi\ne1\bigr\}\cdot n\sum_{k=1}^{n-1}|c_k|^2\\&=\cos\frac{2\pi}{n}\cdot\langle a,a\rangle.\end{align}$$ Note that for $n>2$ the special choice $a=\frac 1{\sqrt {2n}} (v_1+v_{n-1})=\frac 1{\sqrt {2n}} (v_1+\overline{v_1})$ yields $a\in \mathbb R^n$, $\langle a,a\rangle=\frac 1{2n}(n+n)=1$, and equality in $(2)$; for $n=2$, where $v_1=v_{n-1}$, take $a=\frac 1{\sqrt 2}v_1$ instead. Therefore $$\max\bigl\{\langle \phi a,a\rangle\mid a\in\mathbb R^n,\langle a,a\rangle=1,\langle a,v_n\rangle=0\bigr\}= \cos\frac{2\pi}{n} .$$ Note that for $n=2$ and $n=3$ we recover the values $\cos\pi=-1$ and $\cos\frac{2\pi}3=-\frac12$, confirming what you found.
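As a sanity check on this result (a pure-Python sketch; the function name `check_maximizer` is mine): the real vector with entries $a_\nu=\sqrt{2/n}\,\cos\frac{2\pi\nu}{n}$, which is $v_1+\overline{v_1}$ normalized to unit length, satisfies both constraints and attains exactly $\cos\frac{2\pi}{n}$:

```python
import math

def check_maximizer(n):
    # Explicit maximizer a_nu = sqrt(2/n) * cos(2*pi*nu/n), nu = 1..n,
    # i.e. v_1 + conj(v_1) scaled to unit norm.
    a = [math.sqrt(2.0 / n) * math.cos(2 * math.pi * nu / n)
         for nu in range(1, n + 1)]
    assert abs(sum(a)) < 1e-9                       # (c1): entries sum to zero
    assert abs(sum(x * x for x in a) - 1) < 1e-9    # (c2): unit norm
    value = sum(a[i] * a[(i + 1) % n] for i in range(n))
    assert abs(value - math.cos(2 * math.pi / n)) < 1e-9  # attains the bound
    return value

for n in (3, 4, 5, 6, 10, 101):
    check_maximizer(n)
```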


The problem can be solved in terms of linear algebra.

There is an orthonormal basis $(e_0,e_1,\ldots,e_{n-1})$ of $\Bbb R^n$ that diagonalizes the quadratic form $$f(x):=\sum_i x_i x_{i+1}\ ,$$ indices read mod $n$. When $(\xi_0,\ldots,\xi_{n-1})$ are the corresponding coordinates, the form $f$ appears as $$\tilde f(\xi)=\sum_{i=0}^{n-1}\lambda_i \xi_i^2\ .$$ The $e_i$ are eigenvectors of the circulant matrix $$A:={1\over2}\left[\matrix{ 0&1&0&0&0&1 \cr 1&0&1&0&0&0 \cr 0&1&0&1&0&0 \cr 0&0&1&0&1&0 \cr 0&0&0&1&0&1 \cr 1&0&0&0&1&0 \cr}\right]\ ,$$ drawn here for the case $n=6$, and the $\lambda_i$ are the corresponding (real) eigenvalues. Of course $e_0={1\over\sqrt{n}}(1,1,\ldots,1)$. As all $e_i$ with $i\geq1$ are orthogonal to $e_0$ these $e_i$ automatically satisfy the linear constraint "$\sum_k a_k=0$". It follows that $$\max\bigl\{f(a)\ \bigm|\ {\rm ``constraints''}\bigr\}=\max\left\{\sum_{i=1}^{n-1}\lambda_i \xi_i^2\ \biggm|\ \sum_{i=1}^{n-1} \xi_i^2=1\right\}=\max_{1\leq i\leq n-1}\lambda_i\ .$$ To compute the $\lambda_i$ note that the $n$ complex vectors $$c_j:=(1,\omega^j,\omega^{2j},\ldots,\omega^{(n-1)j})\qquad(0\leq j\leq n-1)\ ,$$ where $\omega:=e^{2\pi i/n}$, are linearly independent eigenvectors of $A$. The eigenvalues are now immediate: $Ac_j=\cos{2\pi j\over n}\,c_j$, and the largest eigenvalue apart from $\lambda_0=1$ is $\alpha:=\cos{2\pi\over n}$. This value $\alpha$ is the solution to our problem; it is $>0$ for all $n\geq5$.
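The eigen-decomposition claims are easy to verify numerically; a minimal pure-Python sketch for $n=6$ (the helper `A_times` is mine and applies the circulant $A$ without forming the matrix):

```python
import cmath
import math

def A_times(x):
    # (A x)_k = (x_{k-1} + x_{k+1}) / 2, indices mod n: the circulant A above
    n = len(x)
    return [(x[(k - 1) % n] + x[(k + 1) % n]) / 2 for k in range(n)]

n = 6
omega = cmath.exp(2j * cmath.pi / n)
lams = []
for j in range(n):
    c = [omega ** (k * j) for k in range(n)]   # c_j from the answer
    lam = math.cos(2 * math.pi * j / n)        # claimed eigenvalue
    assert all(abs(y - lam * x) < 1e-9 for y, x in zip(A_times(c), c))
    lams.append(lam)

# largest eigenvalue apart from lambda_0 = 1:
alpha = max(l for j, l in enumerate(lams) if j != 0)
assert abs(alpha - math.cos(2 * math.pi / n)) < 1e-9
```

For $n=6$ the spectrum is $1,\ \tfrac12,\ -\tfrac12,\ -1,\ -\tfrac12,\ \tfrac12$, so $\alpha=\tfrac12=\cos\frac{2\pi}6$ as claimed.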


EDIT: Here is an answer that uses a minimum amount of complex arithmetic.

Consider $$ H := \frac{1}{2} \begin{bmatrix} 0 & 1 & & & & & 1 \\ 1 & 0 & 1 & & & & \\ & 1 & 0 & 1 & & & \\ & & \ddots & \ddots & \ddots & & \\ & & & 1 & 0 & 1 & \\ & & & & 1 & 0 & 1 \\ 1 & & & & & 1 & 0 \end{bmatrix}. $$ Then the problem may be rewritten $$ \max_v \ v^T H v \quad \text{subject to} \ e^T v = 0, \ \|v\|_2 = 1, $$ where $e = (1, 1, \ldots, 1)$. The problem amounts to finding the largest eigenvalue of $H$ and one of its associated eigenvectors in the subspace $e^T v = 0$. It's easy to check that when $n > 2$, $v_1 := \frac{1}{\sqrt{n}} (1, 1, \ldots, 1)$ is always an eigenvector of $H$ associated to the eigenvalue $1$.

But $H$ is a circulant matrix and it is real and symmetric. When $n=2$, $H$ is $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ whose largest eigenvalue is $+1$. For any $n > 2$, properties of circulant matrices imply that the largest eigenvalue of $H$ is always $1$. So now the problem amounts to finding the second largest eigenvalue of $H$ and an associated eigenvector.

According to Gray the eigenvalues of a circulant matrix are the discrete Fourier transform (DFT) of its first row. The standard notation for the first row of a circulant matrix is $$ \begin{bmatrix} c_0 & c_{n-1} & c_{n-2} & \cdots & c_1 \end{bmatrix}. $$ Here $c_1 = c_{n-1} = 1/2$ and all others are zero. The DFT of this sequence is given by $$ \lambda_k = \sum_{j=0}^{n-1} c_j \, e^{-i j k 2\pi / n} = \tfrac{1}{2} \left( e^{-2\pi i k/n} + e^{2\pi i k/n} \right) = \cos(2 k \pi / n), \quad k = 0, \ldots, n-1, $$ i.e., a real sequence (as it must be, since $H$ is symmetric). The largest eigenvalue, obtained for $k=0$, is indeed $1$. The second largest is $\cos(2\pi/n)$.
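A short pure-Python check of this DFT computation (no FFT library needed; the helper `circulant_eigs` is my own):

```python
import cmath
import math

def circulant_eigs(c):
    # Eigenvalues of a circulant matrix with defining sequence c_0, ..., c_{n-1}:
    # the DFT  lambda_k = sum_j c_j * exp(-2*pi*i*j*k/n).
    n = len(c)
    return [sum(cj * cmath.exp(-2j * cmath.pi * j * k / n)
                for j, cj in enumerate(c))
            for k in range(n)]

n = 8
c = [0.0] * n
c[1] = c[n - 1] = 0.5                  # the defining sequence of H
eigs = circulant_eigs(c)
assert all(abs(l.imag) < 1e-9 for l in eigs)   # all real, as H is symmetric
assert all(abs(eigs[k].real - math.cos(2 * math.pi * k / n)) < 1e-9
           for k in range(n))
vals = sorted((l.real for l in eigs), reverse=True)
assert abs(vals[0] - 1.0) < 1e-9                         # largest: 1
assert abs(vals[1] - math.cos(2 * math.pi / n)) < 1e-9   # second largest
```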


Another partial answer. I think the key is to think geometrically about this. Your constraints describe the intersection of the unit sphere with the vector subspace orthogonal to $(1, 1, \ldots, 1)$.

For $n=2$, the problem is $$ \min \ -2xy \quad \text{subject to} \ x+y=0, \ x^2 + y^2 = 1. $$ Introducing Lagrange multipliers, the first-order optimality conditions can be written \begin{align*} -2y - \lambda_1 - 2 x \lambda_2 & = 0, \\ -2x - \lambda_1 - 2 y \lambda_2 & = 0, \\ x + y & = 0, \\ x^2 + y^2 & = 1. \end{align*} Adding the first two equations together and using $x+y=0$, we find $\lambda_1 = 0$. This means the linear constraint is inactive at the solution: the solution is already a stationary point of the objective restricted to the unit circle alone. Now we see that $\lambda_2 = 1$ and $x=-y=\pm 1 / \sqrt{2}$. The optimal objective value is $+1$ for this minimization problem, i.e. $-1$ if we return to the original maximization problem.

For $n = 3$, in the same way, we find $\lambda_1 = 0$ and $\lambda_2 = 1/2$. So, as you say, any $(x,y,z)$ satisfying the constraints gives the same objective value, namely $-1/2$.

I guess the point of the exercise is to show that the objective is constant along certain ellipses.

For general $n$, the optimality conditions are \begin{align*} -a_2 - a_n - \lambda_1 - 2 a_1 \lambda_2 & = 0, \\ -a_{i+1} - a_{i-1} - \lambda_1 - 2 a_i \lambda_2 & = 0, \\ -a_{n-1} - a_1 - \lambda_1 - 2 a_n \lambda_2 & = 0, \\ \sum_{i=1}^n a_i & = 0, \\ \sum_{i=1}^n a_i^2 & = 1. \end{align*} Adding together all $n$ stationarity equations and using $\sum_i a_i = 0$, we again find $\lambda_1 = 0$. After removing $\lambda_1$, multiplying the $i$th stationarity equation by $a_i$ and summing over $i$, I get $$ \lambda_2 = f(a), $$ where $f(a) = -\sum_{i=1}^{n-1} a_i a_{i+1} - a_n a_1$ is the objective being minimized; this matches $\lambda_2 = 1$ for $n=2$ and $\lambda_2 = 1/2$ for $n=3$ above. I'll be happy to complete my answer when I discover more but this may already put you on track.
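A numerical spot check of these conditions (pure Python; variable names mine): at the stationary point suggested by the spectral answers, one indeed finds $\lambda_1 = 0$ and $\lambda_2$ equal to the minimized objective value:

```python
import math

n = 7
# stationary point from the spectral answers: a_i = sqrt(2/n) * cos(2*pi*i/n)
a = [math.sqrt(2.0 / n) * math.cos(2 * math.pi * i / n) for i in range(n)]
assert abs(sum(a)) < 1e-9 and abs(sum(x * x for x in a) - 1) < 1e-9

f = -sum(a[i] * a[(i + 1) % n] for i in range(n))   # minimized objective f(a)
lam1, lam2 = 0.0, f                                 # candidate multipliers

# residuals of the stationarity equations -a_{i+1} - a_{i-1} - lam1 - 2 a_i lam2
for i in range(n):
    r = -a[(i + 1) % n] - a[(i - 1) % n] - lam1 - 2 * a[i] * lam2
    assert abs(r) < 1e-9

assert abs(f + math.cos(2 * math.pi / n)) < 1e-9    # here f(a) = -cos(2*pi/n)
```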


A partial solution, but needs more math than fits in a comment :-(

From an idea in Lohwater's "Inequalities":

We have, applying the AM-GM inequality ad nauseam: $$ \begin{align*} a_1^2 + a_2^2 &\ge 2 a_1 a_2 \\ a_2^2 + a_3^2 &\ge 2 a_2 a_3 \\ \vdots\\ a_{n - 1}^2 + a_n^2 &\ge 2 a_{n - 1} a_n \\ a_n^2 + a_1^2 &\ge 2 a_n a_1 \end{align*} $$ Adding up the whole mess and dividing by 2: $$ a_1^2 + a_2^2 + \ldots + a_n^2 \ge a_1 a_2 + a_2 a_3 + \ldots + a_n a_1 $$ This proves that the sum is at most 1.

Another idea to get a tighter bound: Let $n = 2 k$, where $k \ge 2$, and take $a_i = \frac{1}{\sqrt{n}}$ if $1 \le i \le k$, $a_i = - \frac{1}{\sqrt{n}}$ if $k < i \le 2 k$. Then the sums: $$ \begin{align*} a_1 a_2 + \ldots + a_{k - 1} a_k &= \frac{k - 1}{n} \\ a_{k + 1} a_{k + 2} + \ldots + a_{2 k - 1} a_{2 k} &= \frac{k - 1}{n} \\ a_{2 k} a_1 = a_k a_{k + 1} &= - \frac{1}{n} \end{align*} $$ Adding up gives $a_1 a_2 + \ldots + a_n a_1 = \frac{n - 4}{n}$.

The same idea works for any $n \ge 4$, with one adjustment when the two stretches have different lengths: take $k$ positive entries equal to $p$ and $n - k$ negative entries equal to $-q$, where $k p = (n - k) q$ keeps the sum zero and $k p^2 + (n - k) q^2 = 1$ normalizes. The two stretches contribute $(k - 1) p^2 + (n - k - 1) q^2$ and the two sign changes wrapping around contribute $-2 p q$, for a total of $1 - (p + q)^2 = 1 - \frac{n}{k (n - k)}$. With $k = \lfloor n/2 \rfloor$ this is positive for all $n \ge 5$, and for even $n$ it reduces to $\frac{n - 4}{n}$.
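To keep the sum zero when the two stretches have different lengths, the positive entries must be scaled to some $p$ and the negative ones to $-q$ with $kp = (n-k)q$; a pure-Python check of the resulting values (the helper `stretch_value` is my own bookkeeping):

```python
import math

def stretch_value(n, k):
    # k entries equal to p > 0 followed by n-k entries equal to -q < 0,
    # with k*p = (n-k)*q (zero sum) and k*p^2 + (n-k)*q^2 = 1 (unit norm).
    p = math.sqrt((n - k) / (k * n))
    q = math.sqrt(k / ((n - k) * n))
    a = [p] * k + [-q] * (n - k)
    assert abs(sum(a)) < 1e-12
    assert abs(sum(x * x for x in a) - 1) < 1e-12
    return sum(a[i] * a[(i + 1) % n] for i in range(n))

# even n with k = n/2 recovers (n - 4)/n
for n in (4, 6, 8, 10):
    assert abs(stretch_value(n, n // 2) - (n - 4) / n) < 1e-12

# general k: the value is 1 - n/(k*(n-k)); with k = n // 2 it is positive
# for n >= 5, but always below the true maximum cos(2*pi/n)
for n in range(5, 30):
    k = n // 2
    v = stretch_value(n, k)
    assert abs(v - (1 - n / (k * (n - k)))) < 1e-12
    assert 0 < v < math.cos(2 * math.pi / n)
```

So this sign-pattern construction certifies a positive lower bound of order $1 - 4/n$ for $n \ge 5$, while the true maximum $\cos\frac{2\pi}{n} = 1 - O(1/n^2)$ is larger.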