Minimization of a positive definite quadratic function by gradient descent in at most $n$ steps
To minimize the positive definite quadratic form $$f(x) = \frac{1}{2}\left\langle Ax,x \right\rangle - \left\langle b,x\right\rangle \rightarrow \min_{x\in\mathbb{R}^n},$$ we use gradient descent $$x^{k+1} = x^{k} - \alpha_k \nabla f(x^k)$$ with step size $\alpha_k = \frac{1}{\lambda_{k+1}}$, where $\lambda_{k+1}$ is an eigenvalue of $A$ $(0 < \mu = \lambda_1 \leqslant \cdots \leqslant \lambda_n = L)$. I need to prove that $x^n = x^*$, where $Ax^* = b$.
My idea is to change to a basis in which $A$ becomes diagonal, $A = P^T \Lambda P$, where $\Lambda = \operatorname{diag}\{\lambda_1, \ldots, \lambda_n\}$ and $P$ is built from eigenvectors of $A$. I tried to express $x^* - x^0$ and $x^n - x^0$ as linear combinations of the basis vectors in order to compare them, but I didn't succeed. Could you please help me with this proof?
Solution 1:
You can assume that you have an orthonormal basis of eigenvectors so that $A$ is diagonal, with the eigenvalues $\lambda_k$ on the diagonal. Let me use $\Lambda$ for $A$ in this basis, just for emphasis.
Note that the solution is given by $x^*= \Lambda^{-1} b$.
Then $x_{k+1} = x_k -{1 \over \lambda_{k+1}} (\Lambda x_k -b) = (I-{1 \over \lambda_{k+1}} \Lambda) x_k + {1 \over \lambda_{k+1}} b$, for $k=0,...,n-1$.
Note that, componentwise, $[x_{k+1}]_i = \left(1 - \frac{\lambda_i}{\lambda_{k+1}}\right)[x_k]_i + \frac{b_i}{\lambda_{k+1}}$. For $i = k+1$ the first term vanishes, so $[x_{k+1}]_{k+1} = \frac{b_{k+1}}{\lambda_{k+1}} = [x^*]_{k+1}$. Moreover, if $[x_k]_i = [x^*]_i$, then, since $\lambda_i [x^*]_i = b_i$, the step leaves that entry unchanged: $[x_{k+1}]_i = [x^*]_i$. Hence, by induction, $[x_k]_i = [x^*]_i$ for $i = 1,\dots,k$.
Hence $x_n = x^*$.
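
To make the coordinate-by-coordinate argument concrete, here is a minimal NumPy sketch of the iteration in the eigenbasis; the variable names and the random test data are my own and only serve as an illustration, not as part of the proof.

```python
import numpy as np

# Sketch of the argument above: in the eigenbasis A = Lambda is diagonal,
# and step k+1 with alpha_k = 1/lambda_{k+1} fixes coordinate k+1.
rng = np.random.default_rng(0)
n = 5
lam = np.sort(rng.uniform(0.5, 5.0, size=n))   # 0 < lambda_1 <= ... <= lambda_n
b = rng.standard_normal(n)
x_star = b / lam                               # solution of Lambda x* = b

x = rng.standard_normal(n)                     # arbitrary starting point x^0
for k in range(n):
    grad = lam * x - b                         # grad f(x) = Lambda x - b
    x = x - grad / lam[k]                      # step size alpha_k = 1/lambda_{k+1}
    # after k+1 steps the first k+1 coordinates already agree with x*
    assert np.allclose(x[:k + 1], x_star[:k + 1])

assert np.allclose(x, x_star)                  # x^n = x*
print("converged to x* after", n, "steps")
```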
Solution 2:
I haven't checked this, but here is the sketch of the proof I think you should follow:
- Express the solution in the basis of eigenvectors of $A$.
- Show that the $(k+1)$-th step makes the $(k+1)$-th coordinate (in that basis) equal to the corresponding coordinate of the solution, while leaving the coordinates that already agree with the solution unchanged.
- Conclude that after $n$ steps there is no direction left to minimize along, hence the point is optimal.
I think you should state that $A$ is a positive definite matrix for this to work (here that is guaranteed by the assumption $0 < \mu = \lambda_1$).
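
Following this outline, here is a minimal NumPy sketch in the original (non-diagonal) basis, with a randomly generated positive definite $A$; the specific matrix construction and seed are only for illustration.

```python
import numpy as np

# Run gradient descent on f(x) = 0.5 <Ax, x> - <b, x> with step sizes
# 1/lambda_1, ..., 1/lambda_n and check that n steps reach x* = A^{-1} b.
rng = np.random.default_rng(1)
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)        # symmetric positive definite test matrix
b = rng.standard_normal(n)
x_star = np.linalg.solve(A, b)

lam = np.linalg.eigvalsh(A)        # eigenvalues in ascending order
x = np.zeros(n)                    # starting point x^0
for k in range(n):
    x = x - (A @ x - b) / lam[k]   # gradient step with alpha_k = 1/lambda_{k+1}

print(np.allclose(x, x_star))      # True: x^n = x*
```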