Simultaneous orthogonalization in Arnoldi iteration
Recently, I was studying Stewart's A Krylov-Schur Algorithm for Large Eigenproblems. After a survey, the so-called expansion phase is implemented, as described below:
For the sake of understanding the roles of the variables, let's take the following definition of Krylov-Schur decomposition:
A Krylov-Schur decomposition of order $k$ is a relation of the form $$A U_k = U_k S_k + u_{k+1}b_{k+1}^H$$
where the $S$ stresses the triangularity of the "Rayleigh quotient" $S_k$ (of order $k$). Moreover, the columns of $(U_k \hspace{0.2cm} u_{k+1})$ are linearly independent.
In particular, I don't understand what the array $U$ should be:
I think it can't simply be $U_{k+1} = (U_k \hspace{0.2cm} u_{k+1})$, because if it were, since the algorithm orthogonalizes the columns of $U_k$ at each step, $U_{k+1} = (U_k \hspace{0.2cm} u_{k+1})$ would have orthonormal columns, i.e., $(U_k \hspace{0.2cm} u_{k+1})^H(U_k \hspace{0.2cm} u_{k+1}) = I_{k+1}$. This seems wrong for the following reason:
In $1.$ we're taking $v = Au_{k+1}$: this means that the number of rows of $U$ must equal the order of $A$ (which is a fixed matrix of order $n$). Analogously, from line two we should deduce that $U^H$ must have $n$ columns (i.e., $U$ has $n$ rows), since $v$ is an $n \times 1$ vector.
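To make the dimensions concrete, here is a quick NumPy sketch (the sizes and variable names are mine, purely for illustration): if $U$ is $n \times k$, both steps type-check.

```python
import numpy as np

# Illustrative sizes only: A has order n, U holds k basis vectors.
rng = np.random.default_rng(0)
n, k = 6, 3
A = rng.standard_normal((n, n))                   # operator of order n
U, _ = np.linalg.qr(rng.standard_normal((n, k)))  # n x k, orthonormal columns
u = U[:, -1]                                      # latest basis vector

v = A @ u             # line 1: an n-vector, so U must have n rows
w = U.conj().T @ v    # line 2: a k-vector of projection coefficients

assert v.shape == (n,) and w.shape == (k,)
```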
Moreover, I tried to implement this pseudocode to analyze the initial steps; the problem is that I need to know $U$ at a generic step $k$. I could try $k = 0$ manually, for example, but then I would have to verify that $A U_0 = U_0 S_0 + u_{1}b_{1}^H$ holds for some $U_0, S_0, u_1, b_1$, and I don't know how to choose them (maybe it's better to start with $k=1$ as the base case, but the problem remains).
Are my arguments correct? Any help or thoughts would be appreciated in order to disambiguate the notation or correct wrong implications.
Edit: I think I see the problem now: $U$ should be $U_{k+1}$, but since $U_{k+1}$ is $n \times (k+1)$, even saying that this matrix is orthogonal is meaningless. Any other ideas or points of view would be appreciated.
Answering the question in the comments:
Okay, so $k=1$ would be the base case: $U_1$ is an arbitrary unit-length column vector. But I'm unable to say what $S_1$ and $b_2^H$ are, given this information.
The algorithm's primary goal is to construct an orthonormal basis $u_1, u_2, \dots, u_k$ of the Krylov subspace $\mathcal K_k$, which is defined as the span of the vectors $u_1, Au_1, \dots, A^{k-1} u_1$. Let's run through the algorithm.
- $v_{k} = A u_{k}$. It's the vector that we would like to add to $\mathcal K_{k}$ to obtain $\mathcal K_{k+1}$
- $w_k = U_k^H v_k$. This step projects $v_k$ onto $\mathcal K_{k}$: the entries of $w_k$ are the projection coefficients of $v_k$ onto each of the basis vectors $u_j$, $j=1,\dots,k$.
- $v_k' = v_k - U_kw_k$. The vector $v_k'$ is orthogonal to $\mathcal K_{k}$ (i.e., to every $u_j$, $j=1,\dots,k$).
- $\nu_k = \|v_k'\|$. Nothing particularly interesting besides $\nu_k \in \mathbb R$ and $\nu_k > 0$ (assuming the iteration doesn't break down, i.e., $v_k' \neq 0$).
- $u_{k+1} = \frac{v_k'}{\nu_k} = \frac{v_k'}{\|v_k'\|}$. Just a unit vector, orthogonal to each of $u_j$, $j=1,\dots,k$. The following holds: $$ A u_k = U_k w_k + u_{k+1} \nu_k $$
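The five steps above can be sketched in NumPy as a single expansion step (a sketch with my own names, not Stewart's code; this is plain classical Gram-Schmidt, and in practice one would reorthogonalize):

```python
import numpy as np

def expand(A, U):
    """One expansion step: given U with k orthonormal columns, return
    u_{k+1}, the coefficient vector w_k, and the norm nu_k."""
    v = A @ U[:, -1]                # v_k = A u_k
    w = U.conj().T @ v              # w_k = U_k^H v_k
    v_perp = v - U @ w              # orthogonal to every column of U_k
    nu = np.linalg.norm(v_perp)     # nu_k > 0 unless the iteration breaks down
    return v_perp / nu, w, nu       # u_{k+1} is a unit vector

# Check the relation A u_k = U_k w_k + u_{k+1} nu_k on random data.
rng = np.random.default_rng(1)
n, k = 7, 3
A = rng.standard_normal((n, n))
U, _ = np.linalg.qr(rng.standard_normal((n, k)))
u_next, w, nu = expand(A, U)
assert np.allclose(A @ U[:, -1], U @ w + nu * u_next)
assert np.allclose(U.conj().T @ u_next, 0)  # orthogonal to K_k
```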
On the previous iterations we had $$ A U_{k-1} = U_{k-1} S_{k-1} + u_k b_k^H = \begin{pmatrix}U_{k-1} & u_k \end{pmatrix} \begin{pmatrix} S_{k-1}\\ b_k^H \end{pmatrix} = U_k \begin{pmatrix} S_{k-1}\\ b_k^H \end{pmatrix} $$ Combining with the $A u_k = U_k w_k + u_{k+1} \nu_k$ equation we have $$ A U_k = A\begin{pmatrix}U_{k-1} & u_k \end{pmatrix} = U_k \begin{pmatrix} S_{k-1} & (w_k)_{1:k-1}\\ b_k^H & (w_k)_{k} \end{pmatrix} + u_{k+1} \begin{pmatrix}0 & \nu_k\end{pmatrix} $$ So we have $$ S_k = \begin{pmatrix} S_{k-1} & (w_k)_{1:k-1}\\ b_k^H & (w_k)_{k} \end{pmatrix}, \qquad b_{k+1}^H = \begin{pmatrix}0 & \nu_k\end{pmatrix} $$
The vector $b_{k+1}$ has a single nonzero element, at its end. This indicates that $S_k$ is not a triangular matrix here, but an upper Hessenberg matrix: the entries below its main diagonal are the norms $\nu_j$. And $b_{k+1}^H$ is nothing but a part of the $S_{k+1}$ matrix. In other words, the Krylov-Schur decomposition can be written as $$ A U_k = U_{k+1} \begin{pmatrix} S_k\\ 0 \dots 0 \; \nu_k \end{pmatrix}. $$ The last equation can be simplified if one defines a $(k+1) \times k$ matrix $H_{k+1}$ as $$ H_{k+1} = \begin{pmatrix} S_k\\ 0 \dots 0 \; \nu_k \end{pmatrix}. $$ The Krylov-Schur decomposition then becomes simply $AU_{k} = U_{k+1} H_{k+1}$. The $S_k$ matrix can be written as $\begin{pmatrix}H_{k} & w_k\end{pmatrix}$, so $$ H_{k+1} = \begin{pmatrix} H_k & w_k\\ 0 & \nu_k \end{pmatrix}. $$ Now it becomes clear that the algorithm you've cited uses $\hat S$ for $H_k$.
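The whole recursion can be checked numerically. The sketch below (variable names are my own) grows $H_{k+1} = \begin{pmatrix} H_k & w_k\\ 0 & \nu_k \end{pmatrix}$ column by column and verifies $AU_k = U_{k+1} H_{k+1}$ at the end:

```python
import numpy as np

rng = np.random.default_rng(2)
n, steps = 8, 4
A = rng.standard_normal((n, n))

u1 = rng.standard_normal(n)
U = (u1 / np.linalg.norm(u1)).reshape(n, 1)  # U_1: one unit column
H = np.zeros((1, 0))                         # H_1 is an empty 1 x 0 matrix

for _ in range(steps):
    v = A @ U[:, -1]                         # expansion step, as above
    w = U.conj().T @ v
    v_perp = v - U @ w
    nu = np.linalg.norm(v_perp)
    # H_{k+1} = [[H_k, w_k], [0, nu_k]]
    H = np.block([[H, w.reshape(-1, 1)],
                  [np.zeros((1, H.shape[1])), np.array([[nu]])]])
    U = np.column_stack([U, v_perp / nu])    # U_{k+1} = (U_k  u_{k+1})

# the decomposition in compact form: A U_k = U_{k+1} H_{k+1}
assert np.allclose(A @ U[:, :-1], U @ H)
```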
$H_1$ is an empty matrix and $H_2 = \begin{pmatrix} w_1\\ \nu_1 \end{pmatrix}$.
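So, for the base case you asked about, $S_1 = (w_1)$ is the $1 \times 1$ Rayleigh quotient $u_1^H A u_1$ and $b_2^H = (\nu_1)$, which follows from $S_k = \begin{pmatrix}H_k & w_k\end{pmatrix}$ and $b_{k+1}^H = \begin{pmatrix}0 \dots 0 & \nu_k\end{pmatrix}$ with $k=1$. A minimal numerical check (names assumed for illustration):

```python
import numpy as np

# Base case k = 1: U_1 is one arbitrary unit vector.
rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n))
u1 = rng.standard_normal(n)
u1 /= np.linalg.norm(u1)

v = A @ u1
w1 = u1.conj() @ v               # S_1 = (w_1): 1 x 1 Rayleigh quotient
v_perp = v - w1 * u1
nu1 = np.linalg.norm(v_perp)     # b_2^H = (nu_1)
u2 = v_perp / nu1

H2 = np.array([[w1], [nu1]])     # H_2 = [[w_1], [nu_1]]
U2 = np.column_stack([u1, u2])
assert np.allclose(A @ u1, (U2 @ H2).ravel())   # A U_1 = U_2 H_2
```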