Given a perturbation of a symmetric matrix, find an expansion for the eigenvalues

Let $A$ be a real, symmetric $n\times n$ matrix with $n$ distinct, non-zero eigenvalues, and let $V$ be a real, symmetric $n\times n$ matrix.

Consider $A_{\varepsilon}=A+\varepsilon V$, a perturbation of $A$, where $\varepsilon$ is a small number.

  1. Find an expansion as an $\varepsilon$-series for the eigenvalues and eigenvectors of $A_{\varepsilon}$ around the eigenvalues and eigenvectors of the original matrix, $A$.
  2. Assume that $\varepsilon$ is sufficiently small that one can neglect all terms of order $\varepsilon^2$ and higher in the expansion. How close are the eigenvalues of $A_{\varepsilon}$ to those of $A$? What about the eigenvectors?
  3. What can we say about the eigenvalues and eigenvectors of $A_{\varepsilon} $ if $A$ and $V$ are arbitrary $n\times n$ matrices? You may assume that $\det A \neq 0$.

Ideas: If $A$ and $V$ commute (and thus are simultaneously diagonalizable), then in the common eigenbasis the shift is just the linear term $\varepsilon v_{ii}$ (see the sketch below). I'm getting the feeling I really don't know what they want me to do with this problem. Any ideas?
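A quick numerical sanity check of the commuting case (a minimal sketch with numpy; the common eigenbasis and the eigenvalues below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build A and V that commute: both diagonal in the same orthonormal basis Q.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
lam = np.array([1.0, 2.0, 5.0, -3.0])     # eigenvalues of A
v_diag = np.array([0.3, -0.7, 1.1, 0.2])  # diagonal entries v_ii of V in that basis
A = Q @ np.diag(lam) @ Q.T
V = Q @ np.diag(v_diag) @ Q.T
eps = 1e-3

# In the common eigenbasis the shift is exactly linear: lambda_i + eps * v_ii.
exact = np.sort(np.linalg.eigvalsh(A + eps * V))
predicted = np.sort(lam + eps * v_diag)
print(np.allclose(exact, predicted))      # True
```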


Let's consider a particular eigenvector $v_{0}$ of $A$ with corresponding eigenvalue $\lambda_{0}$; namely, $Av_{0} = \lambda_{0}v_{0}$. Under the perturbation $A + \epsilon V$, the eigenvalue problem becomes $\left(A + \epsilon V\right)v = \lambda v$. We expand $v$ and $\lambda$ in powers of $\epsilon$ as
$$
v = v_{0} + v_{1}\epsilon + v_{2}\epsilon^{2} + \cdots\,,
\qquad
\lambda = \lambda_{0} + \lambda_{1}\epsilon + \lambda_{2}\epsilon^{2} + \cdots
$$
In order to derive the contributions to the eigenvector and eigenvalue $\underline{\mbox{up to order}\ \epsilon}$, it suffices to write
$$
\left(A + \epsilon V\right)\left(v_{0} + \epsilon v_{1}\right)
=
\left(\lambda_{0} + \lambda_{1}\epsilon\right)\left(v_{0} + \epsilon v_{1}\right)\,,
\quad
\left\vert
\begin{array}{rcl}
Av_{0} & = & \lambda_{0}v_{0}
\\
Av_{1} + Vv_{0} & = & \lambda_{0}v_{1} + \lambda_{1}v_{0}
\end{array}\right.
$$
where we equated terms corresponding to the same power of $\epsilon$ $\left(\underline{\mbox{up to order}\ \epsilon}\right)$.

Applying $v_{0}^{\sf T}$ on the left to both sides of the last expression:
$$
v_{0}^{\sf T}Av_{1} + v_{0}^{\sf T}Vv_{0}
=
\lambda_{0}v_{0}^{\sf T}v_{1} + \lambda_{1}v_{0}^{\sf T}v_{0}
$$
Since $A$ is a symmetric matrix, we have
$$
\color{#ff0000}{v_{0}^{\sf T}Av_{1}}
=
\left(v_{1}^{\sf T}Av_{0}\right)^{\sf T}
=
\left(v_{1}^{\sf T}\lambda_{0}v_{0}\right)^{\sf T}
=
\color{#ff0000}{\lambda_{0}v_{0}^{\sf T}v_{1}}
\quad\mbox{such that}\quad
\lambda_{1} = {v_{0}^{\sf T}Vv_{0} \over v_{0}^{\sf T}v_{0}}
$$

Then, $\left(\underline{\mbox{up to order}\ \epsilon}\right)$:
$$\color{#0000ff}{\large
\lambda \approx \lambda_{0} + {v_{0}^{\sf T}Vv_{0} \over v_{0}^{\sf T}v_{0}}\,\epsilon}
$$
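As a numerical illustration of this first-order formula (a sketch only; the random symmetric matrices and the perturbation size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Random symmetric A (almost surely with distinct eigenvalues) and symmetric perturbation V.
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
V = rng.standard_normal((n, n)); V = (V + V.T) / 2
eps = 1e-4

lam0, U = np.linalg.eigh(A)   # columns of U are orthonormal eigenvectors v_0, so v_0^T v_0 = 1
rayleigh = np.array([U[:, i] @ V @ U[:, i] for i in range(n)])   # v_0^T V v_0 for each eigenvector
first_order = lam0 + eps * rayleigh
exact = np.linalg.eigvalsh(A + eps * V)

print(np.max(np.abs(exact - first_order)))   # error is O(eps^2)
```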

This is a good complex analysis problem, even though everything is stated in terms of real symmetric matrices. Complex analysis gives you a good way to find objects by using contour integrals. For example, the matrix $A$ has an orthonormal basis of eigenvectors $\{ e_{j}\}_{j=1}^{n}$ with corresponding distinct eigenvalues $\sigma(A)=\{ \lambda_{j}\}$, which can be used to write the resolvent matrix
$$
(\lambda I -A)^{-1}x = \sum_{j=1}^{n}\frac{1}{\lambda-\lambda_{j}}(x,e_{j})e_{j}.
$$
The resolvent has simple poles at all of the eigenvalues, and the residue matrix $R_{j}$ is the orthogonal projection onto the eigenspace with eigenvalue $\lambda_{j}$:
$$
R_{j}x = (x,e_{j})e_{j}.
$$
Notice that, working in the norm of $\mathbb{C}^{n}$,
$$
\|(\lambda I -A)^{-1}x\|^{2}
= \sum_{j=1}^{n}\frac{|(x,e_{j})|^{2}}{|\lambda-\lambda_{j}|^{2}}
\le \frac{1}{\mbox{dist}(\lambda,\sigma(A))^{2}}\sum_{j=1}^{n}|(x,e_{j})|^{2}
= \frac{\|x\|^{2}}{\mbox{dist}(\lambda,\sigma(A))^{2}}.
$$
That is,
$$
\|(\lambda I -A)^{-1}\| \le 1/\mbox{dist}(\lambda,\sigma(A)).
$$
If you perturb $A$ by $\epsilon V$, where $V$ is also real and symmetric and $\epsilon$ is real, then $(\lambda I -A-\epsilon V)$ remains invertible provided $\epsilon$ is small enough. In particular,
$$
(\lambda I - A -\epsilon V)^{-1}=(\lambda I -A)^{-1}(I-\epsilon V(\lambda I-A)^{-1})^{-1}
= (\lambda I -A)^{-1}\sum_{k=0}^{\infty}\epsilon^{k}(V(\lambda I -A)^{-1})^{k}.
$$
The series converges in matrix operator norm to the expected inverse if
$$
|\epsilon|\|V\|/\mbox{dist}(\lambda,\sigma(A)) < 1,
\;\;\;\mbox{ i.e., }\;\;\;
|\epsilon| < \|V\|^{-1}\mbox{dist}(\lambda,\sigma(A)).
$$
By drawing a fixed system of circular contours $C_{j}$ around each $\lambda_{j}$, you can make $\epsilon$ small enough to guarantee that $(\lambda I - A-\epsilon V)^{-1}$ has all of its singularities inside these contours $C_{j}$, and to guarantee that the integrals of $(\lambda I -A-\epsilon V)^{-1}$ over the $C_{j}$ vary continuously with $\epsilon$ in a small range of $0$. So the eigenvalues of $A+\epsilon V$ remain in these circles for small $\epsilon$. Because the eigenspaces of $A$ all have dimension 1, the same is true of $A+\epsilon V$ for small enough $\epsilon$; if some of the eigenspaces had dimension 2 or more, there could be bifurcations in the eigenvalues.
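A small numerical check of this Neumann-series expansion of the perturbed resolvent (a sketch; the evaluation point $\lambda$, the matrices, and the truncation order are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
V = rng.standard_normal((n, n)); V = (V + V.T) / 2

lam = 10.0 + 0.5j                          # a point well away from sigma(A)
eps = 1e-2
R = np.linalg.inv(lam * np.eye(n) - A)     # unperturbed resolvent (lam I - A)^{-1}

# Truncated Neumann series: (lam I - A - eps V)^{-1} ~ R * sum_k (eps V R)^k
S = np.zeros((n, n), dtype=complex)
term = np.eye(n, dtype=complex)
for _ in range(6):
    S += term
    term = term @ (eps * V @ R)
approx = R @ S

exact = np.linalg.inv(lam * np.eye(n) - A - eps * V)
print(np.max(np.abs(exact - approx)))      # tiny, since |eps| ||V|| << dist(lam, sigma(A))
```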

You can argue that the dimensions of the eigenspaces remain 1 for small $\epsilon$ by applying the trace to the orthogonal projection matrices $P_{j}^{\epsilon}$ defined through the contour integrals: the trace is 1 at $\epsilon=0$, remains an integer, and must vary continuously for small $\epsilon$ because the following varies continuously with $\epsilon$:
$$
P_{j}^{\epsilon} =\frac{1}{2\pi i}\oint_{C_{j}}(\lambda I - A-\epsilon V)^{-1}d\lambda .
$$
Note: $P_{j}^{\epsilon}$ is the projection onto the span of all eigenvectors of $A+\epsilon V$ corresponding to eigenvalues enclosed by $C_{j}$. Also, $P_{j}^{\epsilon}e_{j}$ is non-zero for $\epsilon$ near $0$, and must be an eigenvector of $A+\epsilon V$ with the unique eigenvalue enclosed by $C_{j}$. I'm not sure what the normalization does to the eigenvector function $\epsilon\mapsto P_{j}^{\epsilon}e_{j}$, however.
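These projections can also be approximated numerically by discretizing the contour integral over a small circle around $\lambda_j$ (a sketch; the circle radius, number of quadrature points, and random matrices are ad-hoc choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
V = rng.standard_normal((n, n)); V = (V + V.T) / 2
eps = 1e-3

lam0, U = np.linalg.eigh(A)
j = 1                                                     # pick one eigenvalue lambda_j of A
r = 0.5 * np.min(np.abs(np.delete(lam0, j) - lam0[j]))    # radius: half the gap to the rest of sigma(A)

# P_j^eps = (1/(2 pi i)) oint_{C_j} (lam I - A - eps V)^{-1} d(lam),
# approximated by the trapezoidal rule on the circle lam = lambda_j + r e^{i theta}.
m = 200
P = np.zeros((n, n), dtype=complex)
for theta in 2 * np.pi * np.arange(m) / m:
    lam = lam0[j] + r * np.exp(1j * theta)
    dlam = 1j * r * np.exp(1j * theta) * (2 * np.pi / m)
    P += np.linalg.inv(lam * np.eye(n) - A - eps * V) * dlam
P /= 2j * np.pi

print(np.trace(P).real)           # ~ 1: the eigenspace enclosed by C_j is still one-dimensional
w = (P @ U[:, j]).real            # P_j^eps e_j, an (unnormalized) perturbed eigenvector
w /= np.linalg.norm(w)
mu = w @ (A + eps * V) @ w
print(np.linalg.norm((A + eps * V) @ w - mu * w))   # small residual: w is close to an eigenvector
```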

EXPANSION: I will add the explicit expansion offered by the above. The expansion of the resolvent $(\lambda I - A-\epsilon V)^{-1}$, which is valid for $|\epsilon|\|V\| < \mbox{dist}(\lambda,\sigma(A))$, has first-order $\epsilon$ term
$$
(\lambda I - A)^{-1} + \epsilon(\lambda I - A)^{-1}V(\lambda I - A)^{-1}.
$$
When applied to a vector $x$, this can be written as
$$
(\lambda I - A)^{-1}x + \epsilon\sum_{k=1}^{n}\sum_{l=1}^{n}
\frac{(x,e_{l})(Ve_{l},e_{k})e_{k}}{(\lambda-\lambda_{l})(\lambda-\lambda_{k})}.
$$
Now integrate $\frac{1}{2\pi i}\oint$ over a circle enclosing only $\lambda_{j}$ (which, for small $\epsilon$, also encloses $\lambda_{j}'$, the perturbed eigenvalue). The zeroth-order term integrates to the unperturbed projection $P_{j}x=(x,e_{j})e_{j}$. In the double sum, the term with $l=k=j$ contains $(\lambda-\lambda_{j})^{-2}$ and integrates to 0 because its residue is 0. The terms with exactly one index equal to $j$, such as $(\lambda-\lambda_{j})^{-1}(\lambda-\lambda_{k})^{-1}$ with $k\ne j$, integrate to $(\lambda_{j}-\lambda_{k})^{-1}$ because $(\lambda-\lambda_{k})^{-1}$ is holomorphic inside the circle around $\lambda_{j}$; terms with neither index equal to $j$ integrate to 0. One obtains
$$
P_{j}^{\epsilon}x = (x,e_{j})e_{j}
+\epsilon\sum_{k\ne j}\frac{(x,e_{j})(Ve_{j},e_{k})}{\lambda_j-\lambda_k}e_{k}
+\epsilon\sum_{l\ne j}\frac{(x,e_{l})(Ve_{l},e_{j})}{\lambda_j-\lambda_l}e_{j}.
$$
A good expansion of the perturbed eigenvector is, therefore, $P_j^{\epsilon} e_j$, for which the last sum vanishes since $(e_{j},e_{l})=0$ for $l\ne j$:
$$
e_{j}' = e_{j} +\epsilon \sum_{k=1\;(k\ne j)}^{n} \frac{(Ve_j,e_k)}{\lambda_j-\lambda_k}e_k.
$$
This needs to be normalized in length. Afterwards, the eigenvalue is $\lambda_j' = ((A+\epsilon V)e_{j}',e_{j}')$.
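A direct numerical check of this first-order eigenvector (and the resulting Rayleigh-quotient eigenvalue) is sketched below, again with arbitrary random symmetric matrices:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
V = rng.standard_normal((n, n)); V = (V + V.T) / 2
eps = 1e-4

lam0, U = np.linalg.eigh(A)       # U[:, k] = e_k, orthonormal eigenvectors of A
j = 2

# e_j' = e_j + eps * sum_{k != j} (V e_j, e_k) / (lambda_j - lambda_k) * e_k
ej_pert = U[:, j].copy()
for k in range(n):
    if k != j:
        ej_pert += eps * (U[:, k] @ V @ U[:, j]) / (lam0[j] - lam0[k]) * U[:, k]
ej_pert /= np.linalg.norm(ej_pert)               # normalize in length
lam_pert = ej_pert @ (A + eps * V) @ ej_pert     # lambda_j' = ((A + eps V) e_j', e_j')

lam_exact, U_exact = np.linalg.eigh(A + eps * V)
print(abs(lam_pert - lam_exact[j]))              # agrees beyond first order in eps
print(1.0 - abs(ej_pert @ U_exact[:, j]))        # the vectors are nearly parallel
```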