An eigenvector is a non-zero vector such that...

Various sources define eigenvalues and eigenvectors in slightly different ways, independently of context. For example, neither of the following definitions appears to exclude the zero vector as an eigenvector. In Lang:

Let $V$ be a vector space over a field $K$, and let $A:V\to V$ be an operator on $V$. An element $\mathbf{v}\in V$ is called an eigenvector of $A$ if there exists $\lambda\in K$ such that $A\mathbf{v}=\lambda\mathbf{v}$. If $\mathbf{v}\ne \mathbf{0}$, then...

and in Hoffman/Kunze:

Let $V$ be a vector space over a field $F$ and let $T$ be a linear operator on $V$. A characteristic value of $T$ is a scalar $c$ in $F$ such that there is a non-zero vector $\alpha$ in $V$ with $T\alpha =c\alpha$. If $c$ is a characteristic value of $T$, then (a) any $\alpha$ such that $T\alpha = c\alpha$ is called a characteristic vector of $T$ associated with the characteristic value $c$, and (b) the collection of all $\alpha$ such that $T\alpha = c\alpha$ is called the characteristic space associated with $c$.

More typically (perhaps), you would see the zero vector explicitly excluded, as in:

An eigenvector is a non-zero vector $\mathbf{x}$ such that $A\mathbf{x}=\lambda\mathbf{x}$ for some scalar $\lambda$. The scalar $\lambda$ is called an eigenvalue and $\mathbf{x}$ an associated eigenvector. The eigenspace corresponding to the eigenvalue $\lambda$ is the set of all associated eigenvectors along with the zero vector.
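
For concreteness, here is a minimal sketch (assuming Python with sympy is available) that exhibits the eigenvalues and eigenspace bases of a small matrix, and shows why the exclusion is needed: the zero vector satisfies the defining equation for *every* scalar, so admitting it would make every scalar an eigenvalue.

```python
from sympy import Matrix

A = Matrix([[2, 0],
            [0, 3]])

# eigenvects() returns triples (eigenvalue, multiplicity, eigenspace basis)
for lam, mult, basis in A.eigenvects():
    print(lam, basis)   # 2 [Matrix([[1], [0]])]  and  3 [Matrix([[0], [1]])]

# The zero vector satisfies A*v = lam*v for *every* scalar lam,
# which is why the definition of eigenvalue must exclude it.
zero = Matrix([0, 0])
print(A * zero == 2 * zero, A * zero == 7 * zero)   # True True
```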

I realize that the zero vector must be explicitly excluded when defining an eigenvalue, but once that definition is made, explicitly excluding the zero vector as an eigenvector and then explicitly including it again to form the eigenspace seems rather artificial. Is this simply a matter of trying to make the definitions more natural depending on their order? That is,

  • eigenvalue first (with the zero vector excluded there), then eigenvector -> zero vector ok
  • eigenvector first, then eigenvalue -> zero vector not ok

or is there anything incorrect about the definition given, for example, in Hoffman/Kunze above? If so, what sort of inconsistency would result from that definition? I'm not an algebraist, so something rather elementary (if possible) would be preferred.


Solution 1:

The zero vector satisfies $T\mathbf{0}=\lambda\mathbf{0}$ for every linear operator $T$ and every scalar $\lambda$: it treats any linear operator on its space the same way it treats scalar multiplication. But that isn't very interesting.

Put another way, suppose we are given any space $V$ over a field $F$. Then for any linear operator $T:V\to V$ and any scalar $\lambda\in F$, we define a linear transformation $S_{T,\lambda}:V\to V$ by $$S_{T,\lambda}(v)=T(v)-\lambda v.$$ We can confirm that this is always a linear operator. The kernel of $S_{T,\lambda}$ is always a subspace of $V$, and in particular clearly contains the zero vector. When such a kernel is non-trivial (has more than just the zero vector in it), we call it the eigenspace of $T$ corresponding to $\lambda$, we call $\lambda$ an eigenvalue of $T,$ and we call the kernel's elements the eigenvectors of $T$ corresponding to $\lambda$.
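
As a minimal sketch of this kernel construction (assuming Python with sympy; `eigenspace_basis` is a name chosen here, not a library function), one can compute $\ker(S_{T,\lambda})=\ker(T-\lambda I)$ directly and compare a trivial kernel with a non-trivial one:

```python
from sympy import Matrix, eye

T = Matrix([[2, 1],
            [0, 2]])

def eigenspace_basis(T, lam):
    """Basis of ker(T - lam*I); an empty list means the kernel is trivial."""
    return (T - lam * eye(T.rows)).nullspace()

# lam = 2 gives a non-trivial kernel, so 2 is an eigenvalue of T;
# lam = 5 gives only the zero vector, so 5 is not.
print(eigenspace_basis(T, 2))   # [Matrix([[1], [0]])]
print(eigenspace_basis(T, 5))   # []
```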

It's really only the non-trivial kernels of $S_{T,\lambda}$ (if any) that say anything "interesting" about $T$, so only those $\lambda$ corresponding to such non-trivial kernels "deserve" the title of eigenvalue of $T$, and only members of such non-trivial kernels "deserve" the title of eigenvector. In that sense, it isn't really a problem to say that the zero vector is an eigenvector of $T$, provided that there is some $\lambda$ such that the kernel of $S_{T,\lambda}$ is non-trivial. (There may be no such $\lambda$. Consider a rotation of the real plane about the origin by an angle that is not an integer multiple of $\pi$: it has no real eigenvalues or eigenvectors.)
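
To see the rotation example concretely (again a sketch assuming sympy): the eigenvalues of a plane rotation by $\theta$ are $e^{\pm i\theta}$, which are non-real unless $\theta$ is an integer multiple of $\pi$, so over $\mathbb{R}$ every kernel $\ker(R-\lambda I)$ is trivial.

```python
from sympy import Matrix, Rational, cos, sin, pi, eye

theta = pi / 3   # any angle that is not an integer multiple of pi
R = Matrix([[cos(theta), -sin(theta)],
            [sin(theta),  cos(theta)]])

print(R.eigenvals())   # {1/2 - sqrt(3)*I/2: 1, 1/2 + sqrt(3)*I/2: 1}: both non-real

# For any real lam, ker(R - lam*I) is trivial; e.g. lam = 1/2:
print((R - Rational(1, 2) * eye(2)).nullspace())   # []
```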

Solution 2:

As the question arose again recently (Can the zero vector be an eigenvector for a matrix?, Why an eigenspace is a linear subspace, if the zero vector is not an eigenvector?), let me try to list some situations where it could be awkward, or on the contrary convenient, to allow zero to be called an eigenvector (possibly qualified by "for $\lambda$", where $\lambda$ is any scalar), while of course taking care not to let this change the notion of eigenvalue, or of the eigenspace for an eigenvalue. (Allowing zero to be an eigenvector does make it somewhat more natural to also talk about eigenspaces for non-eigenvalues, which then of course are the zero subspace.)

Situations where calling $0$ an eigenvector is (somewhat) awkward, in that it requires an additional "nonzero":

  • The definition of eigenvalue for $T$ must mention "nonzero eigenvectors" (or be in terms of other things altogether, like the dimension of $\ker(T-\lambda I)$).

  • Given an eigenvector, talking about "the associated eigenvalue" requires excluding the zero vector case.

Situations where calling $0$ an eigenvector is (somewhat) convenient, allowing one for instance to avoid saying "element of $\ker(T-\lambda I)$" or "eigenvector for $\lambda$, or zero":

  • If (and only if) $T$ is diagonalisable with distinct eigenvalues $\lambda_1,\ldots,\lambda_k$, every vector can be (uniquely) written as the sum of certain eigenvectors for $\lambda_1,\ldots,\lambda_k$.

  • If for a polynomial $P=(X-a_1)\ldots(X-a_n)$, with $a_1,\ldots,a_n$ all distinct, one has $P[T]=0$, then $V$ is the (direct) sum of the eigenspaces of $T$ for $a_1,\ldots,a_n$. [The main point here is in fact being allowed to talk of the eigenspace for $a_i$ even if $a_i$ is not an eigenvalue, i.e. has only the zero eigenvector. Both points are illustrated in the sketch after this list.]
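
Here is a minimal sketch of both bullets (assuming Python with sympy): a diagonalisable $T$ with eigenvalues $2$ and $3$, a vector decomposed into eigenvector summands, and an annihilating polynomial $(X-2)(X-3)(X-4)$ with distinct roots, one of whose roots, $4$, is not an eigenvalue, its "eigenspace" being the zero subspace.

```python
from sympy import Matrix, eye

T = Matrix([[2, 1],
            [0, 3]])              # diagonalisable, eigenvalues 2 and 3

# First bullet: any vector is a (unique) sum of eigenvectors, one per eigenvalue.
P, D = T.diagonalize()            # columns of P are eigenvectors of T
v = Matrix([5, 7])
coords = P.solve(v)               # coordinates of v in the eigenvector basis
parts = [coords[i] * P[:, i] for i in range(2)]
print(sum(parts, Matrix([0, 0])) == v)            # True

# Second bullet: p(X) = (X-2)(X-3)(X-4) has distinct roots and annihilates T,
# but 4 is not an eigenvalue; its eigenspace is just the zero subspace.
print((T - 2*eye(2)) * (T - 3*eye(2)) * (T - 4*eye(2)) == Matrix.zeros(2, 2))  # True
print((T - 4*eye(2)).nullspace())                 # [] -- only the zero vector
```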


This is what comes to my mind right now; I will take suggestions for extending either list (or come back to extend them when I come across a typical situation).