Intuitive explanation of why $\dim\operatorname{Im} T + \dim\operatorname{Ker} T = \dim V$

linear-algebra

I'm having a hard time truly understanding the meaning of $\dim\operatorname{Im} T + \dim\operatorname{Ker} T = \dim V$ where $V$ is the domain of a linear transformation $T:V\to W$. I've used this equation several times in many problems, and I've gone over the proof and I believe that I fully understand it, but I don't understand the intuitive reasoning behind it. I'd appreciate an intuitive explanation of it.

Just to be clear, I do understand the equation itself, I am able to use it, and I know how to prove it; my question is what is the meaning of this equation from a linear algebra perspective.

I like to think of it as some form of conservation of dimension. If you have a linear mapping then it acts on each dimension of the domain (this is a consequence of linear mappings being completely determined by their action on any given basis of a space).

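There are only two possibilities for each dimension: either it is preserved or it is compressed (i.e. taken to $\mathbf{0}$). To make this precise, pick a basis of the domain that extends a basis of the kernel: the kernel vectors are sent to $\mathbf{0}$, while the images of the remaining basis vectors are linearly independent and span the image. The net dimension of the compressed portion of the domain is your nullity, i.e. the dimension of your kernel. The net dimension which is preserved is your rank, i.e. the dimension of your image space. This gives you an intuitive understanding of the rank-nullity theorem.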
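To see this bookkeeping on a concrete matrix, here is a minimal sketch in plain Python. The helper names (`rref`, `kernel_basis`, `apply`) and the matrix `A` are my own illustrative choices, not anything from the question. Row reduction over exact fractions sorts the columns into pivot columns (preserved directions, counted by the rank) and free columns (each one yields a vector that is sent to zero, counted by the nullity).

```python
from fractions import Fraction

def rref(rows):
    """Row-reduce with exact arithmetic; return (reduced matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot: column c is a free (compressed) direction
        m[r], m[p] = m[p], m[r]
        pivot_value = m[r][c]
        m[r] = [x / pivot_value for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def kernel_basis(rows):
    """One kernel vector per free column, read off from the reduced matrix."""
    m, pivots = rref(rows)
    n_cols = len(rows[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            v[p] = -m[row_idx][free]
        basis.append(v)
    return basis

def apply(rows, v):
    """Matrix-vector product."""
    return [sum(Fraction(a) * b for a, b in zip(row, v)) for row in rows]

# Hypothetical example: the second row is twice the first.
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]

_, pivots = rref(A)
rank = len(pivots)
kernel = kernel_basis(A)
nullity = len(kernel)
```

Every column is either a pivot column or a free column, which is exactly the "preserved or compressed" dichotomy; the code only makes the count explicit and confirms that the vectors built from free columns really are killed by `A`.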

As a note, if you take a minute and think deeply then you'll realize this argument is essentially the same as the projections that trb456 mentioned.

You can think about the Rank-Nullity Theorem geometrically in terms of things called fibers over points.

Think about the case when your linear mapping $f: U \to V$ is surjective, and consider the mapping $f^{-1}: V \to 2^U$ that takes each point $p \in V$ to its preimage $f^{-1}(p)$ (called the fiber over $p$) in $U$. You can easily check that the fibers are affine subspaces of $U$ parallel to each other, and that each point of $U$ lies on exactly one fiber. Also, the fiber passing through $0 \in U$ is exactly $\ker f$.

You can thus picture $U$ as being separated into an infinite number of thin layers, like a sedimentary rock.

From this you can easily see that to uniquely specify a point in $U$ you can first specify a fiber (the set of fibers being parameterized by $V = \operatorname{im} f$) and then specify a point on that fiber (each fiber is parameterized, non-uniquely, by $\ker f$). This gives you the Rank-Nullity Theorem: $$\dim \ker f + \dim \operatorname{im} f = \dim U.$$

For example, in the case of the mapping $f: \mathbb{R}^2 \to \mathbb{R},\; (x, y) \mapsto x + y$, the fibers satisfy an equation of the form $y = a - x$ for some $a \in \mathbb{R}$. You can check that here $y + x = a - x + x = a$ indeed does not depend on either $y$ or $x$.

Now, how many independent variables do you need to specify a point in $\mathbb{R}^2$? You need one variable ($a$) to specify a fiber (equivalently, a point on $\mathbb{R}$), and another one (say, $x$) to specify a point on the fiber - that's two degrees of freedom, as expected!
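The two degrees of freedom can be checked mechanically. The following is a small sketch for this specific $f(x, y) = x + y$; the function names `to_fiber_coords` and `from_fiber_coords` are made up for illustration. A point is encoded as the pair $(a, x)$, where $a$ names the fiber and $x$ locates the point on it, and then decoded again.

```python
def f(x, y):
    """The example map f(x, y) = x + y."""
    return x + y

def to_fiber_coords(x, y):
    """(x, y) -> (a, x): a = f(x, y) picks the fiber, x picks the point on it."""
    return f(x, y), x

def from_fiber_coords(a, x):
    """Inverse map: the point on the fiber y = a - x with first coordinate x."""
    return x, a - x

# A few sample points, chosen arbitrarily.
points = [(0, 0), (1, 2), (-3, 5), (4, -4)]
round_trips = [from_fiber_coords(*to_fiber_coords(x, y)) for x, y in points]
```

Since encoding followed by decoding returns every sample point unchanged, the pair $(a, x)$ carries the same information as $(x, y)$: one coordinate for the image, one for the kernel direction.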

The Rank-Nullity Theorem states that for any surjective linear mapping $f: U \to V$, in any dimension, you can use the same trick to uniquely parameterize any point in $U$. The same goes for any non-surjective linear mapping, of course; you just need to corestrict it to its image.

Alternatively, you could draw another line through $0 \in \mathbb{R}^2$ distinct from $\ker f$. You can easily show that it crosses each fiber of $f$ exactly once, so you can use it to parameterize fibers more explicitly: identify this line with $\operatorname{im} f$, then for any pair of points, one on $\ker f$ and one on this line, you can uniquely obtain the corresponding point of $\mathbb{R}^2$ using the parallelogram rule. Rank-Nullity states that you can do this sort of thing in any dimension and for any $f$ (instead of lines you'll have subspaces of different dimensions, though).
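Here is a minimal sketch of that parallelogram decomposition for $f(x, y) = x + y$. The second line is taken to be the $x$-axis, spanned by `D = (1, 0)`; that choice is my assumption, and any line through the origin other than $\ker f$ would work equally well. The names `decompose` and `recompose` are likewise illustrative.

```python
from fractions import Fraction

K = (1, -1)  # spans ker f for f(x, y) = x + y
D = (1, 0)   # assumed choice: a second line through 0 that is not ker f

def f(x, y):
    return x + y

def decompose(x, y):
    """Solve s*D + t*K = (x, y) for (s, t) using Cramer's rule."""
    det = D[0] * K[1] - D[1] * K[0]  # nonzero because D is not in ker f
    s = Fraction(x * K[1] - y * K[0], det)
    t = Fraction(D[0] * y - D[1] * x, det)
    return s, t

def recompose(s, t):
    """Parallelogram rule: add the point s*D on the line to the point t*K in ker f."""
    return s * D[0] + t * K[0], s * D[1] + t * K[1]

s, t = decompose(2, 3)
line_part = (s * D[0], s * D[1])
```

With this particular `D`, the coefficient `s` equals $f(x, y)$ itself, so the line really does play the role of $\operatorname{im} f$, while `t` records the position along the kernel direction.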

This is a geometric picture of what's going on.

Perhaps think of it in terms of projections? Whatever $T$ does not project into the image must disappear; i.e., it is in the kernel. This is why it is the dimension of the domain that matters. Restricted to a complement of the kernel, $T$ is injective, so the image has the same dimension as that complement in the domain. The image of the kernel is just zero, so it is the dimension of the kernel in the domain that matters.
