Intuition & Proof of rank(AB) $\le$ min{rank(A), rank(B)} (without inverses or maps) [Poole P217 3.6.59, 60]

I'm aware of analogous threads; I hope that mine is specific enough not to be considered a duplicate.

$\mathbf{a}^i$ is a row vector; $A, B$ are matrices of compatible sizes. Prove:
$1.$ $\mathbf{a}^iB$ is a linear combination of the rows of $B$.
$2.$ Row space of $AB \subseteq$ row space of $B$. $\qquad$ $3.$ Column space of $AB \subseteq$ column space of $A$.
$4.$ If $\mathbf{a}_i$ is a column vector, then $A\mathbf{a}_i$ is a linear combination of the columns of $A$.
$5.$ $\operatorname{rank}(A\color{#B8860B}{B}) \color{#B8860B}{\le} \operatorname{rank}\color{#B8860B}{B} \qquad \qquad$ $6.$ $\operatorname{rank}(AB) \leq \operatorname{rank} A$.
In general, $x \leq a \text{ & } x \le b \implies x \le \min\{a, b\}$.
So by $5 \, \& \, 6$, $\operatorname{rank}(AB) \leq \min\{\operatorname{rank}A,\operatorname{rank} B\}$.
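
As a quick sanity check (not a proof), the inequality can be tested numerically. Below is a minimal sketch assuming Python with numpy; the shapes and the thin-factor construction are arbitrary choices of mine.

```python
# Sanity check of #5 and #6: rank(AB) <= min(rank A, rank B)
# on randomly generated low-rank matrices.
import numpy as np

rng = np.random.default_rng(0)

for _ in range(200):
    # Thin factorizations force rank(A) <= 2 and rank(B) <= 3.
    A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))
    B = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 6))
    r_A, r_B = np.linalg.matrix_rank(A), np.linalg.matrix_rank(B)
    assert np.linalg.matrix_rank(A @ B) <= min(r_A, r_B)

print("rank(AB) <= min(rank A, rank B) held in every trial")
```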

$\bbox[2px,border:2px solid grey]{\text{ Proof of #5 :}} \;$ The rank of a matrix is the dimension of its row space. Need to show:
if $\operatorname{rowsp}(AB) \subseteq\operatorname{rowsp}(B)$, then $\dim\operatorname{rowsp}(AB) \le \dim\operatorname{rowsp}(B).$
Pick a basis for $\operatorname{rowsp}(AB)$. Say there are $p$ vectors in this basis.
By $\#2$, row space of $AB \subseteq$ row space of $B$, $\color{green}{\text{so all of these $p$ vectors also $\in \operatorname{rowsp}(B)$}}$. Moreover, being basis vectors, they are linearly independent (hereafter dubbed l-ind).
${\Large{\color{red}{[}}} \;$ Since the dimension of a space $=$ the maximum number of l-ind vectors in that space, $\; {\Large{{\color{red}{]}}}}$
and $\color{green}{\text{$\operatorname{rowsp}(B)$ has $\ge p$ l-ind vectors}}$, thus $\dim\operatorname{rowsp}(B) \;\ge\; p = \dim\operatorname{rowsp}(AB).$
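
The containment in $\#2$ that drives this proof can be illustrated numerically (a sketch assuming numpy; the sizes are arbitrary): adjoining any row of $AB$ to $B$ never increases the rank, precisely because that row already lies in $\operatorname{rowsp}(B)$.

```python
# Illustration of #2: every row of AB lies in rowsp(B), so stacking it
# onto B adds no new direction and the rank is unchanged.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
AB = A @ B

rank_B = np.linalg.matrix_rank(B)
for row in AB:
    stacked = np.vstack([B, row])                    # adjoin one row of AB
    assert np.linalg.matrix_rank(stacked) == rank_B  # no rank increase

print("every row of AB is a linear combination of the rows of B")
```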

$\bbox[2px,border:2px solid grey]{\text{ Proof of #6 :}} \;$ Apply $ \operatorname{rank}M = \operatorname{rank}M^T$ and $\#5$: $ \operatorname{rank}(AB) = \operatorname{rank}\left((AB)^T\right) = \operatorname{rank}(B^T\color{#B8860B}{A^T}) \quad \color{#B8860B}{\le} \quad \operatorname{rank}\color{#B8860B}{A^T} = \operatorname{rank}(A)$.
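
Both ingredients of this proof ($\operatorname{rank}M = \operatorname{rank}M^T$ and $(AB)^T = B^TA^T$) can be spot-checked numerically as well; again a sketch assuming numpy, with arbitrary shapes.

```python
# Spot check of the transpose trick behind #6.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3)) @ rng.standard_normal((3, 5))  # rank <= 3
B = rng.standard_normal((5, 6))

assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T)  # rank M = rank M^T
assert np.allclose((A @ B).T, B.T @ A.T)                       # (AB)^T = B^T A^T
assert np.linalg.matrix_rank(A @ B) <= np.linalg.matrix_rank(A)
print("rank(AB) <= rank(A) via the transpose trick")
```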

$Q1.$ Could you please elucidate the above proof of $\#5$? I'm bewildered: what's the strategy?

$Q2.$ On P209, Poole defines dimension as the number of vectors in a basis.
So shouldn't the red bracket refer to a basis? If so, why doesn't the proof simply declare:
By $\#2$, the basis for $\operatorname{rowsp}(AB)$ can be reused as a basis for $\operatorname{rowsp}(B)$?

$Q3.$ How would one foresee transposing $AB$ and applying $\#5$ (the key stratagem) in the proof of $\#6$?

$Q4.$ What's the intuition behind results $5$ and $6$? I'd be grateful for pictures.

Sources: Schaum's Outline of Linear Algebra, P147, 4.48; web.mit.edu/18.06/www/Spring01/Sol-S01-5.ps


The following results are facts of linear algebra:

  1. Given a basis, every vector in the vector space is uniquely expressible as a linear combination of the basis elements.
  2. Every linearly independent set of vectors can be extended to a basis. (Mathematically: if $\mathrm{X}_0$ is linearly independent, then there exists a basis $\mathrm{X}$ with $\mathrm{X}_0 \subset \mathrm{X}$.)
  3. Every set of generating vectors can be "trimmed" to obtain a basis. (Mathematically, if $\mathrm{X}$ generates a vector space, in the sense that the span of $\mathrm{X}$ is the vector space, then there exists some $\mathrm{X}_0 \subset \mathrm{X}$ that is a basis of the vector space.)
  4. If a basis of a vector space has finitely many elements, then all bases have finitely many elements, and the number of elements in any two bases coincides. This common number is therefore known as the "dimension" of the vector space.
  5. A linear transformation between vector spaces is a function that respects the linear structure. (Mathematically: if $T:\mathrm{V} \to \mathrm{W}$ is a linear transformation, then $T(v_1 + a v_2) = T(v_1) + a T(v_2);$ the sum inside $T$ is taken in $\mathrm{V}$, while the sum on the right-hand side is taken in $\mathrm{W}.$)
  6. The composition of two linear functions is again a linear function.
  7. Fixing (assumed finite) bases in the domain and codomain of a linear function, there exists a unique representation of that linear function as a matrix (relative to the chosen bases). That is, if $\mathrm{X}$ and $\mathrm{X}'$ are bases of $\mathrm{V}$ and $\mathrm{V}'$, there exists a unique such matrix for $T.$ To find it, note that $T(x_i)$ lies in $\mathrm{V}'$ and is uniquely expandable in terms of the basis $\mathrm{X}'$ as follows: $$ T(x_i) = \sum_j b_{i,j} x_j' $$ Next, every vector $v \in \mathrm{V}$ is uniquely expressible as $v = \sum_i a_i x_i,$ and therefore linearity of $T$ shows that $T(v)$ is uniquely expressible as $$ T(v) = \sum_i a_i T(x_i) = \sum_{i,j} a_i b_{i,j} x_j' $$
  8. It is common practice to use "coordinates" given a basis $\mathrm{X}$: instead of writing $v = \sum_i a_i x_i,$ we write $[v]_\mathrm{X} = [a_1, \ldots, a_p]^\intercal$ (say $\mathrm{V}$ has dimension $p$, so $\mathrm{X}$ has $p$ elements).
  9. Using coordinates, it can be shown (just expand out!), writing $[T]_{\mathrm{X}}^{\mathrm{X}'} = (b_{i,j}),$ that $$ [T(v)]_{\mathrm{X}'} = [T]_{\mathrm{X}}^{\mathrm{X}'} [v]_\mathrm{X} $$ In plain English, multiplying by the matrix respects the coordinate assignment relative to the chosen bases of the domain and codomain (see the numerical sketch after this list).
  10. By virtue of the foregoing, it can be seen similarly that $[ST]_{\mathrm{X}}^{\mathrm{X}''} = [S]_{\mathrm{X}'}^{\mathrm{X}''}[T]_{\mathrm{X}}^{\mathrm{X}'}$ for linear transformations $T:{\mathrm{V}} \to {\mathrm{V}'}$ and $S:{\mathrm{V}'} \to {\mathrm{V}''}$, the vector spaces $\mathrm{V}, \mathrm{V}'$ and $\mathrm{V}''$ having respective bases $\mathrm{X}, \mathrm{X}'$ and $\mathrm{X}''.$
  11. A linear transformation can never create linear independence; at most it can preserve it. That is, if $\mathrm{K}$ is a linearly independent set with $k$ vectors, then $T(\mathrm{K})$ has at most $k$ vectors, and these need not be linearly independent.
  12. The image of a linear transformation is a vector subspace of the codomain.
  13. The rank of a linear transformation is, by definition, the dimension of its image. It can be shown that if $A$ is a matrix representing a linear transformation (for some bases chosen ad hoc), then the rank of $A$ (which is defined to be the dimension of its column space) coincides with the rank of the linear transformation.
  14. Finally, by virtue of the foregoing, asking about the rank of matrices is the same as asking about the rank of linear transformations. If $\mathrm{V}$ has dimension $p$ and $T$ is linear, then $T(\mathrm{V})$ has dimension $\leq p$ (by 11); but since $T(\mathrm{V}) \subset \mathrm{V}',$ $T(\mathrm{V})$ also has dimension $\leq p'.$ Thus the rank of $T$ is $\leq \min(p, p').$ Next, $ST(\mathrm{V}) = S(T(\mathrm{V})),$ and by 11 the image under $S$ of $T(\mathrm{V})$ has dimension $\leq \dim T(\mathrm{V}) = \operatorname{rank} T$; since $S(T(\mathrm{V}))$ also sits inside the image of $S,$ its dimension is $\leq \operatorname{rank} S$ as well. Hence $\operatorname{rank}(ST) \leq \min(\operatorname{rank} S, \operatorname{rank} T),$ which answers your question.
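
To make items 7-10 concrete, here is a minimal numerical sketch (assuming Python with numpy; the bases and maps are arbitrary choices of mine, and I take the $i$-th column of $[T]_{\mathrm{X}}^{\mathrm{X}'}$ to be the $\mathrm{X}'$-coordinate vector of $T(x_i)$):

```python
# A minimal sketch of items 7-10: coordinates turn a linear map into a
# matrix, and composition into a matrix product. Assumes numpy only.
import numpy as np

rng = np.random.default_rng(3)

# T and S as linear maps given in *standard* coordinates.
T_std = rng.standard_normal((3, 2))   # T : V (dim 2) -> V' (dim 3)
S_std = rng.standard_normal((2, 3))   # S : V' (dim 3) -> V'' (dim 2)

# Bases X, X', X'' stored column-wise (random square matrices are
# generically invertible, hence bases).
X   = np.array([[1.0, 1.0], [0.0, 1.0]])
Xp  = rng.standard_normal((3, 3))
Xpp = rng.standard_normal((2, 2))

def coords(basis, v):
    """Coordinate vector [v]_basis, i.e. solve basis @ c = v."""
    return np.linalg.solve(basis, v)

# [T]_X^X': the i-th column is [T(x_i)]_{X'}; likewise for S.
T_mat = np.linalg.solve(Xp, T_std @ X)
S_mat = np.linalg.solve(Xpp, S_std @ Xp)

v = rng.standard_normal(2)
# Item 9: [T(v)]_{X'} = [T]_X^{X'} [v]_X
assert np.allclose(coords(Xp, T_std @ v), T_mat @ coords(X, v))
# Item 10: [ST]_X^{X''} = [S]_{X'}^{X''} [T]_X^{X'}
assert np.allclose(np.linalg.solve(Xpp, S_std @ T_std @ X), S_mat @ T_mat)
print("coordinate identities verified")
```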

There is a lot of baggage in linear algebra, but otherwise it is straightforward. Another very important observation: generally you want to understand problems using linear transformations, as they have a far richer language and structure, but you want to do applications using matrices. So, things related to spaces, projections, decompositions, etc. are far better understood via linear transformations, while applications are more usefully carried out with matrices.