Proving Distributivity of Matrix Multiplication

Solution 1:

The proof of this fact really isn't so terrible at all: $$ (A(B+C))_{ij} = \sum_{k=1}^n A_{ik}(B+C)_{kj} = \sum_{k=1}^n A_{ik}(B_{kj}+C_{kj}) = \sum_{k=1}^n A_{ik}B_{kj} + \sum_{k=1}^n A_{ik}C_{kj}\\ = (AB)_{ij} + (AC)_{ij} = (AB+AC)_{ij}. $$ However, I entirely agree that this proof isn't particularly enlightening. At the end of the day, the essential fact about matrix multiplication is this:
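
As a quick numerical sanity check of the entrywise identity (illustrative only, not a proof), here is a short sketch assuming NumPy is available; the matrix sizes and random data are arbitrary:

```python
import numpy as np

# Check (A(B+C))_{ij} = (AB)_{ij} + (AC)_{ij} entry by entry on random matrices.
rng = np.random.default_rng(0)
m, n, p = 3, 4, 5
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, p))
C = rng.standard_normal((n, p))

for i in range(m):
    for j in range(p):
        lhs = sum(A[i, k] * (B[k, j] + C[k, j]) for k in range(n))
        rhs = sum(A[i, k] * B[k, j] for k in range(n)) + \
              sum(A[i, k] * C[k, j] for k in range(n))
        assert np.isclose(lhs, rhs)
```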

For $A \in M_{m \times n}(F)$ an $m \times n$ matrix, let $L_A : F^n \to F^m$ be the linear transformation given by $L_A(x) := A x$ for $x \in F^n$ a column vector. Then for any $A \in M_{m \times n}(F)$ and $B \in M_{n \times p}(F)$, the composition $L_A \circ L_B : F^p \to F^m$ of the linear transformations $L_B : F^p \to F^n$ and $L_A : F^n \to F^m$ is given by $$L_A \circ L_B = L_{AB}.$$
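
To see this fact concretely (again just a sketch assuming NumPy; the shapes are arbitrary), composing the two maps and multiplying the matrices agree on a test vector:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 3, 4, 5
A = rng.standard_normal((m, n))   # L_A : F^n -> F^m
B = rng.standard_normal((n, p))   # L_B : F^p -> F^n

def L_A(x):
    return A @ x

def L_B(x):
    return B @ x

x = rng.standard_normal(p)
# (L_A o L_B)(x) should agree with L_{AB}(x) = (AB)x.
assert np.allclose(L_A(L_B(x)), (A @ B) @ x)
```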

So, under the identification of matrices with linear transformations, matrix multiplication is just composition of linear transformations. In particular, distributivity of matrix multiplication is really just distributivity of composition over addition of linear transformations, which lends itself to a far more transparent proof:

If $S : V_2 \to V_3$ and $T$, $U : V_1 \to V_2$ are linear, then for any $x \in V_1$, $$ (S \circ (T + U))(x) = S((T+U)(x)) = S(T(x)+U(x)) = S(T(x)) + S(U(x))\\ = (S \circ T)(x) + (S \circ U)(x) = (S \circ T + S \circ U)(x), $$ where the third equality uses the linearity of $S$; hence $S \circ (T+U) = S \circ T + S \circ U$.
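
The same computation can be checked numerically; in the sketch below (assuming NumPy, with $S$, $T$, $U$ represented by matrices in some fixed bases) both sides agree on a random vector:

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.standard_normal((4, 3))   # S : V_2 -> V_3
T = rng.standard_normal((3, 5))   # T : V_1 -> V_2
U = rng.standard_normal((3, 5))   # U : V_1 -> V_2

x = rng.standard_normal(5)
# (S o (T + U))(x) versus (S o T)(x) + (S o U)(x)
assert np.allclose(S @ ((T + U) @ x), S @ (T @ x) + S @ (U @ x))
```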

Solution 2:

We can think in terms of linear transformations rather than matrices.

Let $L_A$ be the linear transformation defined by $L_A(x) = Ax$. Let $L_B$ and $L_C$ be defined similarly. Since $L_A$ is linear, \begin{equation} L_A \circ (L_B + L_C) = L_A \circ L_B + L_A \circ L_C. \end{equation} It follows that $A(B + C) = AB + AC$.

In this proof, I'm assuming that "the matrix of the composition is the product of the matrices." (Matrix multiplication is defined so that this is true.) I'm also assuming that "the matrix of the sum is the sum of the matrices". And I'm assuming that if two linear transformations are equal, then their matrices (with respect to given bases) are equal.
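
To see the last assumption in action, here is a small sketch (NumPy assumed, illustrative only) that recovers the matrix of $L_A \circ (L_B + L_C)$ column by column from the images of the standard basis vectors and compares it with $AB + AC$:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((4, 5))

# The j-th column of the matrix of a linear map (in the standard bases)
# is the image of the j-th standard basis vector.
p = B.shape[1]
M = np.column_stack([A @ (B @ e + C @ e) for e in np.eye(p)])
assert np.allclose(M, A @ B + A @ C)
```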

Solution 3:

Instead of introducing lots of indices, you can reduce the verification to simpler cases that do not need them. Since the coefficient in row$~i$ and column$~j$ of a matrix product $A\cdot B$ depends only on row$~i$ of$~A$ and only on column$~j$ of$~B$, matrix products can be split up along the rows of their first factor, and along the columns of their second factor. To express this more formally, let $\def\ro{\operatorname{row}_i}\ro$ denote the operation of extracting row$~i$ from a matrix and $\def\co{\operatorname{col}_j}\co$ the operation of extracting column$~j$; then $$ \ro(A\cdot B)=\ro(A) \cdot B \qquad\text{and}\qquad \co(A\cdot B)=A \cdot \co(B). $$ Now a matrix equation holds if and only if it holds after applying $\ro$ to both sides for every row index$~i$, and likewise if and only if it holds after applying $\co$ to both sides for every column index$~j$.
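
Both extraction identities are easy to check concretely; in NumPy notation (assumed available for this sketch), $\ro$ and $\co$ are just slicing:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
i, j = 1, 2

# row_i(A.B) = row_i(A).B   and   col_j(A.B) = A.col_j(B)
assert np.allclose((A @ B)[i, :], A[i, :] @ B)
assert np.allclose((A @ B)[:, j], A @ B[:, j])
```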

Using this, the problem in the question is reduced to proving it in the special case where $A$ is a $1\times n$ matrix (it has one row), and $B,C$ are both $n\times 1$ matrices (they each have one column). The proof becomes straightforward: $\sum_{i=1}^na_i(b_i+c_i)=\sum_{i=1}^n(a_ib_i+a_ic_i)=\sum_{i=1}^na_ib_i+\sum_{i=1}^na_ic_i$. There is still one index, but this is hard to avoid, since you must use the definition of matrix multiplication somewhere, and that definition involves a summation.
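
In the reduced case the verification is just the scalar distributive law summed over $i$; a minimal sketch (NumPy, random data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
a = rng.standard_normal((1, n))   # A: a single row
b = rng.standard_normal((n, 1))   # B: a single column
c = rng.standard_normal((n, 1))   # C: a single column

# sum_i a_i (b_i + c_i) = sum_i a_i b_i + sum_i a_i c_i
assert np.allclose(a @ (b + c), a @ b + a @ c)
```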