Complexity of computing $ABx$ depends on the order of multiplication

Solution 1:

I like this question. I've seen the solution to efficiently calculating a chain of matrix multiplications in an algorithms class, but it's not very enlightening. I've always meant to think more about the problem because associativity is interesting. So, let me try my hand at an explanation.

When you have an $N$ by $N$ matrix, it's a shorthand for describing a linear transformation from an $N$-dimensional vector space to itself. It does that by keeping track of where $N$ basis vectors get sent (these are the columns of your matrix).

Now, I'll just speak heuristically and it won't be very precise. I'll say that to write down a matrix, you just need to keep track of where $N$ linearly independent vectors get sent, not necessarily a basis. This is true, but all sorts of actual computational things depend on the exact basis (like actually writing down a vector, or doing standard calculations like $Bx$).

When you compute $AB$, you're keeping track of where $N$ linearly independent vectors get sent. But when you go on and use your freshly-computed $AB$ to compute $(AB)x$, you just care about where $1$ vector gets sent -- even though you went through all the trouble of keeping track of what $AB$ does to $N - 1$ other vectors!

Computing $A(Bx)$ is great, because at each step, it only pays attention to what's happening to the vectors you care about; it has no idea what happens to the other $N - 1$ linearly independent vectors, and so it only takes about one $N$th as much work as computing $(AB)x$.

While not exactly precise, I'd like to believe this is the idea behind why computing $A(Bx)$ is more efficient than $(AB)x$, which is what I think you are after.

Solution 2:

Yes you are correct, this is used a lot in engineering. Two popular cases:

  1. Knowing $AB$ and trying to find $A$ and $B$ which individually contain larger percent 0s than $AB$ multiplied together does. Usually called factoring or "fast transforms". If we lose enough non-zero numbers we can save calculations in this way.
  2. Precomputing $AB$ if we have individual $A$ and $B$ and want to calculate $ABv$ for lots of $v$:s. This should be done if $AB$ is simpler than $A$, $B$ separately.