What is an interpretation of the matrix exponential?

I just read about the existence of the "matrix exponential"

$$e^X := \sum_{k = 0}^\infty\frac1{k!}X^k$$

Is there a simple way to interpret this? I understand the analogy with the real exponential, which is given by the same infinite Taylor series. However, I have no easy way of interpreting it in the case of a matrix.

I've read that it relates to linear ODEs.


Solution 1:

Consider a complex diagonalizable $n \times n$ matrix. If $X = A D A^{-1}$ where $A$ is invertible and $D$ is diagonal, then it's easy to see that $$e^X = A e^D A^{-1}$$ and $$e^D = \mathrm{diag}(e^{d_{11}}, \ldots, e^{d_{nn}})$$

Thus, for diagonalizable matrices, it corresponds to exponentiating each eigenvalue.
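As a quick numerical sanity check of the diagonalizable case, here is a minimal sketch in plain Python. The matrix $X$, the helper names `matmul` and `expm_series`, and the truncation at 40 terms are all assumptions chosen for illustration; the series is simply cut off, which is fine for a small matrix like this but is not how a numerical library would compute it.

```python
import math

def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(X, terms=40):
    """Approximate e^X by truncating the power series sum_k X^k / k!."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # k = 0 term
    term = [row[:] for row in result]
    for k in range(1, terms):
        # Turn X^(k-1)/(k-1)! into X^k/k! and add it to the running sum.
        term = [[entry / k for entry in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Illustrative matrix with eigenvalues 1 and 3,
# eigenvectors (1, 1) and (1, -1), so X = A D A^{-1}.
X = [[2.0, -1.0], [-1.0, 2.0]]
A = [[1.0, 1.0], [1.0, -1.0]]
A_inv = [[0.5, 0.5], [0.5, -0.5]]
exp_D = [[math.exp(1.0), 0.0], [0.0, math.exp(3.0)]]  # exponentiate each eigenvalue

via_eigen = matmul(matmul(A, exp_D), A_inv)
via_series = expm_series(X)
max_diff = max(abs(via_eigen[i][j] - via_series[i][j])
               for i in range(2) for j in range(2))
print(max_diff)  # should be at the level of floating-point rounding
```

Both routes give the same matrix up to rounding: summing the series directly, or exponentiating the eigenvalues and changing basis back.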

There is also a general interpretation but it is less intuitive. Every complex $n \times n$ matrix can be written in Jordan canonical form. Since a matrix in Jordan canonical form is block diagonal with blocks of the form $$J(\lambda) = \begin{pmatrix}\lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}$$ where $\lambda$ is an eigenvalue of $X$, $e^X$ is similar to a block diagonal matrix consisting of the blocks $$e^{J(\lambda)} = \begin{pmatrix}e^\lambda & \frac{e^\lambda}{1!} & \frac{e^\lambda}{2!} & \cdots & \frac{e^\lambda}{(k - 2)!} & \frac{e^\lambda}{(k - 1)!} \\ & e^\lambda & \frac{e^\lambda}{1!} & \frac{e^\lambda}{2!} & \cdots & \frac{e^\lambda}{(k - 2)!} \\ & & \ddots & \ddots & \ddots & \vdots \\ & & & e^{\lambda} & \frac{e^{\lambda}}{1!} & \frac{e^{\lambda}}{2!} \\ & & & & e^{\lambda} & \frac{e^\lambda}{1!} \\ & & & & & e^{\lambda}\end{pmatrix}$$ where $J(\lambda)$ is $k \times k$.

Thus, in general, exponentiating a matrix corresponds to exponentiating each of its Jordan blocks.
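The smallest non-trivial case can be worked out by hand. Take a $2 \times 2$ Jordan block and split it as $$J(\lambda) = \begin{pmatrix}\lambda & 1 \\ 0 & \lambda\end{pmatrix} = \lambda I + N, \qquad N = \begin{pmatrix}0 & 1 \\ 0 & 0\end{pmatrix}, \qquad N^2 = 0.$$ Since $\lambda I$ and $N$ commute, the exponential factors, and the series for $e^N$ stops after two terms: $$e^{J(\lambda)} = e^{\lambda I} e^{N} = e^\lambda (I + N) = \begin{pmatrix}e^\lambda & e^\lambda \\ 0 & e^\lambda\end{pmatrix},$$ which is the general formula above with $k = 2$.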

In fact, this interpretation also holds for any analytic function $f$ applied to a matrix and not just $e^X$. In general, $f(J(\lambda))$ involves the derivatives of $f$. See this question and Wikipedia for more details.

Solution 2:

In ODEs, it's quite simple. For a single ODE $$y' = ay$$ the solution is $y(t) = y(0)e^{at}$. We can actually define the exponential function this way (as the solution to an ODE). If we let $\vec{y}(t) = (y_1(t),\dots,y_n(t))$, then the solution of the differential equation $$\vec{y}' = A\vec{y}$$ where $A$ is now a matrix, is given by $$\vec{y}(t) = e^{At}\vec{y}(0)$$ So, it may be useful to think of the matrix exponential as the "Solution to the System of ODEs".
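To see why this works, one can differentiate the defining series term by term (a sketch that takes for granted that termwise differentiation is allowed, which holds because the series converges absolutely for every $t$): $$\frac{d}{dt}e^{At} = \frac{d}{dt}\sum_{k = 0}^\infty \frac{t^k}{k!}A^k = \sum_{k = 1}^\infty \frac{t^{k-1}}{(k-1)!}A^k = A\sum_{k = 0}^\infty \frac{t^k}{k!}A^k = Ae^{At}.$$ Hence $e^{At}y(0)$ has derivative $Ae^{At}y(0)$, and at $t = 0$ it equals $e^{0}y(0) = y(0)$, so it satisfies both the equation and the initial condition.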

It's also used in Lie theory, as the connection between a Lie algebra and its Lie group, but if you don't already know what this is, it's probably not useful for building intuition.

Solution 3:

Think about $\exp$ as a function which translates a (relative) infinitesimal additive change into a finite multiplicative change after one unit of time (this view comes from Lie groups/algebras). Sounds strange, and the connection between the infinitesimal change and the resulting change is not always obvious:

  • If you add nothing, your infinitesimal additive change is zero ($x\mapsto x+0x$). The corresponding finite multiplicative change is $\exp 0=1$.
  • If your infinitesimal change is a multiple of the current value, i.e. $x\mapsto x+\lambda x$, then your finite change after one unit of time is $\exp\lambda$.

This can be applied to matrices (and even more general constructs):

  • If your infinitesimal change is just adding a scaled version of the current vector, i.e. $x\mapsto x+(\lambda\Bbb I)x$ (where $\Bbb I$ is the identity matrix), then after one unit of time this corresponds to scaling $x$ by a factor $e^\lambda$, which corresponds to the map $x\mapsto(e^\lambda\Bbb I)x$. You see $\exp(\lambda\Bbb I)=e^\lambda \Bbb I$.
  • If your infinitesimal additive change does not point in the direction of the current vector $x$, but perpendicular to it, i.e. $x\mapsto x+\lambda Rx$ with $$R:=\begin{pmatrix}0&-1\\1&0\end{pmatrix},$$ then the corresponding multiplicative change is a rotation with angle $\lambda$ (in radians). If we express the rotation matrices by $R_\theta$, then we see $\exp(\lambda R)=R_{\lambda}$.
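The rotation example can be checked numerically by compounding many small additive steps. The sketch below is an illustration under stated assumptions: the helper names `matmul` and `compound`, the angle $\lambda = \pi/3$, and the step count are all chosen here, and the small step is taken to be $x\mapsto x+\frac{\lambda}{n}Rx$ applied $n$ times, i.e. $(\Bbb I+\frac{\lambda}{n}R)^n$, which approaches $\exp(\lambda R)$ as $n$ grows.

```python
import math

def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def compound(G, steps):
    """Apply the small step x -> x + (1/steps) G x `steps` times: (I + G/steps)^steps."""
    n = len(G)
    step = [[float(i == j) + G[i][j] / steps for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = matmul(result, step)
    return result

lam = math.pi / 3                      # illustrative angle (60 degrees)
R = [[0.0, -1.0], [1.0, 0.0]]          # infinitesimal change perpendicular to x
G = [[lam * entry for entry in row] for row in R]

approx = compound(G, 10_000)
rotation = [[math.cos(lam), -math.sin(lam)],
            [math.sin(lam), math.cos(lam)]]
error = max(abs(approx[i][j] - rotation[i][j])
            for i in range(2) for j in range(2))
print(error)  # small, and it shrinks as the number of steps grows
```

Each individual step only nudges the vector sideways, yet the accumulated effect is a rotation by $\lambda$ radians, up to a discretization error that vanishes as the steps get smaller.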

All this needs some intuition, and quite a lot of it can be derived from differential equations as described by @Mark.

Solution 4:

Just one special case.

Imagine a 3x3 rotation matrix or a 4x4 homogeneous transformation matrix (rotation plus translation). It rotates a point around an axis in 3D and perhaps also moves (translates) it. These matrices are used in 3D operations, engineering, graphics, ballistics, etc. If the component of the translation along the axis is zero, the point rotates around the axis within a plane, and its path over all possible rotation angles forms a circle. In this case the matrix exponential is very easy to comprehend.

Write the original matrix as R = exp(G), where R describes a rotation around axis a by an angle x and G is its generator (a matrix logarithm of R). The axis a is the eigenvector of G with eigenvalue zero (equivalently, the eigenvector of R with eigenvalue 1).

exp(2G) describes a rotation of 2x around the same axis.

exp(0.5G) describes a rotation of 0.5x around a.

exp(-G) = inv(R) describes a rotation of x in the opposite direction.

So the matrix exponential can be used to track or model the path of a rotating object. If the translation contains an axis directional component the path will be a helix rotating around the axis.
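Here is a small sketch of the pure-rotation case in plain Python. Everything named here is an assumption for illustration: the axis is taken to be the z axis, K is the matrix whose exponential exp(t K) is a rotation by t radians around that axis, x = 0.7 radians is an arbitrary angle, and the exponential is approximated by a truncated power series.

```python
import math

def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(X, terms=30):
    """Approximate exp(X) by truncating the power series sum_k X^k / k!."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def scaled(X, s):
    """Return the matrix s * X."""
    return [[s * entry for entry in row] for row in X]

def max_abs_diff(P, Q):
    """Largest entrywise difference between two matrices."""
    return max(abs(p - q) for row_p, row_q in zip(P, Q) for p, q in zip(row_p, row_q))

# Generator of rotations around the z axis (assumed axis for this example).
K = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
x = 0.7  # rotation angle in radians

R = expm_series(scaled(K, x))             # rotation by x
double = expm_series(scaled(K, 2 * x))    # rotation by 2x
half = expm_series(scaled(K, 0.5 * x))    # rotation by 0.5x
inverse = expm_series(scaled(K, -x))      # rotation by x in the opposite direction
identity = [[float(i == j) for j in range(3)] for i in range(3)]

print(max_abs_diff(double, matmul(R, R)))          # rotating twice by x
print(max_abs_diff(matmul(half, half), R))         # two half rotations give R
print(max_abs_diff(matmul(inverse, R), identity))  # inverse undoes R
```

All three printed differences should be at the level of floating-point rounding. Sweeping the scale factor continuously from 0 to 1 traces the circular path of a point as it rotates from its start position to its position under R.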