Why does $A^TA=I, \det A=1$ mean $A$ is a rotation matrix?

Solution 1:

Others have raised some good points, and a definitive answer really depends on what kind of linear transformation we want to call a rotation or a reflection.

For me a reflection (maybe I should call it a simple reflection?) is a reflection with respect to a subspace of codimension 1. So in $\mathbf{R}^n$ you get these by fixing a subspace $H$ of dimension $n-1$. The reflection $s_H$ w.r.t. $H$ keeps the vectors of $H$ fixed (pointwise) and multiplies a vector perpendicular to $H$ by $-1$. If $\vec{n}\perp H$, $\vec{n}\neq0$, then $s_H$ is given by the formula

$$\vec{x}\mapsto\vec{x}-2\,\frac{\langle \vec{x},\vec{n}\rangle}{\|\vec{n}\|^2}\,\vec{n}.$$

The reflection $s_H$ has eigenvalue $1$ with multiplicity $n-1$ and eigenvalue $-1$ with multiplicity $1$ with respective eigenspaces $H$ and $\mathbf{R}\vec{n}$. Thus its determinant is $-1$. Therefore geometrically it reverses orientation (or handedness, if you prefer that term), and is not a rigid body motion in the sense that in order to apply that transformation to a rigid 3D body, you need to break it into atoms (caveat: I don't know if this is the standard definition of a rigid body motion?). It does preserve lengths and angles between vectors.
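If you want to check these claims numerically, here is a small sketch (my own illustration with numpy and a randomly chosen normal vector, not part of the argument): the matrix of $s_H$ is $I-2\vec{n}\vec{n}^T/\|\vec{n}\|^2$; it is orthogonal, has eigenvalue $1$ with multiplicity $n-1$ and eigenvalue $-1$ once, hence determinant $-1$, and it preserves lengths.

```python
import numpy as np

# Reflection about the hyperplane H with normal vector n:
# s_H has matrix I - 2 n n^T / ||n||^2.
n_dim = 4
rng = np.random.default_rng(0)
n = rng.standard_normal(n_dim)                    # any nonzero normal vector to H

S = np.eye(n_dim) - 2.0 * np.outer(n, n) / (n @ n)

assert np.allclose(S.T @ S, np.eye(n_dim))        # orthogonal
assert np.isclose(np.linalg.det(S), -1.0)         # determinant -1
eigvals = np.linalg.eigvalsh(S)                   # S is symmetric; eigenvalues ascending
assert np.allclose(eigvals, [-1.0] + [1.0] * (n_dim - 1))

x = rng.standard_normal(n_dim)
assert np.isclose(np.linalg.norm(S @ x), np.linalg.norm(x))   # lengths preserved
```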

Rotations (by which I, too, mean simply orthogonal transformations with $\det=1$) have more variation. If $A$ is a rotation matrix, then Adam's calculation proving that lengths are preserved tells us that the eigenvalues must have absolute value $=1$ (his calculation goes through for complex vectors and the Hermitian inner product). Therefore the complex eigenvalues lie on the unit circle and come in complex conjugate pairs. If $\lambda=e^{i\varphi}$ is a non-real eigenvalue, and $\vec{v}$ is a corresponding eigenvector (in $\mathbf{C}^n$), then the vector $\vec{v}^*$ gotten by componentwise complex conjugation is an eigenvector of $A$ belonging to the eigenvalue $\lambda^*=e^{-i\varphi}$. Consider the set $V_1$ of vectors of the form $z\vec{v}+z^*\vec{v}^*$, $z\in\mathbf{C}$. By the eigenvalue property this set is stable under $A$: $$A(z\vec{v}+z^*\vec{v}^*)=(\lambda z)\vec{v}+(\lambda z)^*\vec{v}^*.$$ Its vectors are also fixed by componentwise complex conjugation, so $V_1\subseteq\mathbf{R}^n$. It is obviously a 2-dimensional subspace, IOW a plane. It is easy to guess and not difficult to prove that the restriction of the transformation $A$ to the subspace $V_1$ is a rotation by the angle $\varphi_1=\pm\varphi$. Note that we cannot determine the sign of the rotation (clockwise/ccw), because we don't have a preferred handedness on the subspace $V_1$.
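Here is a small numerical sketch of that step (my own illustration with numpy, not part of the argument; the random matrix is just an assumed example of a rotation): take a non-real eigenvalue $e^{i\varphi}$, split a corresponding eigenvector into real and imaginary parts, and check that $A$ maps the plane they span to itself, acting there as a rotation by $\varphi$.

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal matrix
if np.linalg.det(Q) < 0:
    Q[:, 0] = -Q[:, 0]                            # flip a column so that det(Q) = +1

eigvals, eigvecs = np.linalg.eig(Q)
k = np.argmax(np.abs(eigvals.imag))               # pick a genuinely complex eigenvalue
lam, v = eigvals[k], eigvecs[:, k]
phi = np.angle(lam)                               # lam = e^{i phi}

re, im = v.real, v.imag                           # real basis of the invariant plane V_1
# Comparing real and imaginary parts of Q v = lam v gives:
assert np.allclose(Q @ re, np.cos(phi) * re - np.sin(phi) * im)
assert np.allclose(Q @ im, np.sin(phi) * re + np.cos(phi) * im)
```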

The preservation of angles (see Adam's answer) shows that $A$ then maps the $(n-2)$-dimensional subspace $V_1^\perp$ to itself as well. Furthermore, the determinant of $A$ restricted to $V_1$ is equal to one, so the same holds for $V_1^\perp$. Thus we can apply induction and keep splitting off 2-dimensional summands $V_i,i=2,3,\ldots,$ such that on each summand $A$ acts as a rotation by some angle $\varphi_i$ (usually distinct from the preceding ones). We can keep doing this until only real eigenvalues remain, and end with the situation: $$ \mathbf{R}^n=V_1\oplus V_2\oplus\cdots\oplus V_m \oplus U, $$ where the 2D-subspaces $V_i$ are orthogonal to each other, $A$ rotates a vector in $V_i$ by the angle $\varphi_i$, and $A$ restricted to $U$ has only real eigenvalues.

Computing the determinant then shows that the multiplicity of $-1$ as an eigenvalue of $A$ restricted to $U$ is always even. As a consequence we can also split that eigenspace into a sum of 2-dimensional planes, on each of which $A$ acts as a rotation by 180 degrees (i.e. multiplication by $-1$). After that there remains the eigenspace belonging to the eigenvalue $+1$. The multiplicity of that eigenvalue is congruent to $n$ modulo $2$, so if $n$ is odd, then $\lambda=+1$ will necessarily be an eigenvalue. This is the ultimate reason why a rotation in 3D-space must have an axis = eigenspace belonging to the eigenvalue $+1$.
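The last claim is easy to test numerically; here is a sketch (my own, and it only checks a random example rather than proving anything): a random $3\times3$ orthogonal matrix with determinant $1$ has $+1$ as an eigenvalue, and the corresponding eigenvector is the axis.

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] = -Q[:, 0]                       # force det(Q) = +1

eigvals, eigvecs = np.linalg.eig(Q)
k = np.argmin(np.abs(eigvals - 1.0))         # locate the eigenvalue +1
assert np.isclose(eigvals[k], 1.0)

axis = eigvecs[:, k].real                    # the axis: vectors that Q leaves fixed
assert np.allclose(Q @ axis, axis)
```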

From this we see:

  1. As Henning pointed out, we can continuously bring any rotation back to the identity mapping simply by scaling all the rotation angles $\varphi_i,i=1,\ldots,m,$ continuously to zero. The same can be done on those summands of $U$ where $A$ acts as a rotation by 180 degrees.
  2. If we want to define rotation in such a way that the set of rotations contains the elementary rotations described by Henning, and also insist that the set of rotations is closed under composition, then the set must consist of all orthogonal transformations with $\det=1$. As a corollary, rotations preserve handedness. This point is moot if we define a rotation by simply requiring the matrix $A$ to be orthogonal and have $\det=1$, but it does show the equivalence of the two alternative definitions.
  3. If $A$ is an orthogonal matrix with $\det=-1$, then composing $A$ with a reflection w.r.t. any subspace $H$ of codimension one gives a rotation in the sense of this (admittedly semi-private) definition of a rotation.

This is not a full answer in the sense that I can't give you an 'authoritative' definition of an $n$D-rotation. That is to some extent a matter of taste, and some might want to include only the simple rotations from Henning's answer, which "move" points of a 2D-subspace and keep its orthogonal complement pointwise fixed. Hopefully I managed to paint a coherent picture, though.

Solution 2:

This depends on how we want to define "rotation" in the first place. Some people prefer a narrow definition where the only matrices that qualify as "rotations" are those that can be expressed as a $(2+n)\times(2+n)$ block matrix $$\pmatrix{\pmatrix{\cos\theta&\sin\theta\\-\sin\theta&\cos\theta}&0\\0&I}$$ with respect to some orthonormal basis. Under this definition there are $4\times 4$ matrices that are orthogonal and have determinant 1 but are not rotations - for example, $$\pmatrix{0.6&0.8&0&0\\-0.8&0.6&0&0\\0&0&0&1\\0&0&-1&0}$$
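A quick numerical check of this example (my own verification sketch with numpy, not part of the answer): the matrix is orthogonal with determinant $1$, yet it has no eigenvalue $+1$ and so fixes no nonzero vector, whereas a matrix of the block form above would fix the orthogonal complement of the rotation plane pointwise.

```python
import numpy as np

A = np.array([[ 0.6,  0.8,  0.0,  0.0],
              [-0.8,  0.6,  0.0,  0.0],
              [ 0.0,  0.0,  0.0,  1.0],
              [ 0.0,  0.0, -1.0,  0.0]])

assert np.allclose(A.T @ A, np.eye(4))            # orthogonal
assert np.isclose(np.linalg.det(A), 1.0)          # determinant 1
eigvals = np.linalg.eigvals(A)                    # 0.6 +/- 0.8i and +/- i
assert not np.any(np.isclose(eigvals, 1.0))       # eigenvalue +1 does not occur
```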

But one might also say that a "rotation" is any matrix $A$ for which there is a continuous family of matrices $(A_t)_{0\le t\le 1}$ with $A_0=I$, $A_1=A$, and every $A_t$ orthogonal. This captures the idea of a gradual isometric transformation away from a starting point. Such a definition immediately tells us that the determinant of a rotation must be 1 (because the determinant is continuous), but it is harder to see that we get all orthogonal matrices with determinant 1 this way.

I don't have a quick proof of the latter, but I imagine it can be done fairly elementarily by induction on the dimension: first use a series of rotations to make the first column fit, then sort out the remaining columns recursively, working within the orthogonal complement of the first column.
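Not the elementary induction sketched above, but one concrete way to exhibit such a path (offered only as an illustrative sketch, and assuming the matrix has no eigenvalue $-1$ so that the principal matrix logarithm is real): write $A=\exp(S)$ with $S$ skew-symmetric and take $A_t=\exp(tS)$.

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(3)
A, _ = np.linalg.qr(rng.standard_normal((4, 4)))
if np.linalg.det(A) < 0:
    A[:, 0] = -A[:, 0]                            # force det(A) = +1

S = np.real(logm(A))                              # skew-symmetric if no eigenvalue is -1
assert np.allclose(S, -S.T, atol=1e-8)

for t in np.linspace(0.0, 1.0, 11):
    A_t = expm(t * S)                             # A_0 = I, ..., A_1 = A
    assert np.allclose(A_t.T @ A_t, np.eye(4), atol=1e-8)   # every A_t is orthogonal
assert np.allclose(expm(S), A, atol=1e-8)
```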

Solution 3:

The reason it is called a rotation matrix is that lengths and angles are preserved.

Consider any $v \in \mathbb{R}^n$. Now consider the $\ell_2$ norm of $Av$: $$ \|Av\|_2^2=v'A'Av = v'Iv=\|v\|_2^2. $$

Hence, the Euclidean length is invariant under the orthogonal transformation $A$. Furthermore, for any $u, v \in \mathbb{R}^n,$ we have the following equality of inner products: $$ \langle Au, Av \rangle_2 = u' A'A v = u'Iv = u'v = \langle u, v \rangle_2. $$

Note that the normalized inner product lies in $[-1,1]$ by the Cauchy-Schwarz inequality, matching the range of cosine, so we may simply define the cosine of the angle between two nonzero vectors in a Euclidean space to be the normalized inner product:

$$ \cos \phi := \frac{\langle u, v \rangle_2}{\|u\|_2 \|v\|_2} $$

Given this definition and the preceding, it's easy to see that angles are preserved as well.
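For what it's worth, here is a short numerical sanity check of the above (my own sketch; the random orthogonal matrix and vectors are just assumed test data): lengths, inner products, and hence the cosine defined above are unchanged by $A$.

```python
import numpy as np

rng = np.random.default_rng(4)
A, _ = np.linalg.qr(rng.standard_normal((5, 5)))       # A'A = I by construction
u, v = rng.standard_normal(5), rng.standard_normal(5)

cos = lambda x, y: (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

assert np.isclose(np.linalg.norm(A @ v), np.linalg.norm(v))    # lengths preserved
assert np.isclose((A @ u) @ (A @ v), u @ v)                    # inner products preserved
assert np.isclose(cos(A @ u, A @ v), cos(u, v))                # angles preserved
```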

(A clarification of the difference between a rotation and a reflection was added in the comments.)

Solution 4:

I think John Stillwell's book Naive Lie Theory pretty much summarises the required points in this case.

He starts off by stating that it follows from the Cartan-Dieudonné theorem that a rotation about $O$ in $\mathbb{R}^2$ or $\mathbb{R}^3$ is a linear transformation that preserves length and orientation. He then goes on to justify that a transformation preserves length iff it preserves the inner product. By defining the rotation criterion in $\mathbb{R}^n$ as you have stated, it is shown that $AA^T=I\iff A$ preserves the inner product, for a square matrix $A$ of order $n$. The two cases $\det(A)=1$ and $\det(A)=-1$ occur according to whether $A$ preserves orientation or not.

Solution 5:

A perfect answer to your question can be found in the freely available paper "A Disorienting Look at Euler’s Theorem on the Axis of a Rotation" by Bob Palais, Richard Palais, and Stephen Rodi.

In the paper, a linear algebra proof is discussed that specifically deals with the relation between reflections, proper rotations, and the sign of the determinant. The proof is constructive in the sense that it explicitly tackles the issue of invariant vectors compared to vectors that get reversed. As a bonus, Euler's E478 proof is recast in modern notation, and a topological proof is mentioned, as well as one via differential geometry and Lie theory.