What is the relation between quaternions and imaginary numbers?

I understand the idea behind complex and imaginary numbers. I am trying to understand quaternions.

What is the relation between imaginary (or complex) numbers, and quaternions?


Solution 1:


Let's start with the real line $\mathbb{R}$, made up of the real numbers $t$. This field is not algebraically closed, since the simple polynomial equation $$\tag{1} X^2=-1$$ has no real solution.

A solution of $(1)$, named $\,\mathbf{i}\,$, was invented and taken to lie on the (imaginary) line orthogonal to the real line at $0$. The 'numbers' $\;t+x\,\mathbf{i}\;$ with $\,t,x\,$ real constitute the field $\,\mathbb{C}$. This field is algebraically closed, so nothing further needed to be added at this point.

But Hamilton wanted to get out of the plane and imagined another independent solution $\,\mathbf{j}\,$ of $(1)$, lying on the line orthogonal to both the real and imaginary lines at $0$.
Playing with these new 'extended numbers' $\;t+x\,\mathbf{i}+y\,\mathbf{j}\;$ filling the 3D space, he got into trouble with multiplication, since the product of two 'extended numbers' contains the products $\,\mathbf{i}\,\mathbf{j}\,$ and $\,\mathbf{j}\,\mathbf{i}$.

Suppose that $\,\mathbf{i}\,\mathbf{j}\,$ belongs to the 3D space : this means that $\;\mathbf{i}\,\mathbf{j}=a+b\,\mathbf{i}+c\,\mathbf{j}\;$ but if we multiply this (at the left) by $\mathbf{i}\,$, replace $\mathbf{i}\,\mathbf{j}\,$ by the initial expression and simplify everything then we get $\mathbf{j}$ as $\;\mathbf{j}=\alpha +\beta\,\mathbf{i}\;$ : the 3D space collapsed back to the initial complex plane!
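Spelled out (a short worked version of this step, using only $\mathbf{i}^2=-1$, associativity and distributivity), the computation reads:

$$\mathbf{i}\,(\mathbf{i}\,\mathbf{j})=-\mathbf{j}=\mathbf{i}\,(a+b\,\mathbf{i}+c\,\mathbf{j})=a\,\mathbf{i}-b+c\,(a+b\,\mathbf{i}+c\,\mathbf{j})=(ac-b)+(a+bc)\,\mathbf{i}+c^2\,\mathbf{j}$$

so that $-(1+c^2)\,\mathbf{j}=(ac-b)+(a+bc)\,\mathbf{i}$. Since $c$ is real, $1+c^2\neq 0$ and

$$\mathbf{j}=\frac{b-ac}{1+c^2}-\frac{a+bc}{1+c^2}\,\mathbf{i},$$

which gives $\alpha=\frac{b-ac}{1+c^2}$ and $\beta=-\frac{a+bc}{1+c^2}$.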

By distributivity and associativity of the product, we thus cannot have two independent $\mathbf{i}\,$ and $\mathbf{j}\,$ without their product $\;\mathbf{k}:=\mathbf{i}\,\mathbf{j}\,$ being independent too: the 4D space generated by $\,(1,\,\mathbf{i},\,\mathbf{j},\,\mathbf{k})\,$ is unavoidable!

And what about $\,\mathbf{j}\,\mathbf{i}$?

If we suppose that $\,\mathbf{i}\,\mathbf{j}=\mathbf{j}\,\mathbf{i}\,$ then everything remains commutative, with $\,\mathbf{k}^2=\mathbf{i}\,\mathbf{j}\,\mathbf{i}\,\mathbf{j}=\mathbf{i}^2\,\mathbf{j}^2=1\,$, and we obtain what James Cockle named tessarines, later called bicomplex numbers and further generalized to multicomplex numbers ($\,\mathbf{k}$ itself defining a split-complex number).

But before all this Hamilton noticed that $\,\left(x\,\mathbf{i}+y\,\mathbf{j}\right)^2=xy\;(\mathbf{i}\,\mathbf{j}+\mathbf{j}\,\mathbf{i})-(x^2+y^2)\,$ so that abandoning commutativity and supposing that $\,\mathbf{k}=\mathbf{i}\,\mathbf{j}=-\mathbf{j}\,\mathbf{i}\;$ he could rewrite not only $\,\left(x\,\mathbf{i}+y\,\mathbf{j}\right)^2=-(x^2+y^2)\,$ but also $\,\mathbf{k}^2=-1\,$ and $\,\left(x\,\mathbf{i}+y\,\mathbf{j}+z\,\mathbf{k}\right)^2=-(x^2+y^2+z^2)$.

The 3D space he wished for was thus generated by $\,(\mathbf{i},\,\mathbf{j},\,\mathbf{k})\,$ instead of $\,(1,\,\mathbf{i},\,\mathbf{j})\,$, and he gained in addition an interesting fourth dimension! He was proud enough of his discovery to carve his very nice result into the stone of Brougham Bridge : $$\boxed{\displaystyle\mathbf{i}^2=\mathbf{j}^2=\mathbf{k}^2=\mathbf{i}\mathbf{j}\mathbf{k}=-1}$$
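For completeness, here is a short verification that the relations in the boxed identity follow from $\mathbf{i}^2=\mathbf{j}^2=-1$, $\mathbf{k}=\mathbf{i}\,\mathbf{j}=-\mathbf{j}\,\mathbf{i}$ and associativity alone:

$$\mathbf{k}^2=(\mathbf{i}\,\mathbf{j})(\mathbf{i}\,\mathbf{j})=\mathbf{i}\,(\mathbf{j}\,\mathbf{i})\,\mathbf{j}=-\mathbf{i}\,(\mathbf{i}\,\mathbf{j})\,\mathbf{j}=-\mathbf{i}^2\,\mathbf{j}^2=-1,\qquad \mathbf{i}\,\mathbf{j}\,\mathbf{k}=(\mathbf{i}\,\mathbf{j})\,\mathbf{k}=\mathbf{k}^2=-1$$

$$\mathbf{i}\,\mathbf{k}=\mathbf{i}^2\,\mathbf{j}=-\mathbf{j},\quad \mathbf{k}\,\mathbf{i}=\mathbf{i}\,(\mathbf{j}\,\mathbf{i})=-\mathbf{i}^2\,\mathbf{j}=\mathbf{j},\qquad \mathbf{j}\,\mathbf{k}=(\mathbf{j}\,\mathbf{i})\,\mathbf{j}=-\mathbf{i}\,\mathbf{j}^2=\mathbf{i},\quad \mathbf{k}\,\mathbf{j}=\mathbf{i}\,\mathbf{j}^2=-\mathbf{i}$$

So any two of $\mathbf{i},\mathbf{j},\mathbf{k}$ anticommute, and all the cross terms cancel when $\,x\,\mathbf{i}+y\,\mathbf{j}+z\,\mathbf{k}\,$ is squared.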

Solution 2:

Here is a concrete view through matrices.

Preliminary: Let us recall that one can identify complex numbers with a certain family of matrices, in the following isomorphic way:

$$\tag{1}u+iv \ \ \longleftrightarrow \ \ \left(\begin{array}{rr}u&-v\\v&u\end{array}\right)$$

Remark: this "isomorphism" can be used to define complex numbers.

Concretely, this means that every operation on complex numbers is mirrored by an operation on matrices; for example, multiplication in $\mathbb{C}$ corresponds to matrix multiplication:

$$(u+iv)(u'+iv')\ \longleftrightarrow \ \left(\begin{array}{rr}u&-v\\v&u\end{array}\right)\left(\begin{array}{rr}u'&-v'\\v'&u'\end{array}\right)=\left(\begin{array}{rr}uu'-vv'&-(u'v+uv')\\u'v+uv'&uu'-vv'\end{array}\right)$$
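A quick numerical sanity check of this correspondence, as a minimal sketch in plain Python (the helper names `to_matrix` and `matmul` are mine, not part of the answer): multiplying in $\mathbb{C}$ and then converting gives the same matrix as converting first and then multiplying matrices.

```python
def to_matrix(z):
    """Map u + iv to the real 2x2 matrix [[u, -v], [v, u]] of correspondence (1)."""
    u, v = z.real, z.imag
    return [[u, -v], [v, u]]


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


z, w = complex(2, 3), complex(-1, 4)
lhs = to_matrix(z * w)                    # multiply in C, then convert
rhs = matmul(to_matrix(z), to_matrix(w))  # convert, then multiply matrices
print(lhs == rhs)
```

The sample values are small integers, so the floating-point comparison is exact here; for arbitrary inputs a tolerance-based comparison would be safer.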

As a very elementary exercise, what is the matrix operation associated with complex conjugation?

Now, let us turn to quaternions. A quaternion can be defined as the following $4 \times 4$ matrix:

$$\tag{2}Q \ := \ \left(\begin{array}{rrrr}d&-c&b&a\\c&d&-a&b\\-b&a&d&c\\-a&-b&-c&d\end{array}\right) \ \ = \ \ dI_4+(a \mathbb{I}+b\mathbb{J}+c\mathbb{K}) $$

with $ \ \ \ \mathbb{I}:=\left(\begin{array}{rrrr}0&0&0&1\\0&0&-1&0\\0&1&0&0\\-1&0&0&0\end{array}\right), \ \ \mathbb{J}:=\left(\begin{array}{rrrr}0&0&1&0\\0&0&0&1\\-1&0&0&0\\0&-1&0&0\end{array}\right), \ \ \mathbb{K}:=\left(\begin{array}{rrrr}0&-1&0&0\\1&0&0&0\\0&0&0&1\\0&0&-1&0\end{array}\right).$

$dI_4$ is called the real part of $Q$ and $a \mathbb{I}+b\mathbb{J}+c\mathbb{K}$ its vector part. The identity matrix $I_4$ is to be thought of as the real number $1$ (this notation will be used in particular in the following table). The set of quaternions is denoted $\mathbb{H}$, with the following multiplication table (for example I*J=K): $$\ \begin{array}{r|rrr}*&I&J&K\\ \hline I&-1&K&-J\\J&-K&-1&I\\K&J&-I&-1\end{array}$$
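The table can be checked directly on the three $4 \times 4$ matrices given above. The following is a minimal sketch in plain Python; `matmul` and `neg` are helper names introduced for the illustration.

```python
I4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
I = [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]
J = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
K = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neg(A):
    """Entrywise negation of a matrix."""
    return [[-x for x in row] for row in A]


# Each entry pairs a product with the value the table claims for it.
table = [
    (matmul(I, I), neg(I4)), (matmul(I, J), K),       (matmul(I, K), neg(J)),
    (matmul(J, I), neg(K)),  (matmul(J, J), neg(I4)), (matmul(J, K), I),
    (matmul(K, I), J),       (matmul(K, J), neg(I)),  (matmul(K, K), neg(I4)),
]
table_holds = all(product == expected for product, expected in table)
print(table_holds)
```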

In order to answer your question, I am going to show three "spots" where one can observe a relationship between quaternions and complex numbers.

First spot : it is linked with the fact (see the table above) that $\mathbb{I}^2=-1$, $\mathbb{J}^2=-1$, $\mathbb{K}^2=-1$. Thus there are (at least) three copies of $\mathbb{C}$ in $\mathbb{H}$: a copy spanned by $(1,\mathbb{I})$, one by $(1,\mathbb{J})$, and one by $(1,\mathbb{K})$.

Second spot : let us consider the following block decomposition of (2):

$$\tag{3}\left(\begin{array}{rr|rr}d&-c&b&a\\c&d&-a&b\\ \hline -b&a&d&c\\-a&-b&-c&d\end{array}\right) \ \ \longleftrightarrow \ \ \left(\begin{array}{rr}\alpha &-\bar \beta \\ \beta & \bar \alpha\end{array}\right) $$

with $\alpha := d+ic$ and $\beta:=-b-ia.$ (using the correspondence described in (1)).

For (3), as was the case for (1), this is an isomorphism: quaternion computations translate into computations on $2 \times 2$ complex matrices (of the particular type given by (3)).
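As an illustration, here is a hedged numerical check on one pair of sample quaternions (not a proof). The $4 \times 4$ form (2) and the $2 \times 2$ complex form (3), with $\alpha=d+ic$ and $\beta=-b-ia$, are multiplied separately and give matching results. The function names and the sample values are assumptions made for this sketch.

```python
def quat_matrix(d, a, b, c):
    """4x4 real matrix of form (2)."""
    return [[d, -c, b, a],
            [c, d, -a, b],
            [-b, a, d, c],
            [-a, -b, -c, d]]


def complex_matrix(d, a, b, c):
    """2x2 complex matrix of form (3), with alpha = d + ic and beta = -b - ia."""
    alpha, beta = complex(d, c), complex(-b, -a)
    return [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


q1 = (1, 2, 3, 4)    # (d, a, b, c)
q2 = (5, -6, 7, 8)

P = matmul(quat_matrix(*q1), quat_matrix(*q2))
# The last column of a matrix of form (2) is (a, b, c, d).
a3, b3, c3, d3 = P[0][3], P[1][3], P[2][3], P[3][3]

same_shape = P == quat_matrix(d3, a3, b3, c3)
same_product = (matmul(complex_matrix(*q1), complex_matrix(*q2))
                == complex_matrix(d3, a3, b3, c3))
print(same_shape, same_product)
```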

This is why, in some books from the first part of the 20th century (Russian books in particular), quaternions are described as hypercomplex numbers; computations are eased by the fact that one can write (still identifying $I_4$ and $1$):

$$d1+aI+bJ+cK=(d+bJ)+I(a+cJ)$$

This is a "complex of complexes" (a very obsolete expression), using two different analogs ("avatars" $I$ and $J$) of the complex number $i$.

Third spot : Quaternion matrix (2) can be partitioned in a different way:

$$\left(\begin{array}{rrr|r}d&-c&b&a\\c&d&-a&b\\-b&a&d&c\\ \hline -a&-b&-c&d\end{array}\right)=\left(\begin{array}{cc}[N]_{\times}+dI_3 &N \\ -N^T & d\end{array}\right) \tag{3'}$$

where $N$ is identified with the vector part $a \mathbb{I}+b\mathbb{J}+c\mathbb{K}$ and $[N]_{\times}$ denotes the $3 \times 3$ "cross product" operator defined by

$$[N]_{\times}V=N \times V \ \ \ \ \text{with} \ \ \ \ [N]_{\times}=\left(\begin{array}{rrr}0&-c&b\\c&0&-a\\-b&a&0\\\end{array}\right).$$

This partition gives access to the vectorial use of quaternions (which has found a new life with robotics). In particular, when the following product is expanded (by blocks):

$$\left(\begin{array}{cc}[N]_{\times}+dI_3 &N \\ -N^T & d\end{array}\right) \left(\begin{array}{cc}[N']_{\times}+d'I_3 &N' \\ -N'^T & d'\end{array}\right)=\left(\begin{array}{cc}[N'']_{\times}+d''I_3 &N'' \\ -N''^T & d''\end{array}\right),$$

one obtains, writing it symbolically with the real part and the vector part separated:

$$\tag{4}[d,N]*[d',N']=[dd'-N^TN',dN'+d'N+N \times N']$$

Quaternionic multiplication (4) is strongly reminiscent of complex multiplication:

$$\tag{5}(d+iN)*(d'+iN')=(dd'-NN')+i(dN'+d'N)$$

except for the additional term given by the cross product $N \times N'$, which reflects the non-commutativity of the quaternionic product.
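Formula (4) can be compared numerically with the block-matrix product it comes from. The sketch below uses plain Python with sample values of my choosing; `product_by_formula` is a hypothetical helper name. It also shows that swapping the factors changes only the sign of the cross-product contribution.

```python
def quat_matrix(d, N):
    """4x4 matrix of form (2) for real part d and vector part N = (a, b, c)."""
    a, b, c = N
    return [[d, -c, b, a],
            [c, d, -a, b],
            [-b, a, d, c],
            [-a, -b, -c, d]]


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def product_by_formula(d1, N1, d2, N2):
    """[d, N] * [d', N'] = [dd' - N.N', dN' + d'N + N x N'], i.e. formula (4)."""
    dot = sum(x * y for x, y in zip(N1, N2))
    cross = (N1[1] * N2[2] - N1[2] * N2[1],
             N1[2] * N2[0] - N1[0] * N2[2],
             N1[0] * N2[1] - N1[1] * N2[0])
    vector = tuple(d1 * y + d2 * x + z for x, y, z in zip(N1, N2, cross))
    return d1 * d2 - dot, vector


d1, N1 = 1, (2, 3, 4)
d2, N2 = 5, (-6, 7, 8)

P = matmul(quat_matrix(d1, N1), quat_matrix(d2, N2))
from_matrix = (P[3][3], (P[0][3], P[1][3], P[2][3]))
from_formula = product_by_formula(d1, N1, d2, N2)
reversed_order = product_by_formula(d2, N2, d1, N1)
print(from_matrix == from_formula, from_formula, reversed_order)
```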

Remark: It is well known that, using (2),

$$\tag{6}\det(Q)=(a^2+b^2+c^2+d^2)^2$$

The quantity $a^2+b^2+c^2+d^2$ is the square of the quaternion norm. One recovers it as the determinant of the $2 \times 2$ complex matrix (3), which is $|\alpha|^2+|\beta|^2=a^2+b^2+c^2+d^2$.
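Both determinant statements can be checked on a sample quaternion. This is a minimal sketch in plain Python; the cofactor-expansion `det` is a generic helper written for the illustration and is only sensible for small matrices.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


d, a, b, c = 1, 2, 3, 4
norm_sq = a * a + b * b + c * c + d * d

Q = [[d, -c, b, a],
     [c, d, -a, b],
     [-b, a, d, c],
     [-a, -b, -c, d]]
det4 = det(Q)                      # should equal norm_sq ** 2

alpha, beta = complex(d, c), complex(-b, -a)
det2 = det([[alpha, -beta.conjugate()],
            [beta, alpha.conjugate()]])   # should equal norm_sq

print(det4 == norm_sq ** 2, det2 == norm_sq)
```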

Edit : 1) see the splendid text of John Baez.

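  1. A correspondence between SO(3) and SU(2) can as well be described via matrix expressions, here at the level of their Lie algebras $\mathfrak{so}(3)$ and $\mathfrak{su}(2)$ (the left-hand matrix is skew-symmetric, the right-hand one is traceless and anti-Hermitian):

$$\left(\begin{array}{cc|c}0&-z&y\\ z&0&-x\\ \hline -y&x&0\end{array}\right)\in \mathfrak{so}(3) \ \ \ \ \leftrightarrow \ \ \ \ \left(\begin{array}{cc}iz&-x+iy\\ x+iy&-iz\end{array}\right)\in \mathfrak{su}(2)$$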

(besides, see this).

  1. Partition (3') can be used to recover easily the relationships between the div, rot (curl), grad, and Laplacian operators:

$$\text{Recall :} \ \ \left\{\begin{array}{rcr} \text{rot grad} &=& 0\\ \text{div rot} &=&0\\ \text{div grad} &=&\Delta\\ \text{ rot rot} \ - \ \text{grad div}&=& -\Delta\end{array}\right. \tag{7}$$

For this purpose, let us introduce the following matrix (see paragraph "full nabla" in this reference):

$$A:=\left(\begin{array}{rrr|r} 0&-\tfrac{\partial}{\partial z}&\tfrac{\partial}{\partial y}&\tfrac{\partial}{\partial x}\\ \tfrac{\partial}{\partial z}&0&- \tfrac{\partial}{\partial x}&\tfrac{\partial}{\partial y}\\ -\tfrac{\partial}{\partial y}&\tfrac{\partial}{\partial x}&0&\tfrac{\partial}{\partial z}\\ \hline -\tfrac{\partial}{\partial x}&- \tfrac{\partial}{\partial y}&-\tfrac{\partial}{\partial z}&0\end{array}\right)=\left(\begin{array}{c|c}\text{rot}&\text{grad}\\ \hline \text{-div}&0\end{array}\right) \tag{3''}$$

with the property:

$$A^2= \left(\begin{array}{c|c}\text{rot}&\text{grad}\\ \hline \text{-div}&0\end{array}\right) \times \left(\begin{array}{c|c}\text{rot}&\text{grad}\\ \hline \text{-div}&0\end{array}\right)=\left(\begin{array}{c|c}-\Delta&0\\ \hline 0&-\Delta\end{array}\right)$$

which groups the four hard-to-memorize formulas (7) into a single one.
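To see where each of the four identities recalled above enters, the block product can be expanded entry by entry (a short worked expansion):

$$A^2=\left(\begin{array}{c|c}\text{rot rot}-\text{grad div}&\text{rot grad}\\ \hline -\text{div rot}&-\text{div grad}\end{array}\right)$$

The top-left block is $-\Delta$ by the fourth identity, the two off-diagonal blocks vanish by the first and second identities, and the bottom-right entry is $-\Delta$ by the third.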

See as well this related matrix presentation of Maxwell's equations.

  1. There are different extensions, for example this type of matrix ("hypercomplex of hypercomplex"...) :

$$M=\left(\begin{array}{cc|cc}1 + x_5&x_1 + ix_3& x_2 + ix_4& 0\\ x_1 - ix_3& 1 - x_5& 0& -x_2 - ix_4\\ \hline x_2 - ix_4& 0&1 - x_5& x_1 + ix_3\\ 0&-x_2 + ix_4& x_1 - ix_3& 1 + x_5\end{array}\right)$$

is such that (see the analogy with (6)):

$$\det(M)=(1-x_1^2-x_2^2-x_3^2-x_4^2-x_5^2)^2$$

  1. Historical remark: In the second part of the 19th century, quaternions triggered huge interest. This craze gradually faded once vector operations spread, due in particular to Gibbs, who introduced the (anti-commutative) cross product (no one before him had had the idea of considering such an operation!). See the quite limpid style of exposition of these lecture notes of Gibbs.

  2. For Clifford algebra connection, see this didactic article.

  3. See as well this interesting historical article.

Solution 3:

Both arise from the Cayley-Dickson construction of algebras.
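To make the doubling concrete, here is a minimal sketch of the Cayley-Dickson construction in plain Python. The sign and ordering conventions of the doubling formula differ between sources; this sketch assumes $(a,b)(c,d)=(ac-\bar d\,b,\ da+b\,\bar c)$ with $\overline{(a,b)}=(\bar a,-b)$, and every name in it is hypothetical. Real numbers are plain ints, complex numbers are pairs of reals, and quaternions are pairs of complex numbers.

```python
def neg(x):
    """Negate a real number or a nested pair."""
    return (neg(x[0]), neg(x[1])) if isinstance(x, tuple) else -x


def conj(x):
    """Conjugate: identity on reals, (a, b) -> (conj(a), -b) on pairs."""
    return (conj(x[0]), neg(x[1])) if isinstance(x, tuple) else x


def add(x, y):
    """Componentwise sum of two elements at the same doubling level."""
    if isinstance(x, tuple):
        return (add(x[0], y[0]), add(x[1], y[1]))
    return x + y


def mul(x, y):
    """Cayley-Dickson product (a, b)(c, d) = (ac - conj(d) b, d a + b conj(c))."""
    if isinstance(x, tuple):
        (a, b), (c, d) = x, y
        return (add(mul(a, c), neg(mul(conj(d), b))),
                add(mul(d, a), mul(b, conj(c))))
    return x * y


# Quaternion units written as pairs of complex numbers (pairs of reals).
one = ((1, 0), (0, 0))
i = ((0, 1), (0, 0))
j = ((0, 0), (1, 0))
k = ((0, 0), (0, 1))

print(mul(i, j) == k, mul(j, i) == neg(k), mul(i, i) == neg(one))
```

One more doubling of the same code produces an 8-dimensional algebra (the octonions, under this convention), where associativity is lost.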

If you're aware of the connection of $\mathbb C$ with rotations of $\mathbb R^2$, then you can draw a parallel with $\mathbb H$ since you can do rotations of $\mathbb R^3$ with quaternions.

Along with $\mathbb R$, $\mathbb C$ and $\mathbb H$ are the only finite-dimensional associative division algebras over $\mathbb R$ (Frobenius's theorem).

While $\mathbb H$ contains many copies of $\mathbb C$, it isn't a $\mathbb C$ algebra. None of the copies lie in the center of $\mathbb H$, which is $\mathbb R$.

Solution 4:

The complex numbers $\mathbb{C}$, as a real vector space, are spanned by $1$ and $i$, so $\mathbb{C}$ is two-dimensional. Every complex number thus looks like $a+bi$ for some real numbers $a$ and $b$. An imaginary number is $bi$ for some real $b$.

The quaternions $\mathbb{H}$, as a real vector space, are spanned by $1,\mathbf{i},\mathbf{j},\mathbf{k}$ (so in particular, $\mathbb{H}$ is four-dimensional). So every quaternion looks like $a+b\mathbf{i}+c\mathbf{j}+d\mathbf{k}$ for some real numbers $a,b,c,d$, and the imaginary quaternions look like $b\mathbf{i}+c\mathbf{j}+d\mathbf{k}$. If we identify $1,i\in\mathbb{C}$ with $1,\mathbf{i}\in\mathbb{H}$, we can treat $\mathbb{C}\subset\mathbb{H}$ as a subset (indeed, a real vector subspace).

The real and imaginary parts of a complex number $a+bi$ are $a$ and $b$. For the imaginary part, we usually only talk about the real scalar $b$ that appears in front of $i$ without including the $i$ itself, but with quaternions $\mathbb{H}$ the imaginary quaternions are not all real multiples of some fixed quaternion (as every imaginary complex number is a multiple of $i$), so this can't be done. We say the real and imaginary parts of a quaternion $a+b\mathbf{i}+c\mathbf{j}+d\mathbf{k}$ are $a$ and $b\mathbf{i}+c\mathbf{j}+d\mathbf{k}$ respectively.

There is a norm $|x+yi|=\sqrt{x^2+y^2}$ which is multiplicative on $\mathbb{C}$ (i.e. $|zw|=|z||w|$ for all complex numbers $z,w$). This extends to a norm on quaternions,

$$|a+b\mathbf{i}+c\mathbf{j}+d\mathbf{k}|=\sqrt{a^2+b^2+c^2+d^2}.$$

The multiplication table for $1,\mathbf{i},\mathbf{j},\mathbf{k}$ can be figured out by the following mnemonic:

[Image: mnemonic diagram showing the cyclic order $\mathbf{i}\to\mathbf{j}\to\mathbf{k}\to\mathbf{i}$]

Automatically, $\mathbf{i},\mathbf{j},\mathbf{k}$ are three different square roots of $-1$. Multiplying two of them "in order" (as depicted above) yields the third one, whereas multiplying them "against" the order yields the opposite. So for instance $\mathbf{ij}=\mathbf{k}$ but $\mathbf{ji}=-\mathbf{k}$.

Using the distributive property and this multiplication table, we can multiply any two quaternions together $(a_1+b_1\mathbf{i}+c_1\mathbf{j}+d_1\mathbf{k})(a_2+b_2\mathbf{i}+c_2\mathbf{j}+d_2\mathbf{k})$.

Exercise 1. Multiply the above out, and write as $\square+\square\mathbf{i}+\square\mathbf{j}+\square\mathbf{k}$.

Exercise 2. Verify $|uv|=|u||v|$ holds for all quaternions $u,v\in\mathbb{H}$.
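For experimenting with these two exercises numerically, here is a minimal sketch in plain Python that multiplies quaternions exactly as described above: distribute, then look up each product of basis elements in the multiplication table. The names `TABLE`, `qmul` and `qnorm` are mine. A check on one sample pair is of course not a proof of $|uv|=|u||v|$.

```python
import math

# Basis order: index 0 -> 1, 1 -> i, 2 -> j, 3 -> k.
# TABLE[p][q] = (sign, index) encodes e_p * e_q = sign * e_index.
TABLE = [
    [(1, 0), (1, 1), (1, 2), (1, 3)],     # 1*1, 1*i, 1*j, 1*k
    [(1, 1), (-1, 0), (1, 3), (-1, 2)],   # i*1, i*i = -1, i*j = k, i*k = -j
    [(1, 2), (-1, 3), (-1, 0), (1, 1)],   # j*1, j*i = -k, j*j = -1, j*k = i
    [(1, 3), (1, 2), (-1, 1), (-1, 0)],   # k*1, k*i = j, k*j = -i, k*k = -1
]


def qmul(u, v):
    """Multiply quaternions given as coefficient tuples (a, b, c, d)."""
    out = [0, 0, 0, 0]
    for p in range(4):
        for q in range(4):
            sign, index = TABLE[p][q]
            out[index] += sign * u[p] * v[q]
    return tuple(out)


def qnorm(u):
    """Euclidean norm sqrt(a^2 + b^2 + c^2 + d^2)."""
    return math.sqrt(sum(x * x for x in u))


u = (1, 2, 3, 4)      # 1 + 2i + 3j + 4k
v = (5, -6, 7, 8)     # 5 - 6i + 7j + 8k
uv = qmul(u, v)
print(uv, math.isclose(qnorm(uv), qnorm(u) * qnorm(v)))
```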

In fact, trying to get the norm to be multiplicative is exactly the motivation Hamilton had when he discovered / invented the quaternions.

If we think of the subspace of $\mathbb{H}$ of imaginary quaternions, $\mathrm{span}\{\mathbf{i},\mathbf{j},\mathbf{k}\}$, as just $\mathbb{R}^3$ comprised of vectors, then we can think of quaternions as scalars plus vectors.

Exercise 3. Verify $\mathbf{u}\mathbf{v}=-\mathbf{u}\cdot\mathbf{v}+\mathbf{u}\times\mathbf{v}$, where $\mathbf{u},\mathbf{v}\in\mathbb{R}^3\subset\mathbb{H}$ are vectors, $\cdot$ is the dot product and $\times$ is the cross product.

This allows a "coordinate-free" definition of quaternions. If we start with an oriented three-dimensional inner product space $V$, then the cross product is automatically defined on it simply from its geometric properties, and then multiplication on $\mathbb{R}\oplus V$ may be defined by first distributing $(a+\mathbf{u})(b+\mathbf{v})=ab+a\mathbf{v}+b\mathbf{u}+\mathbf{uv}$ and then using this formula for $\mathbf{uv}$. Just food for thought.

Exercise 4. Verify the quaternion solutions to $x^2=-1$ are precisely the pure imaginary quaternions with unit norm.

Given any unit quaternion $q=a+b\mathbf{i}+c\mathbf{j}+d\mathbf{k}$, we have $a=\cos\theta$ and $\sqrt{b^2+c^2+d^2}=\sin\theta$ for some angle $\theta$, in which case we have $q=\cos\theta+(\sin\theta)\mathbf{u}$ for some unit pure imaginary quaternion $\mathbf{u}$. Since $\mathbf{u}$ is a square root of $-1$, algebraically it behaves just like $i\in\mathbb{C}$, so this is $q=\exp(\theta\mathbf{u})$ (Euler's formula). Now if $q$ is not a unit quaternion, then we may write $q=rp$ where $r=|q|$ and $p$ is a unit quaternion, which in turn means we can write $q=re^{\theta\mathbf{u}}$. Thus, polar form generalizes to quaternions too.
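The polar decomposition can be computed explicitly. Below is a minimal sketch in plain Python, assuming the quaternion has a nonzero imaginary part (otherwise $\mathbf{u}$ is not determined); `polar` and `from_polar` are hypothetical helper names. It extracts $(r,\theta,\mathbf{u})$ and rebuilds $q=r(\cos\theta+(\sin\theta)\mathbf{u})$.

```python
import math


def polar(q):
    """Return (r, theta, u) with q = r * (cos(theta) + sin(theta) * u).

    Assumes the imaginary part (b, c, d) is nonzero.
    """
    a, b, c, d = q
    r = math.sqrt(a * a + b * b + c * c + d * d)
    s = math.sqrt(b * b + c * c + d * d)   # equals r * sin(theta)
    theta = math.atan2(s, a)               # lies in (0, pi) since s > 0
    u = (b / s, c / s, d / s)              # unit pure imaginary quaternion
    return r, theta, u


def from_polar(r, theta, u):
    """Rebuild the coefficient tuple (a, b, c, d) from polar data."""
    return (r * math.cos(theta),) + tuple(r * math.sin(theta) * x for x in u)


q = (1.0, 2.0, 3.0, 4.0)
r, theta, u = polar(q)
rebuilt = from_polar(r, theta, u)
print(all(math.isclose(x, y) for x, y in zip(q, rebuilt)))
```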

In $\mathbb{C}$, the unit complex numbers form a standard unit circle $\mathbb{S}^1$, and moreover they form a group (we can multiply them, etc.). In $\mathbb{H}$, the unit quaternions correspond to solutions to the equation $a^2+b^2+c^2+d^2=1$, which is a three-dimensional sphere sitting inside four-dimensional space ($\mathbb{H}$) called the $3$-sphere $\mathbb{S}^3$. This also is a group (we can multiply them, etc.).

Now, in $\mathbb{C}$, multiplication by a unit complex number $e^{i\theta}$ has the nice interpretation as rotation by the angle $\theta$. How does this generalize with quaternions? It turns out that, using unit quaternion multiplication, we can generate both 3D rotations and 4D rotations. I explained the 3D case here, and both 3D and 4D cases are explained in the early parts of Stillwell's Naive Lie Theory (although he seems to accidentally and implicitly apply functions from the right sometimes).
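For the 3D case, a minimal sketch in plain Python is given below. It uses the standard conjugation formula $\mathbf{v}\mapsto q\,\mathbf{v}\,\bar q$ with $q=\cos(\theta/2)+\sin(\theta/2)\,\mathbf{u}$, which is not spelled out in this answer, so treat it as an assumption of the sketch. Here $\mathbf{u}$ is a unit vector giving the axis, and the product is computed in scalar-plus-vector form. The helper names are mine.

```python
import math


def qmul(p, q):
    """(a + u)(b + v) = (ab - u.v) + (a v + b u + u x v), tuples (real, x, y, z)."""
    a, u = p[0], p[1:]
    b, v = q[0], q[1:]
    dot = sum(x * y for x, y in zip(u, v))
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    vector = tuple(a * v[n] + b * u[n] + cross[n] for n in range(3))
    return (a * b - dot,) + vector


def rotate(vec, axis, angle):
    """Rotate vec about the unit vector axis by angle, via q * vec * conj(q)."""
    half = angle / 2
    q = (math.cos(half),) + tuple(math.sin(half) * x for x in axis)
    q_conj = (q[0],) + tuple(-x for x in q[1:])
    result = qmul(qmul(q, (0.0,) + tuple(vec)), q_conj)
    return result[1:]   # the real part is zero up to rounding


# A quarter turn about the z-axis should send the x-axis to the y-axis.
rotated = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2)
print(rotated)
```

Note the half angle: $q$ and $-q$ give the same rotation, which is the two-to-one relation between unit quaternions and 3D rotations.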

One final note. Since $\mathbb{C}\cong\mathbb{R}^2$ as real vector spaces, and to every complex number $z\in\mathbb{C}$ there is an associated multiplication map $L_z(w):=zw$, every complex number $z$ can be associated with a real $2\times 2$ matrix. Not only this, but $\mathbb{C}\to M_2(\mathbb{R})$ will respect addition and multiplication, so we can view complex numbers as a subalgebra of $M_2(\mathbb{R})$.

The same can be done with quaternions, which can then be viewed as an algebra of $4\times 4$ real matrices. (We can also view $\mathbb{H}$ as a right $\mathbb{C}$-vector space in order to view $\mathbb{H}$ as a real algebra of $2\times 2$ complex matrices, but this is trickier and conventions seem to differ between every source.)