ARKit – What do the different columns in Transform Matrix represent?
Solution 1:
ARKit, RealityKit and SceneKit frameworks use
4 x 4
Transformation Matrices
to translate, rotate, scale and shear 3D objects (just likesimd_float4x4
matrix type). Let's see how these matrices look like.
In 3D Graphics we often use a 4
x4
Matrix with 16 useful elements. The Identity 4
x4
Matrix is as following:
Between those sixteen elements there are 6 different shearing coefficients:
shear XY
shear XZ
shear YX
shear YZ
shear ZX
shear ZY
In Shear Matrix they are as followings:
Because there are no Rotation coefficients
at all in this Matrix, six Shear coefficients
along with three Scale coefficients
allow you rotate 3D objects about X
, Y
, and Z
axis using magical trigonometry (sin
and cos
).
Here's an example how to rotate 3D object (CCW) about its Z
axis using Shear and Scale elements:
Look at 3 different Rotation patterns using Shear and Scale elements:
And, of course, 3 elements for translation (tx
,ty
,tz
) in the 4
x4
Matrix
are located on the last column:
┌ ┐
| 1 0 0 tx |
| 0 1 0 ty |
| 0 0 1 tz |
| 0 0 0 1 |
└ ┘
The columns' indices in ARKit, SceneKit and RealityKit are:
0
,1
,2
,3
.
The fourth column (index 3) is for translation values:
var translation = matrix_identity_float4x4
translation.columns.3.x
translation.columns.3.y
translation.columns.3.z
You can read my illustrated story about MATRICES on Medium.
Perspective and orthographic projections
Values located in the bottom row of a 4x4 matrix
is used for perspective projection.
And, of course, you need to see an example of how to setup an orthographic projection.
Solution 2:
If you're new to 3D then these transformation matrices will seem like magic. Basically, every "point" in ARKit space is represented by a 4x4 transform matrix. This matrix describes the distance from the ARKit origin (the point at which ARKit woke up to the world), commonly known as the translation, and the orientation of the device, aka pitch, roll, and yaw. A transform matrix can also describe scale, although typically you won't deal with scale until you render something.
What do the columns mean? That gets complicated but just remember that the first 3 elements of the 4th column are the x,y,z translation. That will come in handy. The rest holds scale and rotation information.