Finding the transform matrix from 4 projected points (with JavaScript)

I'm working on a project in Chrome, using JavaScript and the WebKit CSS3 3D transform matrix.

The final goal is to create a tool for artistic projects using projectors and animation - somewhat far away from using maths...

I'm using a projector to project several squares over several shapes - as seen in the picture.

What I would like to do is for the user to draw 4 points on the screen (2D x and y) and from there extract a matrix object that I could apply to a regular DIV element of a dimension of 100px by 100px.

The idea is to find the scaleX, scaleY, rotation, rotationX, rotationZ and probably some perspective values that I could apply to the div so that it roughly matches the surface.

I'm not really familiar with geometry beyond sin and cos, or with 3D in general, and I don't even know whether this is doable. If anyone could help me get started or point me in the right direction, I would greatly appreciate it.

Here is an animated GIF that I hope makes this clearer.

http://www.michael-iriarte.com/code/duncan.gif


Solution 1:

Computing a projective transformation

A projective transformation of the (projective) plane is uniquely defined by four projected points, unless three of them are collinear. Here is how you can obtain the $3\times 3$ transformation matrix of the projective transformation.

Step 1: Starting with the 4 positions in the source image, named $(x_1,y_1)$ through $(x_4,y_4)$, you solve the following system of linear equations:

$$\begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ 1 & 1 & 1 \end{pmatrix}\cdot \begin{pmatrix}\lambda\\\mu\\\tau\end{pmatrix}= \begin{pmatrix}x_4\\y_4\\1\end{pmatrix}$$

The columns of the matrix are homogeneous coordinates: vectors with one extra dimension, created by adding a $1$ as the last entry. In subsequent steps, multiples of these vectors will be used to denote the same points. See the last step for an example of how to turn these back into two-dimensional coordinates.

Step 2: Scale the columns by the coefficients you just computed:

$$A=\left(\begin{array}{lll} \lambda\cdot x_1 & \mu\cdot x_2 & \tau\cdot x_3 \\ \lambda\cdot y_1 & \mu\cdot y_2 & \tau\cdot y_3 \\ \lambda & \mu & \tau \end{array}\right)$$

This matrix will map $(1,0,0)$ to a multiple of $(x_1,y_1,1)$, $(0,1,0)$ to a multiple of $(x_2,y_2,1)$, $(0,0,1)$ to a multiple of $(x_3,y_3,1)$ and $(1,1,1)$ to $(x_4,y_4,1)$. So it will map these four special vectors (called basis vectors in subsequent explanations) to the specified positions in the image.
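As a rough illustration of steps 1 and 2, here is a minimal JavaScript sketch. The function names `det3` and `basisToPoints`, the `{x, y}` point objects and the flat row-major array layout are assumptions made for this example, not something the method prescribes. The linear system is solved with Cramer's rule, which is just one convenient choice for a 3×3 system.

```javascript
// Determinant of a 3x3 matrix stored row-major as a flat array of 9 numbers.
function det3(m) {
  return m[0] * (m[4] * m[8] - m[5] * m[7])
       - m[1] * (m[3] * m[8] - m[5] * m[6])
       + m[2] * (m[3] * m[7] - m[4] * m[6]);
}

// Steps 1 and 2: returns the matrix A that maps the basis vectors
// (1,0,0), (0,1,0), (0,0,1), (1,1,1) to the four given points.
function basisToPoints(p1, p2, p3, p4) {
  const m = [p1.x, p2.x, p3.x,
             p1.y, p2.y, p3.y,
             1,    1,    1];
  const d = det3(m);
  if (d === 0) {
    throw new Error("the first three points are collinear");
  }
  // Step 1: Cramer's rule. Replace column k by the right-hand side
  // (x4, y4, 1) and divide the resulting determinant by det(m).
  const rhs = [p4.x, p4.y, 1];
  const [lambda, mu, tau] = [0, 1, 2].map((k) => {
    const mk = m.slice();
    for (let r = 0; r < 3; r++) {
      mk[3 * r + k] = rhs[r];
    }
    return det3(mk) / d;
  });
  // Step 2: scale the columns by the coefficients just computed.
  return [lambda * p1.x, mu * p2.x, tau * p3.x,
          lambda * p1.y, mu * p2.y, tau * p3.y,
          lambda,        mu,        tau];
}
```

Adding up each row of the returned matrix is the same as multiplying it by $(1,1,1)$, so the three row sums should come out as $x_4$, $y_4$ and $1$.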

Step 3: Repeat steps 1 and 2 for the corresponding positions in the destination image, in order to obtain a second matrix called $B$.

This is a map from basis vectors to destination positions.

Step 4: Invert $A$ to obtain $A^{-1}$ (or use the adjugate as discussed below).

$A$ maps from basis vectors to the source positions, so the inverse matrix maps in the reverse direction.

Step 5: Compute the combined matrix $C = B\cdot A^{-1}$.

$A^{-1}$ maps from source positions to basis vectors, while $B$ maps from there to destination positions. So the combination maps source positions to destination positions. This is the matrix of the transformation you were requesting.
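Steps 3 to 5 could be sketched in JavaScript as follows. This is only an illustration under a few assumptions: matrices are flat row-major arrays of 9 numbers, each set of points is passed as `[x1, y1, x2, y2, x3, y3, x4, y4]`, and all function names are made up for this example. It uses the adjugate in place of the inverse, so the resulting `C` is a scalar multiple of $B\cdot A^{-1}$.

```javascript
// Adjugate of a 3x3 matrix: det(m) times the inverse, computed without division.
function adj(m) {
  return [
    m[4] * m[8] - m[5] * m[7], m[2] * m[7] - m[1] * m[8], m[1] * m[5] - m[2] * m[4],
    m[5] * m[6] - m[3] * m[8], m[0] * m[8] - m[2] * m[6], m[2] * m[3] - m[0] * m[5],
    m[3] * m[7] - m[4] * m[6], m[1] * m[6] - m[0] * m[7], m[0] * m[4] - m[1] * m[3]
  ];
}

// Product of two 3x3 matrices.
function multmm(a, b) {
  const c = new Array(9);
  for (let i = 0; i < 3; i++) {
    for (let j = 0; j < 3; j++) {
      let sum = 0;
      for (let k = 0; k < 3; k++) {
        sum += a[3 * i + k] * b[3 * k + j];
      }
      c[3 * i + j] = sum;
    }
  }
  return c;
}

// Product of a 3x3 matrix and a 3-vector.
function multmv(m, v) {
  return [
    m[0] * v[0] + m[1] * v[1] + m[2] * v[2],
    m[3] * v[0] + m[4] * v[1] + m[5] * v[2],
    m[6] * v[0] + m[7] * v[1] + m[8] * v[2]
  ];
}

// Steps 1 and 2 for one set of four points, up to a common scalar factor.
function basisToPoints(x1, y1, x2, y2, x3, y3, x4, y4) {
  const m = [x1, x2, x3, y1, y2, y3, 1, 1, 1];
  const v = multmv(adj(m), [x4, y4, 1]); // det(m) * (lambda, mu, tau)
  return multmm(m, [v[0], 0, 0, 0, v[1], 0, 0, 0, v[2]]);
}

// Steps 3 to 5: C = B * adj(A), a scalar multiple of B * A^{-1}.
function projectiveTransform(src, dst) {
  const A = basisToPoints(...src);
  const B = basisToPoints(...dst);
  return multmm(B, adj(A));
}

// Example: corners of a 100px square and four hand-picked target points.
const src = [0, 0, 100, 0, 0, 100, 100, 100];
const dst = [10, 20, 120, 30, 30, 130, 150, 160];
const C = projectiveTransform(src, dst);
```

Because only a scalar multiple of $C$ is computed, the individual entries have no meaning on their own. Only the ratios obtained after dividing by the third homogeneous coordinate do.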

Step 6: To map a location $(x,y)$ from the source image to its corresponding location in the destination image, compute the product

$$\begin{pmatrix}x'\\y'\\z'\end{pmatrix} = C\cdot\begin{pmatrix}x\\y\\1\end{pmatrix}$$

These are the homogeneous coordinates of your transformed point.

Step 7: Compute the position in the destination image like this:

\begin{align*} x'' &= \frac{x'}{z'} \\ y'' &= \frac{y'}{z'} \end{align*}

This is called dehomogenization of the coordinate vector.
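Steps 6 and 7 amount to one matrix-vector product followed by a division. A minimal sketch, assuming $C$ is given as a flat row-major array of 9 numbers (the name `project` and the small example matrix are hypothetical):

```javascript
// Maps a source point (x, y) to the destination image using the 3x3 matrix c.
function project(c, x, y) {
  // Step 6: multiply C by the homogeneous vector (x, y, 1).
  const xh = c[0] * x + c[1] * y + c[2];
  const yh = c[3] * x + c[4] * y + c[5];
  const zh = c[6] * x + c[7] * y + c[8];
  // Step 7: dehomogenize by dividing by the third coordinate.
  return { x: xh / zh, y: yh / zh };
}

// A made-up matrix with a nonzero entry in the last row, so the
// division actually changes the result.
const example = [1, 0,   0,
                 0, 1,   0,
                 0.5, 0, 1];
const mapped = project(example, 2, 4); // { x: 1, y: 2 }
```

If `zh` comes out as zero, the point is mapped to infinity and has no position in the destination image.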

How to use this projective transformation with CSS

In general such a transformation will not be an affine transformation, so you cannot express this in terms of affine transformations like scaling, rotating and shearing, since these cannot express perspectivity. You might however try to simply set the first two entries of the last row to zero, so you get an affine transformation which might be close enough to your desired transformation.
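One way to try that affine approximation is sketched below. The helper name `toAffineCss` is hypothetical, and $C$ is assumed to be a flat row-major array with a nonzero entry $C_{3,3}$. The six arguments of the CSS `matrix()` function are the upper two rows of the matrix, listed column by column.

```javascript
// Drops the perspective part of a 3x3 projective matrix and returns
// a CSS 2D transform string of the form matrix(a, b, c, d, tx, ty).
function toAffineCss(c) {
  // Normalize so that the bottom-right entry becomes 1.
  const n = c.map((value) => value / c[8]);
  // Entries n[6] and n[7] (first two of the last row) are ignored,
  // which is the same as setting them to zero.
  const args = [n[0], n[3], n[1], n[4], n[2], n[5]];
  return "matrix(" + args.join(", ") + ")";
}
```

How close the result is depends on how strong the perspective is: the larger $C_{3,1}$ and $C_{3,2}$ are relative to $C_{3,3}$, the more the affine version will deviate from the four target points.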

If on the other hand you can use a matrix3d transformation, then you can take the 2D projective transformation matrix $C$ computed as described above, and use its entries to build a 3D projective transformation matrix like this:

$$\begin{pmatrix} C_{1,1} & C_{1,2} & 0 & C_{1,3} \\ C_{2,1} & C_{2,2} & 0 & C_{2,3} \\ 0 & 0 & 1 & 0 \\ C_{3,1} & C_{3,2} & 0 & C_{3,3} \end{pmatrix}$$

This transformation will transform the $x$ and $y$ coordinates as above, but leave the $z$ coordinate of the homogeneous coordinate vector alone. Dehomogenization might still change the value of the $z$ coordinate in space, but as you don't really care about that coordinate, this should be good enough.
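The CSS `matrix3d()` function expects its 16 values in column-major order, so the $4\times 4$ matrix above has to be written out column by column. A minimal sketch, assuming $C$ is a flat row-major array with a nonzero entry $C_{3,3}$ (the helper name `toMatrix3d` is hypothetical, and the normalization is optional):

```javascript
// Converts a 3x3 projective matrix into a CSS matrix3d() string.
function toMatrix3d(c) {
  // Normalize so that the bottom-right entry becomes 1.
  const n = c.map((value) => value / c[8]);
  // Column-major order: first column, second column, and so on.
  const m = [
    n[0], n[3], 0, n[6],
    n[1], n[4], 0, n[7],
    0,    0,    1, 0,
    n[2], n[5], 0, n[8]
  ];
  return "matrix3d(" + m.join(", ") + ")";
}

// In a browser, with `box` being the 100px by 100px div (not run here):
// box.style.transformOrigin = "0 0";
// box.style.transform = toMatrix3d(C);
```

The `transform-origin` of `0 0` matters if the source coordinates are measured from the top-left corner of the div, because CSS otherwise applies the matrix relative to the element's center.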

I've written a proof-of-concept implementation. The user interface is pretty crude, but the math works well enough. The implementation there uses the adjugate matrix instead of the inverse, both in solving the linear equations in step 1 and for the reverse transform in step 4. The result differs only by a scalar factor, which is irrelevant for homogeneous coordinates. The benefit is that this avoids computing a bunch of determinants and performing a bunch of divisions.

If you wanted to, you could play the same game for five points in 3D space, in order to compute the full spatial projective transformation matrix. But that only makes sense if you actually have depth, since no four of the five points may be coplanar.

Solution 2:

I am fairly certain there is no closed-form general solution for this problem, which leaves you with numerical approximations. One simplifying technique would be to eliminate perspective from your problem and imagine the four points your visitors draw to be the shadow cast by a light shining from infinity. You could then, basically, guess and check, using guesses of whatever sophistication you desire.

I've written a really rough demo of what I'm talking about containing some unsophisticated assumptions. Because it removes perspective, it doesn't precisely solve your question, but see if the results might be satisfactory.

The user enters four points onto an HTML canvas with his or her mouse. The script then tries to iteratively converge on a CSS transform of a square to match the shape.

I've linked to some sample output. On the left, the yellow quadrilateral is the original hand drawing, the greys are the successive approximations, and the red is the final estimate. On the right, you see a square div styled with the CSS transformation.

Script output screenshot

An obvious upgrade would be a better convergence function, perhaps using Newton's Method or something of its ilk, but I haven't taken the time to work out the partial derivatives this would require.

(The code runs in-browser in synchronous JavaScript, and locks the browser on my computer for between 5 and 20 seconds on average, so be careful.)