Understanding the Gram-Schmidt process
I would like to better understand the Gram-Schmidt process. The statement of the theorem in my textbook is the following:
The Gram-Schmidt sequence $[u_1, u_2,\ldots]$ has the property that $\{u_1, u_2,\ldots, u_k\}$ is an orthonormal basis for the linear span of $\{x_1, x_2, \ldots, x_k\}$ for $k\geq 1$. The formula for $u_k$ is: \begin{equation} u_k = \left|\left| x_k - \sum\limits_{i<k}\langle x_k, u_i\rangle u_i \right|\right|_2^{-1} \left(x_k - \sum\limits_{i<k}\langle x_k, u_i\rangle u_i\right) \end{equation}
Note that I am primarily interested in why all of the vectors are orthogonal. The norm factor in the above equation tells me that all the vectors will be unit vectors, and hence we get an orthonormal set. Anyway, I see how this works algebraically. Let $v = x_k - \sum\limits_{i<k}\langle x_k, u_i\rangle u_i$. Now take the inner product of $v$ with $u_j$ for some $j<k$: \begin{equation} \langle v, u_j\rangle = \langle x_k, u_j\rangle - \sum\limits_{i<k}\langle x_k, u_i\rangle\langle u_i, u_j\rangle \end{equation} If we assume in the induction hypothesis that $u_1, \ldots, u_{k-1}$ are orthonormal, then every term of the sum is zero except the one with $i=j$, where $\langle u_j, u_j\rangle = 1$. This leaves us with: \begin{equation} \langle v, u_j\rangle = \langle x_k, u_j\rangle - \langle x_k, u_j\rangle = 0 \end{equation}
OK, I can follow the algebra, but how can I see this geometrically? Can someone provide both 2D and 3D examples or plots? I am specifically interested in seeing how all the vectors meet at 90 degrees.
Consider the following diagram, courtesy of mathinsight.org:
For a unit vector $u$, you can think of $(a \cdot u) u$ as the piece of $a$ that is in the direction of $u$. The part that is left over, $a - (a \cdot u) u$, must naturally be the missing side of the right triangle, and hence is perpendicular to $u$. So at each step of the Gram-Schmidt process, the formula
$$ v_{n+1} = a - \sum_{j=1}^n \langle a, u_j \rangle u_j, \quad u_{n+1} = v_{n+1}/ \|v_{n+1} \|$$
does the following: it first subtracts all the pieces of $a$ that are in the same direction as all the $u_j$, then it renormalizes. The resulting vector must be orthogonal to all the $u_j$'s since you just subtracted out all the pieces that were not perpendicular.
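To see the single-step picture with numbers, here is a minimal 2D sketch in plain Python. The particular vectors `u` and `a` and the helper `dot` are made up for illustration; any unit vector `u` and any `a` would do.

```python
import math

def dot(p, q):
    """Euclidean dot product of two equal-length sequences."""
    return sum(pi * qi for pi, qi in zip(p, q))

# Illustrative values only: u is a unit vector (0.6^2 + 0.8^2 = 1), a is arbitrary.
u = (0.6, 0.8)
a = (2.0, 1.0)

coeff = dot(a, u)                                          # signed length of a's shadow on u
parallel = tuple(coeff * ui for ui in u)                   # (a . u) u, the piece of a along u
residual = tuple(ai - pi for ai, pi in zip(a, parallel))   # a - (a . u) u, the leftover piece

# Angle between the leftover piece and u, in degrees.
cos_angle = dot(residual, u) / math.hypot(*residual)
angle = math.degrees(math.acos(cos_angle))

print("piece along u:", parallel)
print("leftover piece:", residual)
print("angle to u (degrees):", angle)
```

Up to floating-point rounding, the leftover piece should come out at a right angle to `u`, and `parallel` plus `residual` adds back up to `a`, which is the right triangle in the diagram.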
The geometric picture from Gram-Schmidt is this:
You start with a basis. Take the first vector. Scale it so that it's a unit vector. Good start. Take the second vector. If it's orthogonal to the first vector, great. Otherwise, subtract off a multiple of the first vector until it is. Then scale it so that it's a unit vector. Moving on, take the third vector. Subtract off enough of the first vector from it so that it's orthogonal to the first vector now. Then subtract off enough of the second vector so it's orthogonal to that one, too. Now scale it so that it's a unit vector. Keep going like this, by taking the next vector and subtracting off bits of the previous vectors so that it's orthogonal to all of them, and then rescale so that it's a unit vector. The "bit" that you have to subtract off each time is the projection of the vector you're currently working on onto one of the previous unit vectors, and the formula for that is given by the dot product.
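The recipe above can be sketched in a few lines of plain Python and checked on a 3D example. This is only a sketch: the function names `gram_schmidt` and `angle_deg`, the sample vectors, and the `1e-12` tolerance for detecting a dependent input are all choices made here for illustration. The coefficients are taken against the original vector, matching the textbook formula $\langle x_k, u_i\rangle$.

```python
import math

def dot(p, q):
    """Euclidean dot product of two equal-length sequences."""
    return sum(pi * qi for pi, qi in zip(p, q))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same space as `vectors`.

    Assumes the inputs are linearly independent; `tol` is an arbitrary
    cutoff used to detect a (numerically) zero leftover vector.
    """
    basis = []
    for x in vectors:
        v = list(x)
        for u in basis:
            c = dot(x, u)  # how much of x points along the earlier unit vector u
            v = [vi - c * ui for vi, ui in zip(v, u)]  # subtract that bit
        norm = math.sqrt(dot(v, v))
        if norm < tol:
            raise ValueError("input vectors are linearly dependent")
        basis.append([vi / norm for vi in v])  # rescale to a unit vector
    return basis

def angle_deg(p, q):
    """Angle in degrees between two unit vectors (clamped against rounding)."""
    return math.degrees(math.acos(max(-1.0, min(1.0, dot(p, q)))))

# Illustrative 3D basis: no two of these are perpendicular to begin with.
xs = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
us = gram_schmidt(xs)

for i in range(len(us)):
    for j in range(i + 1, len(us)):
        print(f"angle between u{i + 1} and u{j + 1}:", angle_deg(us[i], us[j]))
```

Each printed angle should be 90 degrees up to rounding error. Feeding the same function two 2D vectors, for example `[(3.0, 1.0), (2.0, 2.0)]`, gives the planar version of the same picture.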
You might find this animation helpful, but I actually found it a little difficult to follow. http://www.youtube.com/watch?v=pIy8xqh9sWs