Rigorously proving that a change-of-basis matrix is always invertible

How can we prove that all change-of-basis matrices are invertible? The trivial case of a change of basis for $\mathbb{R}^n$ is easily demonstrable using, for example, determinants. But I am struggling to show this rigorously for all bases, for example for a two-dimensional subspace of $\mathbb{R}^4$. I am sure that there are many ways to go about this proof, and I would appreciate as many different demonstrations as possible, to back up my intuition!


Solution 1:

What is a change-of-basis matrix? You have a vector space $\mathbf{V}$ (and it doesn't matter if $\mathbf{V}$ is all of $\mathbb{R}^n$, or some subspace thereof, or even something entirely different), and two different ordered bases for $\mathbf{V}$, $\beta_1$ and $\beta_2$ (necessarily of the same size, since two bases of the same vector space always have the same size): \begin{align*} \beta_1 &= \Bigl[ \mathbf{v}_1,\mathbf{v}_2,\ldots,\mathbf{v}_n\Bigr]\\ \beta_2 &= \Bigl[ \mathbf{w}_1,\mathbf{w}_2,\ldots,\mathbf{w}_n\Bigr]. \end{align*} A "change of basis" matrix is a matrix that translates from $\beta_1$ coordinates to $\beta_2$ coordinates. That is, $A$ is a change-of-basis matrix (from $\beta_1$ to $\beta_2$) if, given the coordinate vector $[\mathbf{x}]_{\beta_1}$ of a vector $\mathbf{x}$ relative to $\beta_1$, then $A[\mathbf{x}]_{\beta_1}=[\mathbf{x}]_{\beta_2}$ gives the coordinate vector of $\mathbf{x}$ relative to $\beta_2$, for all $\mathbf{x}$ in $\mathbf{V}$.
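
To make the notion of a coordinate vector concrete, here is a small NumPy sketch (not part of the argument; the basis and the vector are illustrative choices) that recovers $[\mathbf{x}]_{\beta}$ for a two-dimensional subspace of $\mathbb{R}^4$, in the spirit of the question:

```python
import numpy as np

# Basis of a 2-dimensional subspace of R^4, stored as the columns of `beta`
# (an illustrative choice; the same subspace reappears at the end of this answer).
beta = np.array([[1., 0.],
                 [0., 1.],
                 [1., 0.],
                 [0., 1.]])

# A vector x in that subspace: x = 3*beta[:, 0] - 2*beta[:, 1].
x = np.array([3., -2., 3., -2.])

# The coordinate vector [x]_beta is the unique c with beta @ c = x;
# least squares recovers it even though beta is not square.
c, *_ = np.linalg.lstsq(beta, x, rcond=None)
print(c)                          # approximately [ 3. -2.]
print(np.allclose(beta @ c, x))   # True
```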

How do we get a change-of-basis matrix? We write each vector of $\beta_1$ in terms of $\beta_2$, and these are the columns of $A$: \begin{align*} \mathbf{v}_1 &= a_{11}\mathbf{w}_1 + a_{21}\mathbf{w}_2+\cdots+a_{n1}\mathbf{w}_n\\ \mathbf{v}_2 &= a_{12}\mathbf{w}_1 + a_{22}\mathbf{w}_2 +\cdots + a_{n2}\mathbf{w}_n\\ &\vdots\\ \mathbf{v}_n &= a_{1n}\mathbf{w}_1 + a_{2n}\mathbf{w}_2 + \cdots + a_{nn}\mathbf{w}_n. \end{align*} We know we can do this because $\beta_2$ is a basis, so we can express any vector (in particular, the vectors in $\beta_1$) as linear combinations of the vectors in $\beta_2$.

Then the change-of-basis matrix translating from $\beta_1$ to $\beta_2$ is $$ A = \left(\begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{array}\right).$$
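
Numerically, $A$ can be assembled column by column exactly as described: solve $\mathbf{v}_i = a_{1i}\mathbf{w}_1 + \cdots + a_{ni}\mathbf{w}_n$ for each $i$ and stack the solutions as the columns of $A$. Here is a minimal NumPy sketch, using two made-up bases of the same plane in $\mathbb{R}^3$ (the particular vectors are only for illustration):

```python
import numpy as np

# Two bases of the same plane in R^3, stored as columns:
# beta_1 = {v_1, v_2} in V, beta_2 = {w_1, w_2} in W (illustrative choices).
V = np.array([[1., 0.],
              [1., 1.],
              [0., 1.]])
W = np.array([[ 1., 1.],
              [ 0., 2.],
              [-1., 1.]])

# Column i of A solves W @ A[:, i] = V[:, i]; solving all columns at once:
A, *_ = np.linalg.lstsq(W, V, rcond=None)
print(A)
# approximately [[ 0.5 -0.5]
#                [ 0.5  0.5]]
```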

Why is $A$ always invertible? Because, just as there is a change-of-basis matrix from $\beta_1$ to $\beta_2$, there is also one from $\beta_2$ to $\beta_1$. Since $\beta_1$ is a basis, we can express every vector in $\beta_2$ using the vectors in $\beta_1$: \begin{align*} \mathbf{w}_1 &= b_{11}\mathbf{v}_1 + b_{21}\mathbf{v}_2 + \cdots + b_{n1}\mathbf{v}_n\\ \mathbf{w}_2 &= b_{12}\mathbf{v}_1 + b_{22}\mathbf{v}_2 + \cdots + b_{n2}\mathbf{v}_n\\ &\vdots\\ \mathbf{w}_n &= b_{1n}\mathbf{v}_1 + b_{2n}\mathbf{v}_2 + \cdots + b_{nn}\mathbf{v}_n. \end{align*} So the matrix $B$, with $$B = \left(\begin{array}{cccc} b_{11} & b_{12} & \cdots & b_{1n}\\ b_{21} & b_{22} & \cdots & b_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{array}\right),$$ has the property that given any vector $\mathbf{x}$, if $[\mathbf{x}]_{\beta_2}$ is the coordinate vector of $\mathbf{x}$ relative to $\beta_2$, then $B[\mathbf{x}]_{\beta_2}=[\mathbf{x}]_{\beta_1}$ is the coordinate vector of $\mathbf{x}$ relative to $\beta_1$.

But now, consider what the matrix $BA$ does to the standard basis of $\mathbb{R}^n$ (or $\mathbf{F}^n$, in the general case): what is $BA\mathbf{e}_i$, where $\mathbf{e}_i$ is the vector that has a $1$ in the $i$th coordinate and zeros elsewhere? It's a matter of interpreting this correctly: $\mathbf{e}_i$ is the coordinate vector of $\mathbf{v}_i$ relative to $\beta_1$, because $[\mathbf{v}_i]_{\beta_1}=\mathbf{e}_i$. Therefore, since $A[\mathbf{x}]_{\beta_1} = [\mathbf{x}]_{\beta_2}$ and $B[\mathbf{x}]_{\beta_2}=[\mathbf{x}]_{\beta_1}$ for every $\mathbf{x}$, we have: $$BA\mathbf{e}_i = B(A\mathbf{e}_i) = B(A[\mathbf{v}_i]_{\beta_1}) = B[\mathbf{v}_i]_{\beta_2} = [\mathbf{v}_i]_{\beta_1} = \mathbf{e}_i.$$ That is, $BA$ maps $\mathbf{e}_i$ to $\mathbf{e}_i$ for $i=1,\ldots,n$. The only way for this to happen is if $BA=I_n$ is the identity, since $BA\mathbf{e}_i$ is exactly the $i$th column of $BA$, so the columns of $BA$ are $\mathbf{e}_1,\ldots,\mathbf{e}_n$. The same argument, now interpreting $\mathbf{e}_i$ as $[\mathbf{w}_i]_{\beta_2}$, shows that $AB$ is also the identity.

So $A$ and $B$ are both invertible.

So every change-of-basis matrix is necessarily invertible.

It doesn't really matter if you are considering a subspace of $\mathbb{R}^N$, a vector space of polynomials or functions, or any other vector space. So long as it is finite dimensional (so that you can define the "change-of-basis" matrix), change-of-basis matrices are always invertible.
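
If you want a quick numerical sanity check of this argument (an illustration, not a proof), the following NumPy sketch builds two random bases of a 3-dimensional subspace of $\mathbb{R}^5$, computes the change-of-basis matrices in both directions, and confirms that their products are the identity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Columns of S span a random 3-dimensional subspace of R^5.
S = rng.standard_normal((5, 3))

# Two bases of that subspace: multiply S by random 3x3 matrices,
# which are invertible with probability 1.
V = S @ rng.standard_normal((3, 3))   # columns: the basis beta_1
W = S @ rng.standard_normal((3, 3))   # columns: the basis beta_2

# Change-of-basis matrices in both directions, built column by column as above.
A, *_ = np.linalg.lstsq(W, V, rcond=None)   # beta_1-coordinates -> beta_2-coordinates
B, *_ = np.linalg.lstsq(V, W, rcond=None)   # beta_2-coordinates -> beta_1-coordinates

print(np.allclose(B @ A, np.eye(3)))   # True
print(np.allclose(A @ B, np.eye(3)))   # True
```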

Added. I just saw the comment where you give your definition of a change-of-basis matrix: a matrix $C$ which, when a matrix $B$ whose columns form a basis of a certain subspace is multiplied by it, produces another matrix $A=BC$ whose columns form a basis for the same subspace.

This matrix $C$ is just the matrix that expresses the columns of $A$ in terms of the columns of $B$. That is, it's the change-of-basis matrix from "columns-of-A" coordinates to "columns-of-B" coordinates.

For example, take the subspace of $\mathbb{R}^4$ given by $x=z$ and $y=w$, with basis $$\left(\begin{array}{c}1\\0\\1\\0\end{array}\right),\quad\left(\begin{array}{c}0\\1\\0\\1\end{array}\right),$$ and now consider the same space, but with basis $$\left(\begin{array}{c}1\\1\\1\\1\end{array}\right),\quad \left(\begin{array}{r}1\\-2\\1\\-2\end{array}\right).$$ The matrix $C$ such that $$ \left(\begin{array}{rr} 1 & 1\\ 1 & -2\\ 1 & 1\\ 1 & -2 \end{array}\right) = \left(\begin{array}{cc} 1 & 0\\ 0 & 1\\ 1 & 0\\ 0 & 1 \end{array}\right)C$$ is obtained by writing each column of $A$ in terms of the columns of $B$: \begin{align*} \left(\begin{array}{r} 1\\1\\1\\1\end{array}\right) &= 1\left(\begin{array}{c}1\\ 0\\ 1\\ 0\end{array}\right) + 1\left(\begin{array}{c}0 \\ 1 \\ 0 \\ 1\end{array}\right),\\ \left(\begin{array}{r} 1\\ -2\\ 1\\ -2\end{array}\right) &= 1\left(\begin{array}{c}1\\0\\1\\0\end{array}\right) -2\left(\begin{array}{c}0\\1\\0\\1\end{array}\right). \end{align*} And so, the matrix $C$ is $$C = \left(\begin{array}{rr} 1 & 1\\ 1 & -2 \end{array}\right).$$ Expressing the columns of $B$ in terms of the columns of $A$ gives the inverse: \begin{align*} \left(\begin{array}{c}1\\ 0\\ 1\\ 0\end{array}\right) &= \frac{2}{3}\left(\begin{array}{c}1 \\ 1\\ 1\\ 1\end{array}\right) + \frac{1}{3}\left(\begin{array}{r}1 \\ -2\\ 1\\ -2\end{array}\right)\\ \left(\begin{array}{c}0\\ 1\\ 0\\ 1\end{array}\right) &= \frac{1}{3}\left(\begin{array}{c} 1\\ 1\\ 1\\ 1\end{array}\right) -\frac{1}{3}\left(\begin{array}{r}1\\ -2\\ 1\\ -2\end{array}\right), \end{align*} so the inverse of $C$ is: $$C^{-1} = \left(\begin{array}{rr} \frac{2}{3} & \frac{1}{3}\\ \frac{1}{3} & -\frac{1}{3} \end{array}\right),$$ which you can verify by multiplying by $C$.
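
If you want to double-check this example numerically, here is a short NumPy verification; it simply recomputes $C$ from the two bases above and inverts it:

```python
import numpy as np

B = np.array([[1., 0.], [0., 1.], [1., 0.], [0., 1.]])    # old basis, as columns
A = np.array([[1., 1.], [1., -2.], [1., 1.], [1., -2.]])  # new basis, as columns

# Solve B @ C = A: column j of C holds the B-coordinates of column j of A.
C, *_ = np.linalg.lstsq(B, A, rcond=None)
print(C)        # matches [[1, 1], [1, -2]] up to floating-point error

C_inv = np.linalg.inv(C)
print(C_inv)    # matches [[2/3, 1/3], [1/3, -1/3]] up to floating-point error
print(np.allclose(C @ C_inv, np.eye(2)))   # True
```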