Proof of Leibniz formula from Laplace expansion

I'm trying to prove Leibniz formula for the determinant using Laplace expansion. Here's my attempt:

For a $1 \times 1$ matrix $A = \begin{pmatrix}a_{11}\end{pmatrix}$, define $\det A = a_{11}$. For any $n \times n$ matrix $A$, define $\det A$ recursively according to the formula $$ \det A = \sum_{j=1}^n [A]_{1j} (-1)^{j+1}\det A_{1j} $$ where $A_{ij}$ is the matrix obtained from deleting the $i$-th row and $j$-th column of $A$, and $[A]_{ij}$ is the $i,j$-th entry of $A$.

Claim: For any finite set $A$, let $P(A)$ denote the set of all permutations of the elements of $A$.
Then $$ \det A = \sum_{\sigma \in P(\{1,\cdots, n\})} \left(\text{sgn} \, \sigma \prod_{i=1}^n [A]_{i \sigma(i)} \right) $$ where $\text{sgn} \, \sigma$ is the sign of the permutation $\sigma = (\sigma(1), \cdots, \sigma(n))$.

Proof: The claim is trivial when $n = 1$.

Suppose the claim is true for all $n \times n$ matrices. Let $A$ be an $(n+1) \times (n+1)$ matrix. Then \begin{align} \det A &= \sum_{j=1}^{n+1} [A]_{1j} (-1)^{j+1}\det A_{1j} \\ &= \sum_{j=1}^{n+1} [A]_{1j} (-1)^{j+1}\sum_{\sigma \in P(\{1,\cdots, n\})} \left(\text{sgn} \, \sigma \prod_{i=1}^n [A_{1j}]_{i \sigma(i)} \right) \\ &= \sum_{j=1}^{n+1} \sum_{\sigma \in P(\{1,\cdots, n\})} (-1)^{j+1} \text{sgn} \, \sigma \cdot \left(\,[A]_{1j} \prod_{i=1}^n [A_{1j}]_{i \sigma(i)} \right) \end{align} where the second equality follows from the induction hypothesis.

For each $\sigma \in P(\{1,\cdots, n\})$, define a permutation $\sigma^j \in P(\{1,\cdots, j-1,j+1,\cdots,n+1\})$ where the $i$-th element of $\sigma^j$ is the $\sigma(i)$-th element of the list $\{1,\cdots, j-1,j+1,\cdots,n+1\}$. Also define $\rho^j = (j,\sigma^j) \in P(\{1, \cdots, n+1\})$. (The dependence of $\sigma^j$ and $\rho^j$ on $\sigma$ is suppressed for notational clarity.) Observe that it takes $j - 1$ transpositions to turn the list $(1, \cdots, n+1)$ into $(j,1 ,\cdots, j-1, j+1, \cdots, n+1\}$ and then $\text{sgn } \sigma$ more transpositions to turn this list into $(j, \sigma^j)$. Thus $\text{sgn } \rho^j = (-1)^{j-1} \text{sgn } \sigma = (-1)^{j+1} \text{sgn } \sigma$. Also observe that $[A_{1j}]_{i \sigma(i)} = [A]_{i+1,\sigma^j(i)}$. Plugging into the previous expression, we have

\begin{align} \det A &= \sum_{j=1}^{n+1} \sum_{\sigma \in P(\{1,\cdots, n\})} \text{sgn} \, \rho^j \cdot \left(\, [A]_{1j} \prod_{i=1}^n [A]_{i+1, \sigma^j(i)} \right) \\ &= \sum_{j=1}^{n+1} \sum_{\sigma \in P(\{1,\cdots, n\})} \text{sgn} \, \rho^j \cdot \left(\, \prod_{i=1}^{n+1} [A]_{i \rho^j(i)} \right) \end{align} The result then follows because every $\rho \in P(\{1, \cdots, n+1\})$ can be uniquely written as $\rho = (j, \sigma^j)$ for some $j$ and $\sigma$.

Does this make sense/easy to follow? Is there a way to be more tidy/elegant? Thanks!


Solution 1:

This is my proof without defining new notations.

  • Continuing from the induction hypothesis $$\det{A} =\sum_{j=1}^{n+1}(-1)^{1+j}[A]_{1,j}\det{A_{1,j}} =\sum_{j=1}^{n+1}(-1)^{1+j}[A]_{1,j}\sum_{\sigma\in S_n}\text{sgn }\sigma\prod_{i=1}^{n}[A_{1,j}]_{i,\sigma(i)}$$

  • Denote $[n]=\{1, 2, ..., n\}$. For any $\sigma\in S_n$, since $\sigma$ is bijective, let $$i_1=\sigma^{-1}(1), i_2=\sigma^{-1}(2), ..., i_n=\sigma^{-1}(n).$$ Then $\{i_1, i_2, ..., i_n\}=[n]$ and $$\sigma(i_1)=1, \sigma(i_2)=2, ..., \sigma(i_n)=n.$$ As the following figure indicates. $$\begin{matrix} [n] & \sigma\in S_n & [n]\\ \hline i_1 & \longrightarrow & 1 \\ i_2 & \longrightarrow & 2 \\ \vdots & \vdots & \vdots \\ i_n & \longrightarrow & n \\ \end{matrix}$$

  • Then \begin{eqnarray*} \det{A} &=& \sum_{j=1}^{n+1}(-1)^{1+j}[A]_{1,j}\sum_{\sigma\in S_n}\text{sgn }\sigma\prod_{i=1}^{n}[A_{1,j}]_{i,\sigma(i)}\\ &=& \sum_{j=1}^{n+1}(-1)^{1+j}[A]_{1,j}\sum_{\sigma\in S_n}\text{sgn }\sigma\prod_{k=1}^{n}[A_{1,j}]_{i_k, \sigma(i_k)}\\ &=& \sum_{j=1}^{n+1}(-1)^{1+j}[A]_{1,j}\sum_{\sigma\in S_n}\text{sgn }\sigma\prod_{k=1}^{n}[A_{1,j}]_{i_k, k}\\ &=& \sum_{j=1}^{n+1}(-1)^{1+j}[A]_{1,j}\sum_{\sigma\in S_n}\text{sgn }\sigma\left(\prod_{k=1}^{j-1}[A_{1,j}]_{i_k, k}\prod_{k=j}^{n}[A_{1,j}]_{i_k, k}\right)\\ &=& \sum_{j=1}^{n+1}(-1)^{1+j}[A]_{1,j}\sum_{\sigma\in S_n}\text{sgn }\sigma\left(\prod_{k=1}^{j-1}[A]_{i_k+1, k}\prod_{k=j}^{n}[A]_{i_k+1, k+1}\right)\\ &=& \sum_{j=1}^{n+1}(-1)^{1+j}\sum_{\sigma\in S_n}\text{sgn }\sigma\left([A]_{1,j}\cdot \prod_{k=1}^{j-1}[A]_{i_k+1, k}\prod_{k=j}^{n}[A]_{i_k+1, k+1}\right)\\ &=& \sum_{j=1}^{n+1}(-1)^{1+j}\sum_{\sigma\in S_n}\text{sgn }\sigma \cdot [A]_{1,j}\cdot \underline{[A]_{i_1+1, 1}\cdot [A]_{i_2+1, 2}\cdots [A]_{i_{j-1}+1, j-1}}\cdot \\ && \underline{[A]_{i_j+1, j+1}\cdot [A]_{i_{j+1}+1, j+2}\cdots [A]_{i_n+1, n+1}}\\ &=& \sum_{j=1}^{n+1}(-1)^{1+j}\sum_{\sigma\in S_n}\text{sgn }\sigma \cdot \underline{[A]_{i_1+1, 1}\cdot [A]_{i_2+1, 2}\cdots [A]_{i_{j-1}+1, j-1}}\cdot \\ && [A]_{1,j}\cdot \underline{[A]_{i_j+1, j+1}\cdot [A]_{i_{j+1}+1, j+2}\cdots [A]_{i_n+1, n+1}}\\ \end{eqnarray*}

  • Consider a permutation $\tau_{\sigma}\in S_{n+1}$ as following $$\begin{matrix} [n+1] & \tau_{\sigma}\in S_{n+1} & [n+1]\\ \hline i_1+1 & \longrightarrow & 1 \\ i_2+1 & \longrightarrow & 2 \\ \vdots & \vdots & \vdots \\ i_{j-1}+1 & \longrightarrow & j-1 \\ 1 & \longrightarrow & j \\ i_j+1 & \longrightarrow & j+1 \\ \vdots & \vdots & \vdots \\ i_n+1 & \longrightarrow & n+1 \\ \end{matrix}$$ Then the equation above equals to $$\det{A}=\sum_{j=1}^{n+1}(-1)^{1+j}\sum_{\sigma\in S_n}\text{sgn }\sigma\prod_{\ell=1}^{n+1}[A]_{\ell, \tau_{\sigma}(\ell)}.$$

  • Note that there is an one-to-one correspondence between $\sigma\in S_n$ and $\tau_{\sigma}\in S_{n+1}$ with $\tau_{\sigma}(1)=j$. By the Lemma 2, $\text{sgn }\tau_{\sigma}=(-1)^{1+j}\text{sgn }\sigma$. Then $$\det{A}=\sum_{j=1}^{n+1}\sum_{\substack{\tau\in S_{n+1}\\ \tau(1)=j}}\text{sgn }\tau\prod_{\ell=1}^{n+1}[A]_{\ell, \tau(\ell)} =\sum_{\tau\in S_{n+1}}\text{sgn }\tau\prod_{\ell=1}^{n+1}[A]_{\ell, \tau(\ell)}.$$

Lemma 1. If $\gamma\in S_{n+1}$ is $$\begin{matrix} [n+1] & \gamma\in S_{n+1} & [n+1]\\ \hline 1 & \longrightarrow & x_1 \\ 2 & \longrightarrow & x_2 \\ \vdots & \vdots & \vdots \\ i & \longrightarrow & x_{i}\\ i+1 & \longrightarrow & x_{i+1}\\ \vdots & \vdots & \vdots \\ n+1 & \longrightarrow & x_{n+1}\\ \end{matrix}$$ Then $(x_i, x_{i+1})\gamma$ is $$\begin{matrix} [n+1] & \gamma\in S_{n+1} & [n+1]\\ \hline 1 & \longrightarrow & x_1 \\ 2 & \longrightarrow & x_2 \\ \vdots & \vdots & \vdots \\ i & \longrightarrow & x_{i+1}\\ i+1 & \longrightarrow & x_{i}\\ \vdots & \vdots & \vdots \\ n & \longrightarrow & x_n \\ \end{matrix}$$

Lemma 2. Back to our $\sigma$. Consider $\sigma^{-1}\in S_n$. We can define $\sigma^{-1}(n+1)=n+1$ to make it as an element in $S_{n+1}$. That is, $$\begin{matrix} [n+1] & \sigma^{-1}\in S_{n+1} & [n+1]\\ \hline 1 & \longrightarrow & i_1 \\ 2 & \longrightarrow & i_2 \\ \vdots & \vdots & \vdots \\ n & \longrightarrow & i_n \\ n+1 & \longrightarrow & n+1 \end{matrix}$$

By the Lemma 1, we can left-multiply a product of $m$ transpositions to make $i_1, i_2, ..., i_n, n+1$ in the right column in an increasing order. In fact, the product of these transpositions is $\sigma$.

Again, applying the Lemma 1 on $\tau_{\sigma}^{-1}\in S_{n+1}$ in the same way. We can left-multiply $j-1$ transpositions to $\tau_{\sigma}^{-1}$ to move $1$ to the first element in the right column. Then left-multiply $m$ transpositions to make $i_1+1, i_2+1, ..., i_n+1$ in the right column into an increasing order.

$$\begin{matrix} [n+1] & \tau_{\sigma}^{-1}\in S_{n+1} & [n+1]\\ \hline 1 & \longrightarrow & i_1+1 \\ 2 & \longrightarrow & i_2+1 \\ \vdots & \vdots & \vdots \\ j-1 & \longrightarrow & i_{j-1}+1 \\ j & \longrightarrow & 1 \\ j+1 & \longrightarrow & i_j+1 \\ \vdots & \vdots & \vdots \\ n+1 & \longrightarrow & i_n+1 \\ \end{matrix}$$ Suppose that $s_m \cdots s_2 s_1 t_{j-1} \cdots t_2 t_1\tau_{\sigma}^{-1}=r_m\cdots r_2 r_1\sigma^{-1}=\varepsilon$, where $s_m, ..., s_2, s_1, t_{j-1}, ..., t_2, t_1, r_m, ..., r_2, r_1$ all are transpositions and $\varepsilon$ is the identity in $S_{n+1}$. Therefore, \begin{eqnarray*} &&\text{sgn }(s_m \cdots s_2 s_1 t_{j-1} \cdots t_2 t_1\tau_{\sigma}^{-1})=\text{sgn }(r_m\cdots r_2 r_1\sigma^{-1})\\ &\Rightarrow& \text{sgn }(s_m \cdots s_2 s_1)\cdot \text{sgn }(t_{j-1} \cdots t_2 t_1)\cdot \text{sgn }(\tau_{\sigma}^{-1})=\text{sgn }(r_m\cdots r_2 r_1)\cdot \text{sgn }(\sigma^{-1})\\ &\Rightarrow& (-1)^{m}(-1)^{j-1}\text{sgn }(\tau_{\sigma}^{-1})=(-1)^{m}\text{sgn }(\sigma^{-1})\\ &\Rightarrow& (-1)^{j-1}\text{sgn }(\tau_{\sigma}^{-1})=\text{sgn }(\sigma^{-1})\\ &\Rightarrow& (-1)^{j-1}\text{sgn }(\tau_{\sigma})=\text{sgn }(\sigma)\\ &\Rightarrow& (-1)^{1+j}\text{sgn }(\tau_{\sigma})=\text{sgn }(\sigma) \end{eqnarray*}