Why is the determinant defined in terms of permutations?

Where does the definition of the determinant come from, and is the definition in terms of permutations the first and basic one? What is the deep reason for giving such a definition in terms of permutations?

$$ \det(A)=\sum_{p}\sigma(p)\,a_{1p_1}a_{2p_2}\cdots a_{np_n}. $$

I have found this one useful:

Thomas Muir, Contributions to the History of Determinants 1900-1920.

This is only one of many possible definitions of the determinant.

A more "immediately meaningful" definition could be, for example, to define the determinant as the unique function on $\mathbb R^{n\times n}$ such that:

The identity matrix has determinant $1$.
Every singular matrix has determinant $0$.
The determinant is linear in each column of the matrix separately.

(Or the same thing with rows instead of columns).

While this seems to connect to high-level properties of the determinant in a cleaner way, it is only half a definition because it requires you to prove that a function with these properties exists in the first place and is unique.

It is technically cleaner to choose the permutation-based definition because it is obvious that it defines something, and then afterwards prove that the thing it defines has all of the high-level properties we're really after.

The permutation-based definition is also very easy to generalize to settings where the matrix entries are not real numbers (e.g. matrices over a general commutative ring) -- in contrast, the characterization above does not generalize easily without a close study of whether our existence and uniqueness proofs will still work with a new scalar ring.
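As a rough illustration of the point that the permutation formula "obviously defines something", here is a minimal Python sketch that evaluates the sum over permutations directly and then spot-checks the three properties listed above on small integer matrices. The function names (`sign`, `leibniz_det`, `with_first_column`) and the test matrices are my own choices for illustration, and a few numerical checks are of course not a proof.

```python
from itertools import permutations


def sign(p):
    """Parity of a permutation given as a tuple of 0-based indices.

    Returns +1 for an even permutation and -1 for an odd one,
    by counting inversions.
    """
    inversions = sum(
        1
        for i in range(len(p))
        for j in range(i + 1, len(p))
        if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def leibniz_det(a):
    """Determinant of a square matrix (list of rows) as a sum over permutations."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= a[i][p[i]]  # one entry from each row i, column p[i]
        total += term
    return total


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
singular = [[1, 2, 3], [2, 4, 6], [0, 1, 5]]  # second row is twice the first

# Linearity in the first column, with the other two columns held fixed.
u, w = [1, 0, 2], [4, 1, -1]
rest = [[2, 7], [5, 3], [1, 1]]


def with_first_column(col):
    """Build the 3x3 matrix whose first column is `col`, followed by `rest`."""
    return [[col[i]] + rest[i] for i in range(3)]


combo = [2 * u[i] + 3 * w[i] for i in range(3)]

print(leibniz_det(identity))  # 1
print(leibniz_det(singular))  # 0
print(
    leibniz_det(with_first_column(combo))
    == 2 * leibniz_det(with_first_column(u)) + 3 * leibniz_det(with_first_column(w))
)  # True
```

The code only uses addition and multiplication of the entries, so the same sketch works unchanged for entries from any commutative ring that Python can represent (integers, `Fraction`, and so on). It sums $n!$ terms, so it is meant as a definition made executable, not as a practical algorithm.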

The amazing fact is that matrices seem to have been developed in order to study determinants. I'm not sure, but I think the "formula" definition of the determinant you have there is known as the Leibniz formula. I am going to quote some lines from Tucker, 1993:

Matrices and linear algebra did not grow out of the study of coefficients of systems of linear equations, as one might guess. Arrays of coefficients led mathematicians to develop determinants, not matrices. Leibnitz, co-inventor of calculus, used determinants in 1693 about one hundred and fifty years before the study of matrices in their own right. Cramer presented his determinant-based formula for solving systems of linear equations in 1750. The first implicit use of matrices occurred in Lagrange's work on bilinear forms in the late 18th century.

In 1848, J. J. Sylvester introduced the term "matrix," the Latin word for womb, as a name for an array of numbers. He used womb, because he viewed a matrix as a generator of determinants. That is, every subset of k rows and k columns in a matrix generated a determinant (associated with the submatrix formed by those rows and columns).

You would probably have to dig through historical texts and articles to find out exactly why Leibniz devised the definition. Most probably he had some hunch that it could lead to breakthroughs in understanding the connection between the coefficients and the solution of a system of equations.

Hint:

Determinants appear in the solution of systems of linear equations, among other places. If you permute the equations, the solution cannot change. Hence the expression for a determinant can change by at most a sign under row permutations (a sign that cancels in the ratio of determinants giving the solution), and this is why it is a combination of terms involving $a_{ip_i}$.

This explains the pattern $$\sum_p \sigma_p\prod_i a_{ip_i},$$ in which each term takes exactly one entry from every row, so the expression is linear in each row separately. The form must also be antisymmetric, so that two equal rows yield a zero determinant (in which case the solution formula fails), and this is why $\sigma_p=\pm1$ is the parity of the permutation.
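As a small worked example of this hint (my own illustration, using the usual elimination for two unknowns), take the system $a_{11}x + a_{12}y = b_1$, $a_{21}x + a_{22}y = b_2$. Multiplying the first equation by $a_{22}$, the second by $a_{12}$, and subtracting gives

$$x = \frac{b_1a_{22} - a_{12}b_2}{a_{11}a_{22} - a_{12}a_{21}}.$$

Swapping the two equations exchanges $a_{11}\leftrightarrow a_{21}$, $a_{12}\leftrightarrow a_{22}$ and $b_1\leftrightarrow b_2$, which turns this into

$$x = \frac{b_2a_{12} - a_{22}b_1}{a_{21}a_{12} - a_{22}a_{11}} = \frac{-(b_1a_{22} - a_{12}b_2)}{-(a_{11}a_{22} - a_{12}a_{21})}.$$

Numerator and denominator each change sign, so the solution is unchanged. Both are sums of signed products with one factor from each row, which is the $n=2$ case of the pattern above.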

Here is a natural path to the idea of determinant (though this is not how they were originally developed).

An alternating $k$-linear function on a vector space $V$ over a field $\Bbb F$ is a map $f\,:\, V^k \to \Bbb F$ which:

Is linear in each argument: $$f(v_1, \ldots, v_{i-1}, av_i + bw_i, v_{i+1}, \ldots, v_k) = af(v_1, \ldots, v_i, \ldots, v_k) + bf(v_1, \ldots, w_i, \ldots, v_k)$$ for all $i$.
Changes sign under exchange of any two arguments: $$f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_k) = -f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_k)$$ for all $i \ne j$.

It is easy to see that if $f,g$ are two alternating $k$-linear functions on $V$, then so is $af + bg$ for any $a,b \in \Bbb F$, so the alternating $k$-linear functions on $V$ form another vector space $A^k(V)$. Some development shows that if $V$ has dimension $n$, then $A^k(V)$ has dimension $\binom{n}{k}$. In particular $A^n(V)$ has dimension $1$.

Now if $M\,:\,V \to V$ is linear and if $f\in A^k(V)$, then the map $$M_kf\,:\, V^k \to \Bbb F\,:\,(v_1, \ldots, v_k) \mapsto f(Mv_1, \ldots, Mv_k)$$ is also alternating $k$-linear. And clearly $M_k(af+bg) = aM_kf + bM_kg$, so $M_k$ defines a linear map from $A^k(V)$ to itself (i.e., an endomorphism of $A^k(V)$).

Since $A^n(V)$ is one dimensional, any endomorphism is just multiplication by some element of the field $\Bbb F$. Thus we define the determinant of $M$ to be the unique element $\det(M) \in \Bbb F$ such that $$M_nf = \det(M)f\text{ for all }f \in A^n(V)$$

All the properties of determinants, including the permutation formula, can be developed from this. Certain properties of determinants that are difficult to prove from the Leibniz formula are almost trivial from this definition, in particular that $\det(MN) = \det(M)\det(N)$.
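To sketch why multiplicativity is nearly immediate here, take linear maps $M, N\,:\,V \to V$ and any $f \in A^n(V)$. Unwinding the definition of $M_n$ gives

$$((MN)_nf)(v_1, \ldots, v_n) = f(MNv_1, \ldots, MNv_n) = (M_nf)(Nv_1, \ldots, Nv_n) = (N_n(M_nf))(v_1, \ldots, v_n),$$

so $(MN)_n = N_n \circ M_n$. Applying the definition of the determinant twice, and using that $N_n$ is linear,

$$\det(MN)f = N_n(M_nf) = N_n(\det(M)f) = \det(M)N_nf = \det(M)\det(N)f.$$

Since $A^n(V)$ is one dimensional, we can pick some $f \neq 0$ and conclude that $\det(MN) = \det(M)\det(N)$.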

There is a close connection between the space of alternating $k$-linear functions and the $k$-order wedge product of a space, so I could have very similarly developed the determinant based on the wedge product, but alternating $k$-linear functions are easier conceptually.
