Why is determinant a multilinear function?
I am trying to understand (an intuitive explanation will be fine) why the determinant is a multilinear function, and from that to learn how elementary row operations affect the determinant.
I understand that it has something to do with the permutation definition of the determinant: since a permutation is a bijection, each product in the determinant contains exactly one entry from each row. But what comes next?
Solution 1:
Consider a $2\times 2$ matrix $$ A=\left[\matrix{a_{11} & a_{12}\\ a_{21} & a_{22}}\right]. $$ Using the column notations $$ A_1=\left[\matrix{a_{11}\\ a_{21}}\right],\quad A_2=\left[\matrix{a_{12}\\ a_{22}}\right] $$ we can write $$ A=[A_1\ A_2], \qquad \det A=\det[A_1\ A_2]=f(A_1,A_2)=a_{11}a_{22}- a_{21}a_{12} $$ that is the determinant is a function of the matrix columns $A_1$ and $A_2$.
Let's see now what happens when we multiply one column, say the first one, by a number $\color{red}{\lambda}$ $$ f(\color{red}{\lambda}A_1,A_2)= \det\left[\matrix{\color{red}{\lambda}a_{11} & a_{12}\\ \color{red}{\lambda}a_{21} & a_{22}}\right]=\color{red}{\lambda}a_{11}a_{22}- \color{red}{\lambda}a_{21}a_{12}=\color{red}{\lambda}(a_{11}a_{22}- a_{21}a_{12})=\color{red}{\lambda}f(A_1,A_2). $$ Thus, multiplying one column by a number is the same as multiplying the value of the function by that number.
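This scaling property is easy to check numerically; here is a quick sketch with numpy (the matrix entries and $\lambda$ are arbitrary choices):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
lam = 5.0

# Multiply only the first column by lam
A_scaled = A.copy()
A_scaled[:, 0] *= lam

# det of the scaled matrix equals lam * det(A)
print(np.linalg.det(A_scaled))
print(lam * np.linalg.det(A))
```

The two printed values agree (up to floating-point rounding), matching the computation above.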
Let's see now what happens when one column is a sum of two vectors $$ f(\color{red}{A_1'}+\color{blue}{A_1''},A_2)= \det\left[\matrix{\color{red}{a_{11}'}+\color{blue}{a_{11}''} & a_{12}\\ \color{red}{a_{21}'}+\color{blue}{a_{21}''} & a_{22}}\right]= (\color{red}{a_{11}'}+\color{blue}{a_{11}''})a_{22}- (\color{red}{a_{21}'}+\color{blue}{a_{21}''})a_{12}=\\ =\color{red}{a_{11}'}a_{22}- \color{red}{a_{21}'}a_{12}+\color{blue}{a_{11}''}a_{22}- \color{blue}{a_{21}''}a_{12}=f(\color{red}{A_1'},A_2)+f(\color{blue}{A_1''},A_2). $$ Thus, splitting one column into a sum and then computing the determinant gives the same result as computing the determinant for each summand separately, keeping the other columns unchanged, and then adding the results.
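The additivity computation can likewise be verified numerically; a sketch with numpy (the column values are arbitrary):

```python
import numpy as np

# f(A1, A2) = det of the matrix with columns A1, A2
def f(c1, c2):
    return np.linalg.det(np.column_stack([c1, c2]))

A1p  = np.array([1.0,  3.0])   # A_1'
A1pp = np.array([2.0, -1.0])   # A_1''
A2   = np.array([2.0,  4.0])

lhs = f(A1p + A1pp, A2)        # determinant with the summed column
rhs = f(A1p, A2) + f(A1pp, A2) # sum of the two determinants
print(lhs, rhs)
```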
Functions with such properties are called linear. However, the determinant is not linear with respect to the entire matrix $A$; it is linear only with respect to each column separately. That is why it is called a multilinear function of the matrix columns. The same holds for the rows. The generalization to the $n\times n$ case is straightforward.
Solution 2:
Multilinearity of the determinant follows from Cavalieri's principle applied to $n$-dimensional parallelepipeds.
The determinant of a matrix measures the ($n$-dimensional) volume of the parallelepiped generated by the columns of the matrix.
Multilinearity means that the determinant is a linear function in each column of the input matrix, independently. I.e.:
$$\det \left( \begin{bmatrix}{\color{purple}\lambda} \mathbf{v_1} & \mathbf{v_2} & \dots & \mathbf{v_n}\end{bmatrix} \right) = {\color{purple}\lambda}\det \left(\begin{bmatrix} \mathbf{v_1} & \mathbf{v_2} & \dots & \mathbf{v_n} \end{bmatrix} \right)$$
$$\det \left( \begin{bmatrix} \mathbf{\color{darkgreen} u} + \mathbf{\color{blue} w} & \mathbf{v_2} & \dots & \mathbf{v_n} \end{bmatrix} \right) = \det \left( \begin{bmatrix} \mathbf{\color{darkgreen} u} & \mathbf{v_2} & \dots & \mathbf{v_n} \end{bmatrix} \right) + \det \left( \begin{bmatrix} \mathbf{\color{blue} w} & \mathbf{v_2} & \dots & \mathbf{v_n} \end{bmatrix} \right),$$
and similar formulas must hold for the second, third, etc. columns.
The first property (pulling out scalars $\lambda$) is easy to see and is already discussed in user2520938's answer. When you linearly scale a parallelepiped in a single direction, you scale its volume by the same factor.
To see that the second property (additivity) holds, translate the two parallelepipeds associated with the right-hand side of the addition formula so that they share a lower-dimensional parallelepiped as a common face (the one defined by the shared vectors $\mathbf{v_2},\dots, \mathbf{v_n}$). All the slices of this combined object have the same shape, and these slices also have the same shape as the slices of the summed parallelepiped associated with the left-hand side. Hence, by Cavalieri's principle, the parallelepipeds associated with the left- and right-hand sides must have the same volume.
For intuition about Cavalieri's principle, just think about a stack of coins. If you take a straight stack of coins and shear it in any pattern, the volume stays the same (image credit for the coin stack: Wikipedia).
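The coin-stack shear corresponds to adding a multiple of one column to another, which leaves the determinant (the volume) unchanged; here is a quick numerical sketch (random matrix and an arbitrary shear factor):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

# Shear: add a multiple of the second column to the first
sheared = A.copy()
sheared[:, 0] += 2.5 * A[:, 1]

# The volume (determinant) is unchanged, as in the coin-stack picture
print(np.linalg.det(A))
print(np.linalg.det(sheared))
```

This follows from multilinearity itself: $\det(a_1 + c\,a_2, a_2, \dots) = \det(a_1, a_2, \dots) + c\det(a_2, a_2, \dots)$, and the second determinant vanishes because of the repeated column.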
Of course, the same argument holds when applied to any other column; hence the determinant is multilinear in the columns of the input matrix.
Solution 3:
It is not linear; more precisely, it is linear only for matrices of size $1$.
For a matrix of size $n\times n$, the determinant, as a function of matrix columns, is multilinear.
If $A=[a_1,a_2,\dots, a_n]$, where the $a_i$ are columns (with $n$ rows), then $$\det[\lambda a_1, a_2,\dots, a_n] = \lambda \det(A),$$
which is part of what multilinearity asserts (the other part being additivity in each column).
Depending on your definition of determinant, the property can be proven in different ways.
If your definition is that the determinant is the sum, over all permutations, of products of entries of $A$, one from each column and each row, then it is clear that each product contains exactly one entry from the scaled column, and so each term is multiplied by $\lambda$ exactly once.
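A small sketch of this argument in numpy: the permutation-sum (Leibniz) formula, together with a check that scaling one column multiplies every term, and hence the sum, by $\lambda$ (the matrix and $\lambda$ are arbitrary choices):

```python
import numpy as np
from itertools import permutations

def det_perm(A):
    """Determinant via the permutation (Leibniz) formula."""
    n = A.shape[0]
    total = 0.0
    for p in permutations(range(n)):
        # Sign of the permutation: (-1)^(number of inversions)
        inv = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = -1.0 if inv % 2 else 1.0
        for row, col in enumerate(p):
            term *= A[row, col]  # exactly one entry from each row and column
        total += term
    return total

A = np.array([[1.0, 2.0, 0.0],
              [3.0, 4.0, 1.0],
              [0.0, 1.0, 2.0]])
lam = 3.0
B = A.copy()
B[:, 0] *= lam  # each product picks up exactly one factor of lam

print(det_perm(A), np.linalg.det(A))     # the two formulas agree
print(det_perm(B), lam * det_perm(A))    # scaling a column scales the sum
```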
Solution 4:
For a more 'intuitive' explanation than the one using the permutation definition, you can view the determinant as the formula for the area of a parallelogram, the volume of a parallelepiped, and their higher-dimensional generalisations. It is then obvious that when one scales one of the sides by a factor $\lambda$, the area (or volume) also scales by a factor $\lambda$.