Formal definition of $n$ by $0$ and $0$ by $n$ matrices

A matrix is usually informally defined as a rectangular array of numbers. To make this definition formal, we can define an $m \times n$ matrix as a map from $\{1,\ldots,m\} \times \{1,\ldots,n\}$ to the underlying field of scalars, where $\times$ denotes the Cartesian product. However, a subtle complication arises when $m=0$ or $n=0$. In that case the index set is empty, so the matrix is the empty function, and there is no way to distinguish $m \times 0$ matrices from $0 \times n$ matrices. In fact, under the Cartesian product definition, for all natural numbers $m$, $m'$, $n$, and $n'$, the $m \times 0$, $m' \times 0$, $0 \times n$, and $0 \times n'$ matrices are all the same entity, namely the empty function. This is, to me, an undesirable state of affairs. I want to be able to distinguish, for example, $2 \times 0$, $3 \times 0$, $0 \times 2$, and $0 \times 3$ matrices. Is there a better definition of a matrix, one that some mathematician has written about in a paper or book, that avoids this problem?
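
To make the collapse concrete, here is a minimal Python sketch that models a matrix as a dict keyed by (row, column) pairs. The helper name `matrix_as_map` and the default zero entries are assumptions made for illustration only.

```python
from itertools import product

def matrix_as_map(m, n, entry=lambda i, j: 0):
    """Model an m-by-n matrix as a dict keyed by (row, column) pairs."""
    rows = range(1, m + 1)
    cols = range(1, n + 1)
    return {(i, j): entry(i, j) for i, j in product(rows, cols)}

# With an empty index set the dict is empty, whatever m and n were.
two_by_zero = matrix_as_map(2, 0)
zero_by_three = matrix_as_map(0, 3)
collapsed = two_by_zero == zero_by_three == {}
```

Both calls return the same empty dict, so nothing in the result records which of $m$ or $n$ was zero, or what the other dimension was.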


Solution 1:

An $m\times n$ matrix is a representation of a linear mapping from an $n$-dimensional vector space to an $m$-dimensional vector space. In that sense, a $0\times m$ matrix is different from an $n\times 0$ matrix. While they both represent a mapping we call "the zero mapping", these zero mappings are different mappings, because their domains and codomains differ.

In other words, instead of speaking of mappings from $\{1,\dots,m\}\times\{1,\dots,n\}$ to $\mathbb F$, you can speak of linear maps from $\mathbb F^n$ to $\mathbb F^m$ (the set of which is usually denoted something like $\mathcal L(\mathbb F^n, \mathbb F^m)$), and instead of speaking of an $n\times 0$ matrix, you can speak of the element of $\mathcal L(\mathbb F^0, \mathbb F^n)$. That element (there is only one) is different from the element of $\mathcal L(\mathbb F^m, \mathbb F^0)$, which is what a $0\times m$ matrix represents.
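
As a rough illustration of this viewpoint, the following Python sketch stores a linear map together with the dimensions of its domain and codomain. The class `LinearMap` and the helper `zero_map` are hypothetical names, not taken from any library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LinearMap:
    """A linear map F^n -> F^m, stored with its shape and its m rows of n entries."""
    n: int       # dimension of the domain F^n
    m: int       # dimension of the codomain F^m
    rows: tuple  # m tuples, each of length n

def zero_map(n, m):
    """The zero map F^n -> F^m, i.e. the m-by-n zero matrix."""
    return LinearMap(n, m, tuple(tuple(0 for _ in range(n)) for _ in range(m)))

from_zero_space = zero_map(0, 3)  # the unique element of L(F^0, F^3), a 3x0 matrix
to_zero_space = zero_map(3, 0)    # the unique element of L(F^3, F^0), a 0x3 matrix
distinct = from_zero_space != to_zero_space
```

Because the dimensions are part of the data, `zero_map(2, 0)` and `zero_map(3, 0)` compare unequal even though both have an empty tuple of rows.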

Solution 2:

If $m$ and $n$ are positive integers, then an $m \times n$ matrix with entries in a non-empty set $\mathbb{F}$ (usually but not always a field) is often defined as a function of a 'row' index in the set $\{1, \ldots, m\}$ and a 'column' index in the set $\{1, \ldots, n\}.$

Such a 'function of two variables' is usually formalised in set theory as a function defined on the Cartesian product set $\{1, \ldots, m\} \times \{1, \ldots, n\}$ and taking values in the set $\mathbb{F}.$

But that is not the only way to do it, and doing it another way gives you the desired distinct concepts of $m \times 0$ and $0 \times n$ matrices, for all non-negative integers $m$ and $n.$

If $A$ and $B$ are sets, the set of all functions $A \to B$ is usually denoted by $B^A$ (although personally I can see no objection to simply writing the set as $A \to B,$ and I'm glad to see this notation used in the Wikipedia article on Currying). A function of two variables, taking its first argument from a set $A$ and its second argument from a set $B,$ and taking values in a set $C,$ is naturally represented as a function $A \to C^B.$
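
A small Python sketch of the curried form, using nested dicts as a stand-in for graphs of functions, shows how far currying alone gets. The helper name `curried_graph` and the zero entries are assumptions for illustration.

```python
def curried_graph(m, n, entry=lambda i, j: 0):
    """Graph of {1..m} -> F^{1..n} as nested dicts, with no domain or codomain data."""
    return {i: {j: entry(i, j) for j in range(1, n + 1)} for i in range(1, m + 1)}

# Two empty rows versus three empty rows: the row count survives.
rows_survive = curried_graph(2, 0) != curried_graph(3, 0)
# No rows at all: the bare graph does not record the number of columns.
columns_lost = curried_graph(0, 2) == curried_graph(0, 3) == {}
```

So the graph by itself separates $2 \times 0$ from $3 \times 0$, but not $0 \times 2$ from $0 \times 3$; to separate the latter pair, the codomain has to be part of the representation as well.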

The usual set-theoretic representation of a function $f \colon X \to Y$ is as an ordered triple $(X, Y, G)$ (this in turn may be represented as an ordered pair $((X, Y), G),$ as in @Angel's answer), where $G$ is a set of ordered pairs $(x, y),$ such that $x \in X,$ $y \in Y,$ and there is exactly one value of $y$ for each value of $x \in X.$

Such a representation ensures that both the domain $X$ and the codomain $Y$ are part of the information contained in $f.$

In particular, the set $A$ is part of the information contained in a function $f \colon A \to C^B.$ Also, for any set $B$ and any non-empty set $C,$ the set $C^B$ is non-empty, i.e., there is always at least one function $B \to C.$ Given $f \colon A \to C^B,$ we know the codomain $C^B.$ Given any element of $C^B,$ i.e., any function $B \to C$ (it doesn't matter which one), we know its domain $B.$ Thus, unless the set $C$ is empty, both $A$ and $B$ are part of the information contained in a 'function of two variables' $f \colon A \to C^B,$ regardless of whether either $A$ or $B$ is empty.

If $B$ is empty, then there is exactly one function $B \to C,$ therefore there is exactly one function $A \to C^B.$ In the ordered triple representation, it is: $$ (A, \{(\emptyset, C, \emptyset)\}, \{(x, (\emptyset, C, \emptyset)) : x \in A\}). $$

If $A$ is empty, then, more straightforwardly, there is also exactly one function $A \to C^B,$ and in the ordered triple representation, it is: $$ (\emptyset, C^B, \emptyset). $$

In particular, if an $m \times n$ matrix with values in a non-empty set $\mathbb{F}$ is represented as a function $$ \{1, \ldots, m\} \to \mathbb{F}^{\{1, \ldots, n\}}, $$ then for each $m \geqslant 0,$ there is a unique $m\times0$ matrix with entries in $\mathbb{F},$ and its standard set-theoretic representation is $$ (\{1, \ldots, m\}, \{(\emptyset, \mathbb{F}, \emptyset)\}, \{(i, (\emptyset, \mathbb{F}, \emptyset)) : 1 \leqslant i \leqslant m\}), $$ and for each $n \geqslant 0,$ there is a unique $0\times n$ matrix with entries in $\mathbb{F},$ and its standard set-theoretic representation is $$ (\emptyset,\mathbb{F}^{\{1,\ldots,n\}},\emptyset). $$
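
The ordered-triple representation can be checked mechanically. The sketch below builds the triples with Python frozensets, assuming a two-element entry set $\{0, 1\}$ so that $\mathbb{F}^{\{1,\ldots,n\}}$ can be enumerated. The helper names `all_functions` and `curried_matrix` are hypothetical.

```python
from itertools import product

def all_functions(domain, codomain):
    """Return the set codomain^domain, each function as a triple (X, Y, graph)."""
    xs = sorted(domain)
    return frozenset(
        (frozenset(domain), frozenset(codomain), frozenset(zip(xs, values)))
        for values in product(sorted(codomain), repeat=len(xs))
    )

def curried_matrix(m, n, field, entry):
    """Represent an m-by-n matrix as a triple for {1..m} -> field^{1..n}."""
    row_idx = frozenset(range(1, m + 1))
    col_idx = frozenset(range(1, n + 1))
    F = frozenset(field)

    def row(i):
        # Row i as a function {1..n} -> F, again stored as a triple.
        return (col_idx, F, frozenset((j, entry(i, j)) for j in col_idx))

    return (row_idx, all_functions(col_idx, F), frozenset((i, row(i)) for i in row_idx))

GF2 = {0, 1}  # assumed two-element entry set, small enough to enumerate
shapes = [(2, 0), (3, 0), (0, 2), (0, 3)]
matrices = [curried_matrix(m, n, GF2, lambda i, j: 0) for m, n in shapes]
all_distinct = len(set(matrices)) == len(shapes)
```

The four empty matrices come out pairwise distinct: the $m \times 0$ ones differ in their domain $\{1, \ldots, m\}$, and the $0 \times n$ ones differ in their codomain $\mathbb{F}^{\{1,\ldots,n\}}$.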