In this paper [1] on the history of functional analysis, the author mentions the following example of an infinite system of linear equations in an infinite number of variables $c_i = A_{ij} x_j$:

\begin{align*} \begin{array}{ccccccccc} 1 & = & x_1 & + & x_2 & + & x_3 & + & \dots \\ 1 & = & & & x_2 & + & x_3 & + & \dots \\ 1 & = & & & & & x_3 & + & \dots \\ & \vdots & & & & & & & \ddots \end{array} \to \begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & \dots \\ & 1 & 1 & \dots \\ & & 1 & \dots \\ & & & \ddots \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \end{bmatrix} \end{align*}

as an example of a system such that every finite truncation to an $n \times n$ system has the unique solution $x_1 = \dots = x_{n-1} = 0$, $x_n = 1$, but the full system has no solution.
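(As a quick sanity check, the unique solvability of each finite truncation is easy to verify numerically; here is a minimal sketch using NumPy, with the truncation size $n = 5$ chosen arbitrarily:)

```python
import numpy as np

# Finite n-by-n truncation of the system: upper-triangular matrix of ones.
n = 5
A = np.triu(np.ones((n, n)))
c = np.ones(n)

# Back-substitution gives x_n = 1 and then x_{n-1} = ... = x_1 = 0.
x = np.linalg.solve(A, c)
print(x)  # [0. 0. 0. 0. 1.]
```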

This book [2] has the following passage on systems such as this one:

The Hahn-Banach theorem arose from attempts to solve infinite systems of linear equations... The key to the solvability is determining "compatibility" of the system of equations. For example, the system $x + y = 2$ and $x + y = 4$ cannot be solved because it requires contradictory things and so is "incompatible". The first attempts to determine compatibility for infinite systems of linear equations extended known determinant and row-reduction techniques. It was classical analysis: almost solve the problem in a finite situation, then take a limit. A fatal defect of these approaches was the need for the (very rare) convergence of infinite products.

and then mentions a theorem about these systems that motivates Hahn-Banach:

Theorem 7.10.1 shows that to solve a certain system of linear equations, it is necessary and sufficient that a continuity-type condition be satisfied.

Theorem 7.10.1 (The Functional Problem): Let $X$ be a normed space over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, and let $\{x_s \ : \ s \in S \}$ and $\{ c_s \ : \ s \in S \}$ be sets of vectors and scalars, respectively. Then there is a continuous linear functional $f$ on $X$ such that $f(x_s) = c_s$ for each $s \in S$ iff there exists $K > 0$ such that \begin{equation} \left|\sum_{s \in S} a_s c_s \right| \leq K \left\| \sum_{s \in S} a_s x_s \right\| \tag{1} \end{equation} for any choice of scalars $\{a_s \ : \ s \in S \}$ for which $a_s = 0$ for all but finitely many $s \in S$ ("almost all" the $a_s = 0$).
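(To see condition $(1)$ detect incompatibility in the finite system quoted above, $x + y = 2$ and $x + y = 4$, one can check numerically that no constant $K$ can work. This is a small sketch; taking the vectors $x_s$ to be the coefficient rows in $\mathbb{R}^2$ is my assumption about how the finite case fits the theorem:)

```python
import numpy as np

# Incompatible system: x + y = 2 and x + y = 4.
# Vectors x_s = coefficient rows, scalars c_s = right-hand sides.
x1 = np.array([1.0, 1.0]); c1 = 2.0
x2 = np.array([1.0, 1.0]); c2 = 4.0

# Choose scalars a1 = 1, a2 = -1 in condition (1).
a1, a2 = 1.0, -1.0
lhs = abs(a1 * c1 + a2 * c2)              # |2 - 4| = 2
rhs = np.linalg.norm(a1 * x1 + a2 * x2)   # ||(0, 0)|| = 0
print(lhs, rhs)  # 2.0 0.0 -- no K > 0 can satisfy 2 <= K * 0
```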

Banach used the Hahn-Banach theorem to prove Theorem 7.10.1, but Theorem 7.10.1 in turn implies the Hahn-Banach theorem: assuming Theorem 7.10.1 holds, let $\{ x_s \}$ be the vectors of a subspace $M$ and let $f$ be a continuous linear functional on $M$; for each $s \in S$, let $c_s = f(x_s)$. Since $f$ is continuous, $(1)$ is satisfied, and $f$ possesses a continuous extension to $X$.

My question is:

  1. If one knew none of the theorems just mentioned, how would one begin from the system $c_i = A_{ij} x_j$ at the beginning of this post and think of setting up the conditions of Theorem 7.10.1 as a way to test whether the system has a solution?

  2. How does this test show the system has no solution?

  3. How do we re-formulate this process as though we were applying the Hahn-Banach theorem?

  4. Does anybody know of a reference for the classical analysis of systems in terms of infinite products?


[1] Neal L. Carothers: A Brief History of Functional Analysis.
[2] Lawrence Narici, Edward Beckenstein: Topological Vector Spaces, 2nd Edition.


I will try to provide some rather incomplete answers to your questions.

  1. Let us, for simplicity, consider the finite case. Say we want to solve an $n\times n$ linear system of equations $Ax = b$, with all quantities in question real. That is, we are solving: $$ \begin{matrix} a_{11}x_{1} +\cdots + a_{1n}x_{n} = b_1 \\ \vdots \\ a_{n1}x_{1} +\cdots + a_{nn}x_{n} = b_n \end{matrix} $$ Recall that $\mathbb{R}^n$ is self-dual, with every linear functional on $\mathbb{R}^n$ (all of which are continuous) given by $f_a(x) = \langle a , x\rangle = \sum_{k =1}^n a_k x_k$. Thus, in the case we consider here, solving the linear system amounts to finding a solution to: $$ f_{a_1}(x) = b_1 \\ \vdots \\ f_{a_n}(x) = b_n $$ where $a_i = (a_{i1}, \dots, a_{in})$. So solving a simultaneous system of linear equations is equivalent to finding an $x \in \mathbb{R}^n$ solving the above vector problem. Because $\mathbb{R}^n$ is reflexive, $x$ may itself be regarded as a continuous linear functional on $\left(\mathbb{R}^n\right)^*$, and the problem becomes: given the functionals $f_{a_1}, \dots , f_{a_n}$, find an element of the second dual taking the prescribed value $b_i$ on each $f_{a_i}$. This is exactly the setting of Theorem 7.10.1.

     With this framework in mind, it is useful to discuss when inconsistency occurs in a linear system. Let $X$ be a vector space over $\mathbb{R}$, let $f \in X^*$, let $\{x_s : s\in S\}\subset X$, and set $c_s = f(x_s)$ for all $s$. Let $M = \operatorname{span}(x_s : s \in S)$. If the $x_s$ are linearly independent, there are no issues: for any finite collection $x_{s_1}, \dots, x_{s_n}$ we can define $f(x_{s_k}) = c_{s_k}$ and then extend to $M$ by linearity, and everything is well defined on all of $M$ thanks to the uniqueness of representation guaranteed by linear independence. However, we may have issues when the $x_s$ are not linearly independent. Specifically, suppose we have $x\in M$ with two representations: $$ x = \sum_{s \in J} \alpha_s x_s \\ x = \sum_{t \in I} \beta_t x_t $$ where $I, J \subset S$ are finite. (Both sums can be indexed by the common finite set $F = I \cup J$ by setting $\alpha_s = 0$ for $s \notin J$ and $\beta_t = 0$ for $t \notin I$.) We then have: $$ f(x) = \sum_{k \in F}\alpha_k c_k = \sum_{k \in F}\beta_k c_k = f(x), $$ so we must have: $$ \left| \sum_{k \in F}(\alpha_k - \beta_k)c_k\right| = 0. $$ Now, we have no guarantee that this holds, as the values of $c_k$ are chosen a priori. Thus, we need to constrain the prescribed values in some fashion in order to ensure well-definedness. The condition that gives us this is precisely the specification: $$ \left|\sum_{k \in F}\gamma_k c_k \right| \leq K \left\|\sum_{k \in F} \gamma_k x_k\right\| $$ for any finite set of scalars $\gamma_k$; in the situation above, take $\gamma_k = \alpha_k - \beta_k$. In finite dimensions, such a $K$ always exists, as we can express all the vectors in terms of a common basis. In the general case, this cannot be done. Thus, in summary, the condition that guarantees well-definedness/consistency is indeed condition $(1)$.
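The well-definedness argument above can be illustrated with a small (hypothetical) finite example: take $x_3 = x_1 + x_2$, so the $x_s$ are linearly dependent and the zero vector has the two-representation coefficient vector $\gamma = (1, 1, -1)$. Then an inconsistent prescription of the $c_s$ visibly violates the condition, while a consistent one does not:

```python
import numpy as np

# Dependent vectors: x3 = x1 + x2, so gamma = (1, 1, -1) represents 0.
x1, x2, x3 = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])
gamma = np.array([1.0, 1.0, -1.0])
vec = gamma[0] * x1 + gamma[1] * x2 + gamma[2] * x3   # the zero vector

# Inconsistent prescription: c3 != c1 + c2, so |sum gamma_k c_k| > 0
# while ||sum gamma_k x_k|| = 0 -- no constant can satisfy the condition.
c_bad = np.array([1.0, 1.0, 5.0])
print(abs(gamma @ c_bad), np.linalg.norm(vec))   # 3.0 0.0

# Consistent prescription: c3 = c1 + c2, and the obstruction vanishes.
c_good = np.array([1.0, 1.0, 2.0])
print(abs(gamma @ c_good))  # 0.0
```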

The above result also implies the Hahn-Banach theorem, as the author says.