Hahn-Banach From Systems of Linear Equations
In this paper [1] on the history of functional analysis, the author mentions the following example of an infinite system of linear equations in an infinite number of variables $c_i = A_{ij} x_j$:
\begin{align*} \begin{array}{ccccccccc} 1 & = & x_1 & + & x_2 & + & x_3 & + & \dots \\ 1 & = & & & x_2 & + & x_3 & + & \dots \\ 1 & = & & & & & x_3 & + & \dots \\ & \vdots & & & & & & & \ddots \end{array} \to \begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & \dots \\ & 1 & 1 & \dots \\ & & 1 & \dots \\ & & & \ddots \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \end{bmatrix} \end{align*}
as an example of a system such that any finite truncation of the system down to an $n \times n$ system has a unique solution $x_1 = \dots = x_{n-1} = 0, x_n = 1$ but for which the full system has no solution.
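As a quick sanity check of the truncation claim, here is a minimal Python sketch that solves the $n \times n$ truncated system by back substitution. The helper name `solve_truncated` is hypothetical and introduced only for illustration; it assumes the truncation keeps the first $n$ equations and the first $n$ unknowns.

```python
def solve_truncated(n):
    """Solve the n x n truncation: x_i + x_{i+1} + ... + x_n = 1 for i = 1..n.

    The matrix is upper triangular with ones, so back substitution
    from the last equation upward gives the unique solution.
    """
    x = [0] * n
    tail = 0  # running sum x_{i+1} + ... + x_n
    for i in range(n - 1, -1, -1):
        x[i] = 1 - tail   # equation i: x_i + tail = 1
        tail += x[i]
    return x


for n in (1, 2, 4):
    print(n, solve_truncated(n))
# 1 [1]
# 2 [0, 1]
# 4 [0, 0, 0, 1]
```

The single nonzero entry sits in the last position for every $n$, so it moves off to infinity as $n$ grows, and the coordinate-wise limit of these solutions is the zero sequence, which does not satisfy the full system.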
This book [2] has the following passage on systems such as this one:
The Hahn-Banach theorem arose from attempts to solve infinite systems of linear equations... The key to the solvability is determining "compatibility" of the system of equations. For example, the system $x + y = 2$ and $x + y = 4$ cannot be solved because it requires contradictory things and so are "incompatible". The first attempts to determine compatibility for infinite systems of linear equations extended known determinant and row-reduction techniques. It was a classical analysis - almost solve the problem in a finite situation, then take a limit. A fatal defect of these approaches was the need for the (very rare) convergence of infinite products.
and then mentions a theorem about these systems that motivates Hahn-Banach:
Theorem 7.10.1 shows that to solve a certain system of linear equations, it is necessary and sufficient that a continuity-type condition be satisfied.
Theorem 7.10.1 (The Functional Problem): Let $X$ be a normed space over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, let $\{x_s \ : \ s \in S \}$ and $\{ c_s \ : \ s \in S \}$ be sets of vectors and scalars, respectively. Then there is a continuous linear functional $f$ on $X$ such that $f(x_s) = c_s$ for each $s \in S$ iff there exists $K > 0$ such that \begin{equation} \left|\sum_{s \in S} a_s c_s \right| \leq K \left\| \sum_{s \in S} a_s x_s \right\| \tag{1}, \end{equation} for any choice of scalars $\{a_s \ : \ s \in S \}$ for which $a_s = 0$ for all but finitely many $s \in S$ ("almost all" the $a_s = 0$).
Banach used the Hahn-Banach theorem to prove Theorem 7.10.1 but Theorem 7.10.1 implies the Hahn-Banach theorem: Assuming that Theorem 7.10.1 holds, let $\{ x_s \}$ be the vectors of a subspace $M$, let $f$ be a continuous linear functional on $M$; for each $s \in S$, let $c_s = f(x_s)$. Since $f$ is continuous, $(1)$ is satisfied and $f$ possesses a continuous extension to $X$.
My question is:
If you knew none of the theorems just mentioned, how would you begin from the system $c_i = A_{ij} x_j$ at the beginning of this post and think of setting up the conditions of Theorem 7.10.1 as a way to test whether this system has a solution?
How does this test show the system has no solution?
How do we re-formulate this process as though we were applying the Hahn-Banach theorem?
Does anybody know of a reference for the classical analysis of systems in terms of infinite products?
[1] Neal L. Carothers: A Brief History of Functional Analysis.
[2] Lawrence Narici, Edward Beckenstein: Topological Vector Spaces, 2nd Edition.
I will try to provide some rather incomplete answers to your questions.
- For simplicity, let us first consider the finite case. Suppose we want to solve an $n\times n$ linear system $Ax = b$, with all quantities real. That is, we are solving: $$ \begin{matrix} a_{11}x_{1} +\cdots + a_{1n}x_{n} = b_1 \\ \vdots \\ a_{n1}x_{1} +\cdots + a_{nn}x_{n} = b_n \end{matrix} $$ Recall that $\mathbb{R}^n$ is self-dual: every linear functional on $\mathbb{R}^n$ (all of which are continuous) is given by $f_a(x) = \langle a , x\rangle = \sum_{k =1}^n a_kx_k$ for some $a \in \mathbb{R}^n$. Thus, in the case we consider here, solving the linear system amounts to finding $x \in \mathbb{R}^n$ such that: $$ \begin{matrix} f_{a_1}(x) = b_1 \\ \vdots \\ f_{a_n}(x) = b_n \end{matrix} $$ where $a_i = (a_{i1},\cdots, a_{in})$. Because $\mathbb{R}^n$ is reflexive, we may equally regard $x$ as a linear functional on $\left(\mathbb{R}^n\right)^*$, that is, as an element of $\left(\mathbb{R}^n\right)^{**}$, and ask for a functional that takes the prescribed values $b_1, \cdots, b_n$ at the given vectors $f_{a_1}, \cdots , f_{a_n}$. This is exactly the shape of the problem in Theorem 7.10.1.
With this framework in mind, it is useful to discuss when inconsistency occurs in a linear system. Let $X$ be a normed space over $\mathbb{R}$, let $\{x_s: s\in S\}\subset X$, and suppose we look for $f \in X^*$ with $f(x_s) = c_s$ for all $s$. Let $M = \mathrm{span}(x_s : s \in S)$. If the $x_s$ are linearly independent there are no issues. For any finite collection $x_{s_1}, \cdots, x_{s_n}$, we can define $f(x_{s_k}) = c_{s_k}$, and then extend this to $M$ by linearity. Everything is well defined on all of $M$, thanks to the uniqueness of representation given by linear independence.
However, we may have issues when the $x_s$ are not linearly independent. Specifically, let us say we have $x\in M$ with two representations: $$ x = \sum_{s \in J} \alpha_s x_s \\ x = \sum_{t \in I} \beta_t x_t $$ where $I,J \subset S$, finite.
(Both sums can be indexed by the common finite set $T = I \cup J$, setting $\alpha_k = 0$ for $k \notin J$ and $\beta_k = 0$ for $k \notin I$.) For $f$ to be well defined we need: $$ f(x) = \sum_{k \in T}\alpha_kc_k = \sum_{k \in T}\beta_kc_k $$ Thus we must have: $$ \left| \sum_{k \in T}(\alpha_k - \beta_k)c_k\right| = 0 $$ We have no guarantee that this is true, as the values of $c_k$ are chosen a priori. Thus, we need to constrain the $c_k$ in some fashion in order to ensure well-definedness. The condition that gives us this is: $$ \left|\sum_{k \in T}\gamma_kc_k \right | \leq K \left\|\sum_{k \in T} \gamma_k x_k\right\| $$ for every finite $T \subset S$ and every choice of scalars $\gamma_k$, with a single constant $K > 0$. Taking $\gamma_k = \alpha_k - \beta_k$, the right-hand side is $K\|x - x\| = 0$, which forces the left-hand side to vanish. In finite dimensions, such a $K$ exists as soon as the assignment is consistent, since every linear functional on a finite-dimensional space is bounded. In infinite dimensions consistency alone is not enough: $f$ can be well defined on $M$ and still be unbounded, and the inequality rules this out as well. Thus, in summary, the above condition is what gives us well-definedness/consistency.
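To make the consistency requirement concrete in the finite case, here is a minimal Python sketch applied to the incompatible pair $x + y = 2$, $x + y = 4$ from the quoted passage. The vectors $x_s$ are taken to be the coefficient rows $(1,1)$ and $(1,1)$, and the scalars $c_s$ are the right-hand sides. All function names are hypothetical and chosen for illustration only; the check looks at one given choice of scalars $\gamma$ rather than searching over all of them.

```python
def combine_vectors(gamma, vectors):
    """Return sum_k gamma_k * x_k, computed coordinate by coordinate."""
    dim = len(vectors[0])
    return [sum(g * v[i] for g, v in zip(gamma, vectors)) for i in range(dim)]


def combine_scalars(gamma, scalars):
    """Return sum_k gamma_k * c_k."""
    return sum(g * c for g, c in zip(gamma, scalars))


def violates_condition(gamma, vectors, scalars):
    """True if sum gamma_k x_k = 0 while sum gamma_k c_k != 0.

    In that case |sum gamma_k c_k| <= K * ||sum gamma_k x_k|| fails
    for every K, because the right-hand side is zero.
    """
    vector_is_zero = all(entry == 0 for entry in combine_vectors(gamma, vectors))
    return vector_is_zero and combine_scalars(gamma, scalars) != 0


rows = [(1, 1), (1, 1)]
gamma = [1, -1]
print(violates_condition(gamma, rows, [2, 4]))  # True: incompatible system
print(violates_condition(gamma, rows, [2, 2]))  # False: this gamma finds no conflict
```

With $\gamma = (1, -1)$ the combination of the rows is the zero vector while the combination of the right-hand sides is $-2$, so no constant $K$ can satisfy the inequality.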
The above result also implies the Hahn-Banach theorem, as the author says.