Why do we need the Hahn-Banach Theorem to extend a bounded linear functional?

Solution 1:

I think what many people have in mind is the following. Suppose $X$ is a normed vector space, $Y$ is a subspace of $X$, and $f$ is a bounded linear functional on $Y$. Then $Y$ has an algebraic supplement $Z$, that is, $X=Y+Z$. (The sum is direct algebraically, but the supplement may not be topological; hence the "$+$" sign.) We want to define $f$ on $X$ by $f(x)=f(y+z)=f(y)$, that is, to require that $f$ map $Z$ to $0$. This is certainly linear, but it may fail to be bounded in infinite-dimensional spaces: $\|y+z\|$ may be small while $f(y+z)=f(y)$ is large. This is in fact the key point in the proof of the Hahn-Banach Theorem (the extension one): in extending $f|_Y$ to $f|_{Y+\mathbb{R}\cdot z}$, the selection of $f(z)$ is crucial.
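To make this quantitative, write $M$ for the norm of $f$ on $Y$; the standard one-step computation shows that the extension satisfies $|f(y+\lambda z)|\le M\lVert y+\lambda z\rVert$ for all $y\in Y$ and $\lambda\in\mathbb{R}$ exactly when the value $c:=f(z)$ is chosen with
$$\sup_{u\in Y}\big(f(u) - M\lVert u - z\rVert\big)\;\le\; c\;\le\; \inf_{v\in Y}\big(M\lVert v + z\rVert - f(v)\big),$$
an interval which is nonempty because $f(u)+f(v)=f(u+v)\le M\lVert u+v\rVert\le M\lVert u-z\rVert+M\lVert v+z\rVert$ for all $u,v\in Y$, but which need not contain $0$.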

Solution 2:

As others have said, you cannot extend your function like that because in general you wouldn't get a linear map.

What I would like to add is that in a finite-dimensional space it is usually quite easy to extend the functional by hand, but in infinite-dimensional spaces you need Hahn-Banach to get the existence of an extension, since in general one cannot write down an explicit one.
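For instance, in $\mathbb{R}^n$ with the Euclidean norm one can extend by hand through the orthogonal projection $P_Y$ onto the subspace $Y$:
$$\tilde f := f\circ P_Y, \qquad \lvert \tilde f(x)\rvert \le \lVert f\rVert\,\lVert P_Y x\rVert \le \lVert f\rVert\,\lVert x\rVert,$$
so the extension even keeps the same norm, with no appeal to Hahn-Banach; in a general infinite-dimensional normed space there is no such explicit projection to lean on.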

Solution 3:

This question has indeed already been answered, in particular by Xiang Li, who does consider the "trivial LINEAR extension", as is also done here, but let me fill in some more details.

So, even with that linear extension, the OP's question is still relevant: why does this not automatically work? This other question gives a precise condition for the extension. Let us introduce some notation: $V$ is our vector space, $U$ a subspace, and $\varphi: U \to \mathbb{R}$ the linear map we want to extend in such a way that it remains dominated by a positively homogeneous, subadditive function $p: V \to \mathbb{R}$, i.e. $p(\lambda x) = \lambda p(x)$ for all $\lambda \geq 0,\ x\in V$, and $p(x_1 + x_2)\leq p(x_1) + p(x_2)$. In general, $V$ need not be a normed or topological vector space, and $p$ need not take nonnegative values; this domination is what replaces the OP's "bounded" (for a bounded functional on a normed space one may take $p(x) = \lVert \varphi \rVert\, \lVert x \rVert$).

It suffices to consider an extension to $\operatorname{Span}(U, z)= U \oplus \mathbb{R} z$. Saying that we want to "extend by $0$" should mean that we define $\tilde{\varphi} : U \oplus \mathbb{R} z \longrightarrow \mathbb{R},\enspace y + \lambda z \longmapsto \varphi(y)$. In the above-mentioned question, they consider a general linear extension $\tilde{\varphi}_c: U \oplus \mathbb{R} z \to \mathbb{R}$ with $\tilde{\varphi}_c(z) := c$, hence by linearity $\tilde{\varphi}_c(y + \lambda z) = \varphi(y) + \lambda c$, and they derive the conditions on the possible $c$. Let us give two easy examples where $c$ cannot be $0$.
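The condition on $c$ derived there (it is the same computation as in the one-step extension lemma behind Hahn-Banach) reads: $\tilde{\varphi}_c \le p$ on $U\oplus\mathbb{R}z$ if and only if
$$\sup_{u\in U}\big(\varphi(u) - p(u - z)\big)\;\le\; c\;\le\; \inf_{v\in U}\big(p(v + z) - \varphi(v)\big),$$
a nonempty interval (by subadditivity of $p$) which, as the examples below show, need not contain $0$.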

  1. Let me repeat that in the first version mentioned on Wikipedia (with reference Rudin 1991, Th. 3.2, but this is also the first theorem in the book "Functional Analysis, Sobolev Spaces and PDEs" (2011) by Haim Brezis), $p:V \to \mathbb{R}$ is not required to be nonnegative. Let $V:= \mathbb{R}^2$, $U:= \mathbb{R}\times \lbrace 0 \rbrace$ and $\varphi: U \to \mathbb{R},\ (x,0)\mapsto x c_1$ (an equivalent way to write this would be $(\lambda, 0) \mapsto \lambda \varphi(1,0)$). Let $p:V \to \mathbb{R}$ be the linear map $(x,y)\mapsto x c_1 + y c_2$, where $c_1, c_2 \in \mathbb{R}$ and $c_1$ is precisely the constant $\varphi(1,0)$. The advantage of being in dimension $2$ is that one can picture the graph of $p$, which is just a plane with slope $c_1$ in the $x$-direction and $c_2$ in the $y$-direction. Extending $\varphi$ by $0$ in the $y$-direction yields a function $\tilde{\varphi}$ whose graph is also a plane, but with slope $0$ in the $y$-direction. As soon as $c_2\neq 0$, the domination $\tilde{\varphi}(x,y) \leq p(x,y)$ for all $(x,y)\in \mathbb{R}^2$ cannot hold: indeed, for fixed $x$, $p(x,y) - \tilde{\varphi}(x,y) = y c_2$ becomes negative as soon as $y c_2 < 0$ (here the only dominated linear extension is $p$ itself, i.e. $c = c_2$).
  2. Let us now consider another example in order to make explicit what Xiang Li says, namely that "$\lVert y+z \rVert$ may be small, but $\tilde{\varphi}(y+z):= \varphi(y)$ may be large". Let $(V,\lVert \cdot \rVert)$ be this time a normed vector space and let $p$ be the nonnegative subadditive function $p(\mathbf{x}):= \lvert c_1\rvert \lVert\mathbf{x}\rVert$. Concretely, let $V:=\mathbb{R}^2$ and $\lVert (x,y) \rVert := \sqrt{x^2 + y^2}$ the Euclidean norm, $U:= \mathbb{R}\times \lbrace 0 \rbrace$ and $\varphi: U \to \mathbb{R},\ (x,0)\mapsto x c_1$. Extending $\varphi$ by $0$ along the $y$-axis works, but note that we are using the extra Euclidean structure to select this axis as the orthogonal complement of $U$. If one had picked an arbitrary vector instead, e.g. $\mathbf{z}:=(-10,1) \in V\backslash U =\mathbb{R}^2 \backslash \big(\mathbb{R}\times \lbrace 0\rbrace\big)$, and extended $\varphi$ by $0$ in this direction, i.e. $\tilde{\varphi}: \operatorname{Span}(U,\mathbf{z}) \to \mathbb{R},\ (x,0) + \lambda \mathbf{z} \mapsto x c_1 + \lambda \times 0 = x c_1$, then $$\tilde{\varphi}(0,1) = \tilde{\varphi}\big( (10,0) + \mathbf{z}\big) = \varphi(10,0)= 10 c_1.$$ One does have (2nd version of Hahn-Banach) $\lvert \varphi(x,0)\rvert \leq p(x,0)$ on the subspace $U$, but not outside, since $\lvert \tilde{\varphi}(0,1)\rvert = 10 \lvert c_1 \rvert$ while $p(0,1)= \lvert c_1 \rvert$ (see the short check after this list for which constants $c$ actually work in this direction).
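To connect Example 2 with the condition on $c$ recalled above: in the direction $\mathbf{z}=(-10,1)$, the general linear extension is $\tilde{\varphi}_c\big((x,0)+\lambda\mathbf{z}\big) = x c_1 + \lambda c$, i.e. $\tilde{\varphi}_c(a,b) = a c_1 + b\,(10 c_1 + c)$ since $(a,b) = (a+10b,\,0) + b\,\mathbf{z}$. A linear form $(a,b)\mapsto \alpha a + \beta b$ is dominated by $p=\lvert c_1\rvert\,\lVert\cdot\rVert$ exactly when $\sqrt{\alpha^2+\beta^2}\le\lvert c_1\rvert$, so
$$\tilde{\varphi}_c \le p \iff \sqrt{c_1^{\,2} + (10 c_1 + c)^2}\;\le\; \lvert c_1\rvert \iff c = -10\, c_1.$$
For $c_1\neq 0$ the only admissible constant is thus $c=-10c_1\neq 0$: the dominated extension is $(a,b)\mapsto a c_1$, which is precisely extension by $0$ along the $y$-axis, i.e. on the supplementary singled out by the Euclidean structure.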

Geometric interpretation: A nonzero linear form always vanishes on a hyperplane (rank-nullity theorem), namely on its kernel, so one does in fact extend the linear map $\varphi: U \to \mathbb{R}$ by $0$, but on A SPECIFIC supplementary subspace. If one chooses an arbitrary supplementary to $U$, the dominated extension is in general not the extension by $0$ on it.

There is indeed a close correspondence between extensions $\tilde{\varphi}: V \to \mathbb{R}$ and supplementaries $W$ on which one extends by $0$: choosing such a $W$ always produces a linear extension, and every linear extension of a nonzero $\varphi$ arises this way (take for $W$ any supplementary of $\operatorname{Ker}(\varphi)$ inside $\operatorname{Ker}(\tilde{\varphi})$); recall also that a linear form is determined up to a factor by its kernel. One can picture this by saying that the whole space is "foliated" and the "leaves" are the level sets of the linear form. A linear map $\psi: V \to\mathbb{R}$ is completely determined by $\operatorname{Ker}(\psi)$ and its value $\psi(\mathbf{x}_0)$ at some point $\mathbf{x}_0$ where $\psi(\mathbf{x}_0)\neq 0$. (From another viewpoint, this is the extension of the restriction $\psi|_{\mathbb{R}\mathbf{x}_0}$ to $\psi: V \to\mathbb{R}$.) And when one wants to extend a linear form $\varphi: U \to \mathbb{R}$ by the choice of a $W$ such that $V=U\oplus W$, this amounts to defining the extension $\tilde{\varphi}: V \to \mathbb{R}$ by setting $\tilde{\varphi}(\mathbf{x}_0) := \varphi(\mathbf{x}_0)$ on a chosen vector $\mathbf{x}_0 \in U$ on which $\varphi$ does not vanish and $\operatorname{Ker}(\tilde{\varphi}):= \operatorname{Ker}(\varphi) \oplus W$ (note that $\operatorname{Ker}(\varphi)\subseteq U$ already).
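To spell out the statement that a linear form is determined by its kernel and one value: if $\psi(\mathbf{x}_0)\neq 0$, then $V = \operatorname{Ker}(\psi)\oplus\mathbb{R}\mathbf{x}_0$ (indeed $\mathbf{x} - \frac{\psi(\mathbf{x})}{\psi(\mathbf{x}_0)}\mathbf{x}_0 \in \operatorname{Ker}(\psi)$), and writing $\mathbf{x} = \mathbf{k} + t\,\mathbf{x}_0$ with $\mathbf{k}\in\operatorname{Ker}(\psi)$ gives
$$\psi(\mathbf{x}) = \psi(\mathbf{k}) + t\,\psi(\mathbf{x}_0) = t\,\psi(\mathbf{x}_0),$$
so the kernel and the single value $\psi(\mathbf{x}_0)$ suffice to recover $\psi$. In the construction above, $\operatorname{Ker}(\tilde{\varphi}) = \operatorname{Ker}(\varphi)\oplus W$ together with $\tilde{\varphi}(\mathbf{x}_0)=\varphi(\mathbf{x}_0)$ therefore determines $\tilde{\varphi}$ completely.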

Interpretation of the constraint $\lvert \tilde{\varphi}(\mathbf{x}) \rvert \leq p(\mathbf{x})$ (in the "alternative version" of Hahn-Banach, where $p$ takes nonnegative values): on $U$ we have $\lvert \varphi(\mathbf{u}) \rvert \leq p(\mathbf{u})$ for all $\mathbf{u}\in U$. The admissible extensions are those such that for every $\mathbf{u}\in U$, the leaf $\tilde{\varphi}^{-1}\big( \varphi(\mathbf{u})\big)$ is contained in $p^{-1}\big(\big[\lvert \varphi(\mathbf{u}) \rvert , +\infty \big[\big)$ (indeed $\lvert \tilde{\varphi} \rvert$ takes the constant value $\lvert \varphi(\mathbf{u}) \rvert \leq p(\mathbf{u})$ on that leaf). In other words, $\tilde{\varphi}^{-1}\big( \varphi(\mathbf{u})\big) = \mathbf{u} + \operatorname{Ker}(\tilde{\varphi})$ must not intersect the convex subset $p^{-1}\big(\big[0,\lvert \varphi(\mathbf{u}) \rvert \big[\big)$. (If $p$ were a norm, this would be a ball. It is convex because of the subadditivity and homogeneity of $p$: indeed $p\big(t\mathbf{x} + (1-t)\mathbf{y}\big) \leq p(t\mathbf{x}) + p\big((1-t)\mathbf{y}\big) = t\, p(\mathbf{x}) + (1-t)\, p(\mathbf{y}) \leq (t + 1 - t)\ \max\big(p(\mathbf{x}),p(\mathbf{y})\big)$ for all $t \in [0,1]$. So if $\mathbf{x}, \mathbf{y}$ are in the "ball" of radius $R$, so is their convex combination.)


Let me also detail the relation between Hahn-Banach and the statement that "A finite-dimensional subspace $U\subseteq V$ admits a supplementary" (or here or here). Since $U$ is finite-dimensional, one can (without the axiom of choice) choose a basis $(\mathbf{e}_1, \cdots, \mathbf{e}_n)$. One cannot yet speak of the dual basis of $V^*$, but one can define the dual basis of $U^*$, i.e. the linear forms $l_i: U \to \mathbb{R},\ \mathbf{e}_j \mapsto \delta_{ij}$, which can be extended to $V$ by Hahn-Banach (in a normed space each $l_i$ is automatically bounded on the finite-dimensional $U$). This yields the supplementary $$ W:=\bigcap_{i=1}^n \operatorname{Ker}(\tilde{l}_i)\quad ,\quad V = U \oplus W$$ Indeed $W$ is also the kernel of the following map, whose components are the $n$ extended linear forms

$$ L:\left\lbrace \begin{aligned} V &\longrightarrow \mathbb{R}^n,\\ \mathbf{x} & \longmapsto \big(\tilde{l}_1(\mathbf{x}), \cdots, \tilde{l}_n(\mathbf{x})\big) \end{aligned} \right. $$

For any $\mathbf{x}\in V$ define the "projection on $W$ w.r.t. $U$" by (note that there is no Euclidean structure involved) $$P:\left\lbrace \begin{aligned} V &\longrightarrow V\\ \mathbf{x} &\longmapsto \mathbf{x} - \sum_{i=1}^n \tilde{l}_i(\mathbf{x}) \mathbf{e}_i \end{aligned} \right.$$ One immediately checks that $L\circ P = 0$ and that $P\circ P = P,\ \operatorname{Id}_V = (\operatorname{Id}_V - P) + P$, i.e. that any element $\mathbf{x}\in V$ decomposes as $\mathbf{x} = \big(\mathbf{x} - P(\mathbf{x})\big) + P(\mathbf{x})$ with the first term in $U$ and the second in $W$; moreover $U\cap W = \lbrace 0\rbrace$, since $\mathbf{u} = \sum_{i=1}^n \tilde{l}_i(\mathbf{u})\,\mathbf{e}_i$ for every $\mathbf{u}\in U$.
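Indeed, since $\tilde{l}_j(\mathbf{e}_i) = l_j(\mathbf{e}_i) = \delta_{ij}$, one has for every $\mathbf{x}\in V$ and every $j$
$$\tilde{l}_j\big(P(\mathbf{x})\big) = \tilde{l}_j(\mathbf{x}) - \sum_{i=1}^n \tilde{l}_i(\mathbf{x})\,\tilde{l}_j(\mathbf{e}_i) = \tilde{l}_j(\mathbf{x}) - \tilde{l}_j(\mathbf{x}) = 0,$$
which is exactly $L\circ P = 0$, i.e. $P(\mathbf{x})\in W$; applying this with $P(\mathbf{x})$ in place of $\mathbf{x}$ also gives $P\circ P = P$.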

One can also bring in topological considerations, e.g. Proposition 1.5 p. 5 of the above-mentioned book by Haim Brezis (normed vector space case), namely that a linear form is continuous iff its kernel is closed. $W$ is then closed as an intersection of closed subspaces.