How is the epsilon-delta definition of continuity equivalent to the following statement?

Claim: A function $f: \mathbb{X} \to \mathbb{Y}$ is continuous if given any open set $\mathbb{U} \subseteq \mathbb{Y}$ the inverse image $f^{-1} (\mathbb{U}) \subseteq \mathbb{X}$ is open.

How is this definition intuitively compatible with the $\epsilon-\delta$ definition of continuity?

Furthermore, on a practical level, how can the above claim be used to prove that a function is continuous? For example, is it useful for proving that $f(x) = e^x$ is continuous?


Intuitively, if $U$ is open but $f^{-1}(U)$ is not, then $f^{-1}(U)$ contains a point $x_0$ such that every neighborhood of $x_0$, however small, contains points outside of $f^{-1}(U)$. In other words, one can choose a point $x$ arbitrarily close to $x_0$ such that $f(x)\notin U$, even though $f(x_0)\in U$. For such a point $x$ very, very close to $x_0$, the value of $f(x)$ abruptly “jumps” outside the open set $U$, which violates our intuitive concept of continuity: if $f$ were continuous, then one would expect that for a point $x$ very close to $x_0$, $f(x)$ should be very close to $f(x_0)$.
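
As a concrete illustration of this "jump" (the step function $H$ below is my own example, not part of the question), take $H:\mathbb R\to\mathbb R$ and the open interval $U=(\tfrac12,\tfrac32)$:

$$H(x)=\begin{cases}0 & \text{if } x<0,\\ 1 & \text{if } x\geq 0,\end{cases}\qquad H^{-1}(U)=\{x\in\mathbb R\,|\,H(x)=1\}=[0,\infty).$$

The set $[0,\infty)$ is not open: every neighborhood of $x_0=0$ contains some $x<0$, and for such $x$ one has $H(x)=0\notin U$ even though $H(x_0)=1\in U$.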


Formally, I assume we work in metric spaces $(\mathbb X, d_{\mathbb X})$ and $(\mathbb Y,d_{\mathbb Y})$.

The inverse-image definition implies the $\varepsilon$-$\delta$ definition.

Suppose that the inverse-image criterion is satisfied. Let $x_0\in\mathbb X$ and $\varepsilon>0$. Then, the ball $$B_{\mathbb Y}(\varepsilon, f(x_0))\equiv\{y\in\mathbb Y\,|\,d_{\mathbb Y}(y,f(x_0))<\varepsilon\}$$ of radius $\varepsilon$ about $f(x_0)$ is open in $\mathbb Y$, hence $f^{-1}(B_{\mathbb Y}(\varepsilon,f(x_0)))$ is open in $\mathbb X$. Since $x_0\in f^{-1}(B_{\mathbb Y}(\varepsilon,f(x_0)))$, there exists some ball of radius $\delta>0$ about $x_0$ such that $$B_{\mathbb X}(\delta,x_0)\subseteq f^{-1}(B_{\mathbb Y}(\varepsilon,f(x_0)))$$ This is exactly the $\varepsilon$-$\delta$ criterion: if $x\in \mathbb X$ is such that $d_{\mathbb X}(x,x_0)<\delta$, then $d_{\mathbb Y}(f(x),f(x_0))<\varepsilon$.
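
To connect this with the question about $f(x)=e^x$, here is a minimal numerical sketch of the same construction. It takes for granted that `math.log` inverts `math.exp` and that both are increasing, so it illustrates how $\delta$ is read off from the preimage of the $\varepsilon$-ball rather than proving anything. The function name `delta_for_exp` is made up for this example.

```python
import math


def delta_for_exp(x0: float, eps: float) -> float:
    """Return a delta > 0 such that |x - x0| < delta implies |e^x - e^x0| < eps.

    The preimage of the open ball (e^x0 - eps, e^x0 + eps) under exp is the
    open interval (lo, hi) computed below. delta is the distance from x0 to
    the nearer endpoint, so the ball B(delta, x0) fits inside the preimage.
    """
    y0 = math.exp(x0)
    hi = math.log(y0 + eps)
    # exp only takes positive values, so the preimage is unbounded below
    # whenever y0 - eps <= 0.
    lo = math.log(y0 - eps) if y0 - eps > 0 else -math.inf
    return min(x0 - lo, hi - x0)


x0, eps = 1.0, 0.1
delta = delta_for_exp(x0, eps)

# Spot-check the epsilon-delta condition on sample points inside the delta-ball.
for k in range(-99, 100):
    x = x0 + delta * k / 100
    assert abs(math.exp(x) - math.exp(x0)) < eps
```

The point of the sketch is that openness of the preimage is exactly what guarantees a positive distance from $x_0$ to the boundary of the preimage, and that distance serves as $\delta$.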

The $\varepsilon$-$\delta$ definition implies the inverse-image definition.

Suppose that the $\varepsilon$-$\delta$ criterion holds and let $U\subseteq\mathbb Y$ be open. By the definition of openness in metric spaces, there exists for each $y\in U$ some $\varepsilon_y>0$ such that $$B_{\mathbb Y}(\varepsilon_y,y)\subseteq U.$$ In fact, it is not difficult to check that $$U=\bigcup_{y\in U}B_{\mathbb Y}(\varepsilon_y,y).\tag{$\clubsuit$}$$

I now claim that $f^{-1}(U)$ is open in $\mathbb X$. Suppose that $x_0\in f^{-1}(U)$. Then $f(x_0)\in U$, so $f(x_0)\in B_{\mathbb Y}(\varepsilon_{y_0},y_0)$ for some $y_0\in U$ by ($\clubsuit$); that is, $d_{\mathbb Y}(f(x_0),y_0)<\varepsilon_{y_0}$. [In fact, as @Dominik pointed out in a comment below, one can take $y_0\equiv f(x_0)$. This observation makes the derivation that follows a lot simpler.] Define $$\xi\equiv\varepsilon_{y_0}-d_{\mathbb Y}(f(x_0),y_0)>0.\tag{$\star$}$$ By the $\varepsilon$-$\delta$ definition of continuity, there exists some $\delta>0$ such that $$\text{if }x\in\mathbb X\text{ and }d_{\mathbb X}(x,x_0)<\delta\text{, then }d_{\mathbb Y}(f(x),f(x_0))<\xi.\tag{$\diamondsuit$}$$

I now claim that $$B_{\mathbb X}(\delta,x_0)\subseteq f^{-1}(U),\tag{$\spadesuit$}$$ which will show that $f^{-1}(U)$ is open (since its generic element $x_0$ has a ball around it still in $f^{-1}(U)$), as desired. To this end, let $x\in B_{\mathbb X}(\delta,x_0)$; that is, $d_{\mathbb X}(x,x_0)<\delta$. Then, by ($\diamondsuit$), one has that $$d_{\mathbb Y}(f(x),f(x_0))<\xi.$$ In turn, the triangle inequality and ($\star$) imply that $$d_{\mathbb Y}(f(x),y_0)\leq d_{\mathbb Y}(f(x),f(x_0))+d_{\mathbb Y}(f(x_0),y_0)<\xi+d_{\mathbb Y}(f(x_0),y_0)=\varepsilon_{y_0}.$$ This means that $f(x)\in B_{\mathbb Y}(\varepsilon_{y_0},y_0)\subseteq U$, so that $x\in f^{-1}(U)$. Therefore, ($\spadesuit$) holds, as claimed.
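
For comparison, here is a sketch of how the argument shortens when one takes $y_0\equiv f(x_0)$, as suggested in the bracketed remark. Write $\varepsilon\equiv\varepsilon_{f(x_0)}$, so that $B_{\mathbb Y}(\varepsilon,f(x_0))\subseteq U$, and choose $\delta>0$ from the $\varepsilon$-$\delta$ criterion for this $\varepsilon$. Then, for every $x\in B_{\mathbb X}(\delta,x_0)$,

$$d_{\mathbb X}(x,x_0)<\delta\;\Longrightarrow\;d_{\mathbb Y}(f(x),f(x_0))<\varepsilon\;\Longrightarrow\;f(x)\in B_{\mathbb Y}(\varepsilon,f(x_0))\subseteq U\;\Longrightarrow\;x\in f^{-1}(U).$$

Hence $B_{\mathbb X}(\delta,x_0)\subseteq f^{-1}(U)$, and in this version neither the auxiliary quantity $\xi$ nor the triangle inequality is needed.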


triple_sec has already written a detailed explanation as to why both definitions are equivalent, so I will try to give reasons as to why the other definition is useful.

The definition with the open sets doesn't need any concept of distance, only a concept of what an open set is in each space. This can be used to generalize the definition of continuous functions to general topological spaces.

More specifically, for a set $X$ we call a collection of subsets $\tau \subset \mathcal{P}(X)$ a topology if the following three conditions hold:

  1. $\emptyset \in \tau$ and $X \in \tau$.
  2. If $(A_i)_{i \in I}$ is a family of sets from $\tau$, then $\bigcup \limits_{i \in I} A_i \in \tau$.
  3. If $A_1, \ldots, A_n$ is a finite collection of elements from $\tau$, then $\bigcap \limits_{i = 1}^n A_i \in \tau$.

The pair $(X, \tau)$ is called a topological space and the sets $A \in \tau$ are called open sets.
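
On a finite set the three conditions can be checked mechanically. The following is a minimal sketch; the helper name `is_topology` and the example collections are made up for illustration. It relies on the fact that, for a finite collection, closure under pairwise unions and intersections already gives closure under all unions and finite intersections, by induction.

```python
from itertools import combinations


def is_topology(X, tau):
    """Check the three topology axioms for a collection tau of subsets of a finite set X."""
    X = frozenset(X)
    tau = {frozenset(A) for A in tau}
    # Every member of tau must be a subset of X.
    if any(not A <= X for A in tau):
        return False
    # Condition 1: the empty set and X itself belong to tau.
    if frozenset() not in tau or X not in tau:
        return False
    # Conditions 2 and 3: closed under pairwise unions and intersections.
    return all((A | B) in tau and (A & B) in tau for A, B in combinations(tau, 2))


X = {0, 1, 2}
chain = [set(), {0}, {0, 1}, {0, 1, 2}]       # nested sets: a topology
no_union = [set(), {0}, {1}, {0, 1, 2}]       # {0} | {1} = {0, 1} is missing

assert is_topology(X, chain)
assert not is_topology(X, no_union)
```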

We can call a set $A$ in a metric space $\epsilon$-open* if for every point $x_0 \in A$ there is an $\epsilon > 0$ for which $B_X(\epsilon, x_0) \subset A$. This is probably the definition of an open set that you've seen so far. It is easy to check that $\{A \subset X \;|\; A \text{ is $\epsilon$-open}\}$ is a topology on $X$. This way, every metric on a space $X$ induces a topology on $X$.

Now, using the definition of continuity that only needs a concept of open sets, we can generalize the notion of a continuous function between two metric spaces to a continuous function between two topological spaces. This is a pretty big generalization, as not every topology on a set is induced by a metric [for example, take any non-Hausdorff space].
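
To see the open-set definition at work without any metric, here is a small sketch on a two-point set. It uses the Sierpiński topology $\{\emptyset, \{1\}, \{0,1\}\}$, which is not Hausdorff (the only open set containing $0$ is the whole space) and therefore not induced by any metric. The helper names `preimage` and `is_continuous`, and the representation of a function as a dictionary, are choices made for this illustration.

```python
def preimage(f, domain, U):
    """Return the set of points of the domain that f maps into U."""
    return frozenset(x for x in domain if f[x] in U)


def is_continuous(f, X, tau_X, tau_Y):
    """Open-set definition: the preimage of every open set of Y is open in X."""
    open_X = {frozenset(A) for A in tau_X}
    return all(preimage(f, X, U) in open_X for U in tau_Y)


X = {0, 1}
sierpinski = [set(), {1}, {0, 1}]
discrete = [set(), {0}, {1}, {0, 1}]
identity = {0: 0, 1: 1}

# From the discrete topology every map is continuous: all preimages are open.
assert is_continuous(identity, X, discrete, sierpinski)

# In the other direction the preimage of the open set {0} is {0},
# which is not open in the Sierpinski topology.
assert not is_continuous(identity, X, sierpinski, discrete)
```

The same underlying map is continuous or not depending only on which sets are declared open, which is exactly the information a topology records.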

The definition via open sets can in some cases also be applied to show certain properties of a set. In a topological space $(X, \tau)$ we call a set $A$ closed iff its complement $A^c$ is an open set. It is again easy to see that the notion of a closed set in the metric setting is the same as the notion of a closed set in the corresponding topological setting. Now consider the function $f: \mathbb{R}^n \to \mathbb{R}$, $x \mapsto ||x||_2$. It is easy to see that this function is continuous if we endow $\mathbb{R}^n$ and $\mathbb{R}$ with their standard topology [i.e. the topology induced by the Euclidean distance]. Since $\{1\}^c$ is open in $\mathbb{R}$, the definition of continuity that uses open sets shows immediately that $\mathbb{S}^{n - 1} = f^{-1}(\{1\}) = (f^{-1}(\{1\}^c))^c$ is a closed set.

Another important observation is that two different metrics can induce the same topology. For example, the $p$-norms on $\mathbb{R}^n$ with $1 \le p \le \infty$ all induce the same topology. Now if we endow $\mathbb{R}^n$ and $\mathbb{R}^m$ with arbitrary $p$-norms, we can see that the continuity of a function $f: \mathbb{R}^n \to \mathbb{R}^m$ doesn't depend on which $p$ you choose in each respective space.
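
The reason behind this is the standard pair of inequalities $\|x\|_\infty \le \|x\|_p \le n^{1/p}\,\|x\|_\infty$, which say that every $p$-ball contains a sup-norm ball and vice versa, so both norms have the same open sets. The following sketch spot-checks these inequalities on random vectors. It is a numerical sanity check, not a proof; the helper `p_norm`, the sample sizes, and the tolerance are arbitrary choices for the illustration.

```python
import math
import random


def p_norm(x, p):
    """p-norm of a vector given as a list of floats; p = math.inf gives the sup norm."""
    if math.isinf(p):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


random.seed(0)
n = 5
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(n)]
    sup = p_norm(x, math.inf)
    for p in (1, 1.5, 2, 3, 10):
        val = p_norm(x, p)
        tol = 1e-9 * max(1.0, sup)  # slack for floating-point rounding
        assert sup <= val + tol
        assert val <= n ** (1.0 / p) * sup + tol
```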

A frequently used result in this context is that on a finite-dimensional real vector space all norms are equivalent. This implies that if we endow two finite-dimensional normed spaces $V, W$ with topologies that are induced by norms, the continuity of a function $f: V \to W$ doesn't depend at all on the specific choice of our norms.

*Please note that the term "$\epsilon$-open" is not a generally used mathematical term. I only used it here to differentiate between the two notions of open sets I've used in this post.


It is all about generating bases of open balls, the triangle inequality, and inverse images.

A picture found in An Introduction to Metric Spaces might be helpful.

To show that the open balls form a basis for a topology, you need the triangle inequality.

Also (using @triple_sec's notation), given the open set $f^{-1}(B_{\mathbb Y}(\varepsilon,f(x_0)))$ containing $x_0$, you know that this set is a union of open balls. But again, you need the triangle inequality to place $x_0$ at the center of its own, possibly smaller, $\delta$-ball inside that set.

If you do this enough, you just know that for any open set $U$ in $\mathbb X$ and any $x_0$ in $U$,

$U = U \cup B_{\mathbb X}(\delta,x_0)$ for some $\delta > 0$, which is just another way of saying that $B_{\mathbb X}(\delta,x_0) \subseteq U$.
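
The triangle-inequality step mentioned above can be written out as follows (the center $c$ and radius $r$ are names introduced here for the illustration). Suppose $x_0\in B_{\mathbb X}(r,c)$ and set $\delta\equiv r-d_{\mathbb X}(x_0,c)>0$. Then for every $x\in B_{\mathbb X}(\delta,x_0)$,

$$d_{\mathbb X}(x,c)\leq d_{\mathbb X}(x,x_0)+d_{\mathbb X}(x_0,c)<\delta+d_{\mathbb X}(x_0,c)=r,$$

so $B_{\mathbb X}(\delta,x_0)\subseteq B_{\mathbb X}(r,c)$. In a union of open balls, every point therefore sits at the center of a ball contained in the union.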
