Why is differential calculus built on open sets?

For example, in W. Rudin's *Principles of Mathematical Analysis*, every theorem or definition concerning the derivative of a function from $\mathbb{R}^n$ to $\mathbb{R}^m$ begins with something like: "Let $E$ be an open set in $\mathbb{R}^n$, and let $f$ map $E$ into $\mathbb{R}^m$ ...".

Why is differential calculus built on open subsets of $\mathbb{R}^n$? What is the precise (perhaps topological) explanation?


The best way to understand this is to understand what an open set is. Recall that in a metric space like $\mathbb{R}^n$ we define an open set as a set $U$ such that for every $p \in U$ there is some real number $\epsilon > 0$ such that the ball of radius $\epsilon$ centered at $p$ lies inside $U$. In other words, every point at distance less than $\epsilon$ from $p$ is still in $U$, and we can read this as saying: "as long as we move $p$ around without traveling a distance greater than $\epsilon$, we stay inside $U$". So an open set is really a set in which every point can be moved around by some small amount while remaining inside the set.
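To make this concrete, here is a small numeric sketch (the set, the point $p$, and the choice $\epsilon = 1 - |p|$ are my own illustrative assumptions, not part of the answer above): for the open unit disk, each point admits an explicit $\epsilon$, and points within that distance stay inside the set.

```python
import math

def in_open_unit_disk(x, y):
    """Membership test for the open set U = {(x, y) : x^2 + y^2 < 1}."""
    return math.hypot(x, y) < 1.0

# Illustrative point p in U; epsilon = 1 - |p| gives a ball around p
# that is entirely contained in U.
p = (0.6, 0.7)                      # |p| ~ 0.922, so p is in U
eps = 1.0 - math.hypot(*p)
assert eps > 0

# Probe several directions at distance eps/2 from p:
# every probed point is still inside U.
for angle in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0):
    q = (p[0] + 0.5 * eps * math.cos(angle),
         p[1] + 0.5 * eps * math.sin(angle))
    assert in_open_unit_disk(*q)
```

The same check fails for a boundary point of the *closed* disk: no positive $\epsilon$ works there, which is exactly why the closed disk is not open.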

Now let's take the derivative. First, imagine we have a function $f : \mathbb{R}^n \to \mathbb{R}^m$, so this function is defined on all of $\mathbb{R}^n$. We say that $f$ is differentiable at $a$ if there exists a linear map, called the derivative, $Df(a) \in \mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ such that

$$\lim_{h\to 0}\frac{|f(a+h)-f(a)-Df(a)(h)|}{|h|}=0$$

So, we start at $a$ and move along $h$: compute $f$ where we begin, that's $f(a)$; compute it again where we stop, that's $f(a+h)$; then take the difference $f(a+h)-f(a)$. So $f$ is differentiable if this difference can be approximated by a linear map (this is what the definition says).
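As a numeric sketch of this definition (the map $f(x,y) = (x^2, xy)$, the point $a = (1,2)$, and the candidate Jacobian are my own illustrative choices), the quotient $|f(a+h)-f(a)-Df(a)h|/|h|$ indeed shrinks as $|h| \to 0$:

```python
import math

# Illustrative map f : R^2 -> R^2, f(x, y) = (x^2, x*y).
# Its Jacobian at a = (1, 2) is [[2, 0], [2, 1]], used as the candidate Df(a).
def f(x, y):
    return (x * x, x * y)

def Df_a(h1, h2):
    return (2 * h1, 2 * h1 + h2)

a = (1.0, 2.0)
fa = f(*a)

# The quotient |f(a+h) - f(a) - Df(a)h| / |h| should tend to 0 with |h|.
ratios = []
for t in (1e-1, 1e-2, 1e-3, 1e-4):
    h = (t, -t)                                   # one direction, scaled down
    fah = f(a[0] + h[0], a[1] + h[1])
    lin = Df_a(*h)
    rem = (fah[0] - fa[0] - lin[0], fah[1] - fa[1] - lin[1])
    ratios.append(math.hypot(*rem) / math.hypot(*h))
```

Here the remainder is $(h_1^2, h_1 h_2)$, so the quotient is bounded by $|h|$ and the computed ratios decrease toward zero, matching the limit in the definition.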

Now, what was important in that discussion is that we could vary $f$ from $f(a)$ to $f(a+h)$. This was possible because even if $h$ were immensely large in norm, the point $a+h$ would still be in the domain of $f$, since $f$ is defined on all of $\mathbb{R}^n$. On an arbitrary subset of $\mathbb{R}^n$, however, it could happen that $a+h$ falls outside the set. Since we shrink $h$ to zero, we do not need $a+h$ to be in the set for every $h$, only for all sufficiently small $h$ (small with respect to the norm, of course).

Because of that we require $U$ to be open: openness guarantees that around every point there really is some positive distance by which we can move $a$ to $a+h$ without leaving the domain of definition.

If you want an example, consider $A = \{(x,y)\in \mathbb{R}^2 : x^2+y^2 \leq 1\}$. This set is not open. Now consider the point $a=(0,1)$: if we take $\epsilon > 0$ and the vector $h=(0,\epsilon)$, then $a+h=(0,1+\epsilon)$, and this point is no longer inside $A$. If $f : A \to \mathbb{R}^m$, then what should $f(a+h)$ be? It is undefined, so the difference quotient, and hence the derivative, doesn't make sense.
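The same example can be checked numerically (the contrasting interior point $b$ below is my own addition): at the boundary point $a=(0,1)$, every upward step of size $\epsilon$, however tiny, escapes $A$, while at an interior point small steps stay inside.

```python
def in_A(x, y):
    """Membership test for the closed unit disk A = {(x, y) : x^2 + y^2 <= 1}."""
    return x * x + y * y <= 1.0

a = (0.0, 1.0)                     # a boundary point of A
assert in_A(*a)

# No matter how small epsilon is, a + (0, epsilon) leaves A,
# so f(a + h) would be undefined for h pointing straight up.
for eps in (1.0, 1e-3, 1e-9):
    assert not in_A(a[0], a[1] + eps)

# Contrast: at an interior point, small moves in the same direction stay in A.
b = (0.0, 0.5)
for eps in (1e-3, 1e-9):
    assert in_A(b[0], b[1] + eps)
```

This is exactly the failure that the openness hypothesis rules out: on an open domain, every point behaves like $b$, never like $a$.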


This is how I see it.

Say we want to investigate what it means for a function to be "continuous", i.e. to have "no jumps". We notice that to check continuity at a point it is not enough to look at the point itself; we also need a little neighborhood of the point, so we define the open ball $B_r(p)$ of points at distance less than $r$ from $p$. We then want to extend this notion to more general subsets of $\mathbb{R}^n$, so we decide to call a subset "open" if around every point of the set there is a ball contained in the set itself. This way we can investigate our local properties while staying strictly inside our set.

Later, this concept was extended to the topological notion of open set we have today.