Why can't differentiability be generalized as nicely as continuity?

The question: Can we define differentiable functions between (some class of) sets, "without $\Bbb R$"* so that it

  1. Reduces to the traditional definition when desired?
  2. Has the same use in at least some of the higher contexts where we would use the present differentiable manifolds?

Motivation/Context:

I was a little bit disappointed when I learned to differentiate on manifolds. Here's how it went.

A younger me was studying metric spaces as the first unit in a topology course when a shiny new generalization of continuity was presented to me. I was thrilled because I could now consider continuity in a whole new sense, on function spaces, finite sets, even the familiar reals with a different metric, etc. Still, I felt like I hadn't escaped the reals (why I wanted to I can't say, but I digress) since I was still measuring distance (and therefore continuity, in my mind) with a real number: $$d: X\times X\to\huge\Bbb R$$

If the reader for some reason shares or at least empathizes with (I honestly can't explain this fixation of mine) the desire to have definitions not appeal to other sets/structures*, then they will understand my excitement in discovering an even more general definition, the standard one on arbitrary topological spaces. I was home free; the new definition was totally untied to the ever-convenient real numbers, and of course I could recover my first (calculus) definition, provided I topologized $\Bbb R^n$ adequately (and it seemed very reasonable to topologize $\Bbb R^n$ with Pythagoras' theorem, after all).

Time passed and I kept studying, and through other courses (some concurrent with topology, some later) a new sort of itch began to develop, this time with differentiable functions. On the one hand, I had definitions (types of convergence, compact sets, orientable surfaces, etc.) and theorems (Stone-Weierstrass, Arzelà-Ascoli, Brouwer fixed point, etc.) completely understandable through my new-found topology. On the other hand, the definition of a derivative was still the same as ever; I could not see it, nor the subsequent theorems, "from high above" as I could with topological arguments.

But then a new hope (happy May 4th) came with a then distant but fast-approaching subject, differential geometry. The prospect of "escaping" once again from the terrestrial concepts seemed very promising, so I decided to look ahead and open up a few books to see if I could finally look down on my old derivative from up in the conceptual clouds. My expectation was that, just as topology had first to define a generalized "closeness structure", i.e. lay the grounds on which general continuous functions could be defined via open sets, I would now encounter the analogous "differentiable structure" (I had no idea what this should entail, but I hadn't for topology either, so why not imagine it). And so it went: "oh, so you just... and then you take it to $\Bbb R^n$... and you use the same definition of differentiable".

Why is this so? How come we're able to abstract continuity into definitions within the same set, but for differentiability we have to "pass through" the reals? I realize this really comes down to why we generalize in the first place: a generalization earns its keep by being useful in new contexts, hence the second point in my question statement.

Why I imagine this is plausible, a priori, is because there's a historical pattern: start with the low-level definitions $\rightarrow$ uncover some properties $\rightarrow$ realize these are all you wanted anyhow, and redefine as that which possesses the properties. Certainly, derivatives have properties that can be just as well stated for slightly more general sets! (e.g. linearity, though of course this is far from enough). But then, we'll all agree that there's even been a lust for conducting the above process everywhere possible, so maybe there are very strong obstructions indeed which inhibit its being carried out in this case. If so, I should ask what these obstructions are, or how I should begin identifying them.

Thank you for reading this far if you have, I hope someone can give some insight (or just a reference would be great!).

* If I'm being honest, before asking this I should really answer the question of what on earth I mean, precisely, by "a structure that doesn't appeal to another". First of all, I might come across a new definition that apparently doesn't use $\Bbb R$, but is "isomorphic" to having done so (easy example: calling $\Bbb R$ a different name). Furthermore, I'm always inevitably appealing to (even naïve) set theory, the natural numbers, etc. without any fuss. So, if my qualms are to have a logical meaning, there should be a quantifiable difference in appealing to $\Bbb R$ vs. appealing to set theory and other preexisting constructs. If the respondent can remark on this, super-kudos (and if they can but the answer would be long and on the whole unrelated, say this and I'll post another question).


Solution 1:

There "is" a way, since in algebraic geometry we do not work over the real numbers in general, yet we use techniques inspired by differentiation all the time. It is not the way preferred by most differential geometry textbooks, which stick to charts and differentiable structures, but it works all the same.

A ringed space $(X,\mathcal O_X)$ is a topological space together with a sheaf of rings $\mathcal O_X$ on it. A locally ringed space is a ringed space such that the stalks $\mathcal O_{X,p}$ are local rings for each $p \in X$. If $(X,\mathcal O_X)$ is a locally ringed space, we can define the cotangent space at $p$ via $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2$ where $\mathfrak m_{X,p}$ is the unique maximal ideal of $\mathcal O_{X,p}$. This is a vector space over the field $k(p) \overset{def}= \mathcal O_{X,p}/\mathfrak m_{X,p}$, so we can define the tangent space as the dual $k(p)$-vector space to $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2$, or in other words, the set of all linear maps $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2 \to k(p)$.
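To see this definition at work in a genuinely non-smooth setting, here is a standard example (added for concreteness; the computation is elementary and well known): take the cuspidal cubic $X = \operatorname{Spec} k[x,y]/(y^2 - x^3)$ and let $p$ be the origin. The maximal ideal $\mathfrak m_{X,p}$ is generated by the classes of $x$ and $y$, and since the relation $y^2 = x^3$ has no linear part, nothing is killed in degree one: $$\mathfrak m_{X,p}/\mathfrak m_{X,p}^2 \;\cong\; k\,\bar x \oplus k\,\bar y.$$ So the tangent space at the cusp is $2$-dimensional even though the curve is $1$-dimensional: the purely algebraic definition detects the singularity, with no reference to $\Bbb R$ at all.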

The idea is that if you have a linear map $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2 \to k(p)$, and you are given a "function" $f \in \mathfrak m_{X,p}$, the value of the linear map at the class of $f$ should give you the "direction"al derivative of $f$.
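In the simplest case this really does recover the usual derivative (a quick sketch, added for illustration): take $\mathcal O_{X,p}$ to be the local ring of germs at $0$ on the line, so $\mathfrak m_{X,p} = (x)$ and $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2 \cong k(p)\,\bar x$. A germ $f$ with $f(0) = 0$ can be written $f = f'(0)\,x + (\text{something in } \mathfrak m_{X,p}^2)$, so its class is $f'(0)\,\bar x$, and the linear map sending $\bar x \mapsto 1$ acts by $$[f] \mapsto f'(0),$$ i.e. it computes the derivative at $0$ without taking any limits.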

Of course, this level of abstraction removes any reference to coordinate patches and such, so it is hard to see what's going on. To remove all the algebro-geometric nonsense and get a particular example, take a manifold $M$ (which is a topological space) and consider the sheaf $\mathcal O_M$ of $C^{\infty}$-functions on $M$, that is, if $U \subseteq M$ is an open set, $\mathcal O_M(U)$ is the set of smooth functions $U \to \mathbb R$. Then $\mathcal O_{M,p}$ consists of all germs of functions at $p$, and $\mathfrak m_{M,p}$ is the maximal ideal of those germs whose value at $p$ is zero. The ideal $\mathfrak m_{M,p}^2$ is the ideal of all finite sums of products of two functions in $\mathfrak m_{M,p}$, and in particular such functions always vanish with multiplicity $\ge 2$ (this reflects the product rule for differentiation). It is usually shown that the dual of $\mathfrak m_{M,p}/\mathfrak m_{M,p}^2$ corresponds to the space of all derivations at $p$; note that in this case we have $k(p) \simeq \mathbb R$, so that the linear maps $\mathfrak m_{M,p}/\mathfrak m_{M,p}^2 \to k(p)$ actually take values in the real numbers.
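One direction of that correspondence is short enough to sketch (a standard argument, added here for completeness): a derivation $D$ at $p$ is an $\mathbb R$-linear map on germs satisfying $D(fg) = f(p)\,Dg + g(p)\,Df$. Then $D(1) = D(1 \cdot 1) = 2\,D(1)$ forces $D(1) = 0$, so $D$ vanishes on constants, and for $f, g \in \mathfrak m_{M,p}$ we get $$D(fg) = f(p)\,Dg + g(p)\,Df = 0,$$ so $D$ vanishes on $\mathfrak m_{M,p}^2$ as well. Hence $D$ is completely determined by the linear map it induces on $\mathfrak m_{M,p}/\mathfrak m_{M,p}^2$, which is exactly an element of the tangent space as defined above.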

Of course, the sheaf $\mathcal O_M$ needs to be defined, and this is usually done at some point using coordinate patches; you do not get away from dealing with charts when doing differential geometry. Sure, at some point you stop using them if you take a coordinate-free approach, but they lie somewhere in the treatment of the theory. My point is that the ideas of differentiation do generalize, and this is just a quick glance at how they do; algebraic geometry takes "differentiation" to a whole new level.

Continuity has been weakened and played with in many different ways (weak/weak-star convergence in functional analysis, upper/lower semicontinuity in optimization, etc.), and differentiation too (Fréchet, Gâteaux, directional derivatives, semi-directional derivatives, Dini derivatives), but you should always remember one thing: yes, generalizations are useful, but you should never forget why you wanted to generalize in the first place. It can be because you want a clearer point of view on a class of problems you cannot yet solve, or because you want to build stronger tools; but generalizing for the sake of generalizing usually leads to confusion and lost intuition, which is not what you want. To this day I am still scared of the use of Dini derivatives...

Hope that helps,

Solution 2:

It seems some of the frustration is with having to define manifolds from the inside out, by pasting together coordinate charts. It's possible to axiomatize a notion of "smooth object" in such a way that we can work with smooth functions on analogues of manifolds, function spaces, products, quotients, intersections, infinitesimally small spaces, and anything else you (or, at least, I) can imagine, and thus define spaces from the outside in, never using coordinates or charts in the definitions. This theory is called synthetic differential geometry (SDG).

SDG gives a very different definition of smooth function than one uses classically, because one works in a logical framework in which it's impossible to even define non-smooth functions. So, synthetically, a smooth function is... any function between spaces in a model of SDG! This doesn't mean much until you know what it takes to be a model of SDG. What SDG axiomatizes is what an object $R$ has to do to be able to act like a smooth line on which to do some differential geometry: it has some elements which square to zero, on which every function is linear (which leads to the definition of the derivative); a much bigger collection of nilpotent elements, on which every function is given by its Taylor series; every function on it has an antiderivative; etc. Then the other smooth objects all have their maps determined, at least locally, by their relationship to this "line" and its infinitesimally small subspaces.
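The key axiom is short enough to state (my paraphrase; take the precise formulation from any SDG reference): writing $D = \{d \in R : d^2 = 0\}$, the Kock–Lawvere axiom demands that every map $g : D \to R$ be of the form $g(d) = a + b\,d$ for unique $a, b \in R$. The derivative of $f : R \to R$ at $x$ is then defined by requiring $$f(x + d) = f(x) + d\,f'(x) \quad \text{for all } d \in D,$$ and the uniqueness of $b$ is what makes $f'(x)$ well defined; no limits are taken anywhere.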

So perhaps this sounds a bit too much like basing manifolds on $\mathbb{R}$ to you. I can assure you that $R$ can be wildly more exotic than $\mathbb{R}$, if that helps; and again objects in SDG are not by any means constructed locally out of finite products of $R$. A more compelling objection is probably that you've never heard of this, which is because, as you can already tell, it's very different from the geometry you know; and to actually develop the foundations requires large amounts of category theory that most geometers don't want to learn. However, the good news is that it's possible to do all of classical differential geometry within this framework, so that one can, at least in principle, think in synthetic terms and then translate proofs into language more familiar to the community.

Solution 3:

Patrick Da Silva has already given a great answer; this is just a complement. I am also not answering your question exactly, since I am cheating: in my definition I use the word "smooth", which already presupposes differentiability, which is what is bothering you. I just want to point out how differential operators can be defined without local definitions. If you combine this with Patrick Da Silva's answer, you might obtain a construction free of my abuse of the word "smooth".

In fact, differential operators can be set up in a purely algebraic, abstract way without even requiring a manifold. People use this in commutative algebra. Here are some lecture notes with the more general construction, though they require a certain level of algebraic maturity to grasp.

Here is a simplified version for the tangent bundle of a manifold $M$, to avoid using general vector bundles. The construction takes a while to digest, but it is very interesting. I apologise if the following does not make too much sense as written; I am definitely skipping many details. I am summarising the presentation given here.

1) Define $\mathrm{Op}(TM)$ as the space of linear operators $T:\Gamma(TM)\to \Gamma(TM)$ (here I use $\Gamma$ to mean smooth sections; note that a smooth section of $TM$ is by definition a smooth vector field).

2) For each smooth function $f\in C^\infty(M)$, define a map $\mathrm{ad}(f)$ with domain $\mathrm{Op}(TM)$ such that for each $T\in\mathrm{Op}(TM)$ you obtain a map $\mathrm{ad}(f)(T)\colon \Gamma(TM)\to\Gamma(TM)$ defined by $$ (\mathrm{ad}(f)T)u = [T,f]u := T(fu)-f(Tu),\quad \forall u\in \Gamma(TM).$$

3) Now inductively define a sequence of subspaces $$ \mathrm{PDO}^{(0)}(TM) \subset \mathrm{PDO}^{(1)}(TM) \subset \dots \subset \mathrm{PDO}^{(k)}(TM) \subset \dots $$ following the prescription: $$ \mathrm{PDO}^{(0)}(TM) = \mathrm{hom}(TM,TM)$$ (think of this as a collection of maps $TM_x\to TM_x$ for all $x\in M$) and $$ \mathrm{PDO}^{(k+1)}(TM) = \{ T\in\mathrm{Op}(TM) : [T,f]\in \mathrm{PDO}^{(k)}(TM),\ \forall f\in C^\infty(M) \}.$$ The elements of $\mathrm{PDO}^{(k)}(TM)$ are called partial differential operators of order $k$. Your job here (not a super easy one) is to convince yourself that this notion agrees with your idea of what a differential operator should be.
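To get a feel for why bracketing with multiplication operators measures order, here is a minimal one-variable SymPy sketch (scalar functions in place of sections of $TM$, so it only illustrates the grading, not the bundle-valued construction above):

```python
# One-variable sanity check: commutators with multiplication operators lower the order.
import sympy as sp

x = sp.symbols('x')
u = sp.Function('u')(x)      # stand-in for a "section": just a scalar function u(x)
f = sp.Function('f')(x)      # an arbitrary smooth multiplier
g = sp.Function('g')(x)      # a second multiplier, for the order-2 check

def bracket(T, mult):
    """The operator [T, mult]: u -> T(mult*u) - mult*T(u)."""
    return lambda w: sp.expand(T(mult * w) - mult * T(w))

d_dx   = lambda w: sp.diff(w, x)      # an order-1 operator
d2_dx2 = lambda w: sp.diff(w, x, 2)   # an order-2 operator

# [d/dx, f] acts as multiplication by f', i.e. an order-0 operator:
print(sp.simplify(bracket(d_dx, f)(u)))                 # -> u*f'

# Bracketing d^2/dx^2 twice also lands in order 0 (multiplication by 2*f'*g'):
print(sp.simplify(bracket(bracket(d2_dx2, f), g)(u)))   # -> 2*u*f'*g'
```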

Solution 4:

To take a different approach to answering this, you might ask yourself why the real numbers seem to be so inescapable. Of course several answers have been given that do in fact escape the reals, but they do this by focusing on the algebraic properties of derivatives, which might be unsatisfying to you (or maybe not!)

One answer to the question "why are the reals inescapable" is that the real numbers are the unique complete ordered field (see http://en.wikipedia.org/wiki/Completeness_of_the_real_numbers). So if you want to define something that captures all the topological and algebraic properties of the derivative you are going to be stuck with the real numbers.

Solution 5:

If you consider differentiation as approximating a function's behavior locally by a linear function, it will, I think, be obvious why it is hard to generalize.

You need to know what it means to approximate, and you need to know what it means to be linear (or to belong to some suitable set of "simple" functions, etc.). These notions are not natural for all spaces. When a space is curved, what is a linear function?

In particular, differentiation needs a notion of 'direction.'

Even for functions $f:\mathbb R^n\to\mathbb R^m$, the derivative is an $m\times n$ matrix, and you have:

$$f(\mathbf v+\mathbf h)\approx f(\mathbf v)+T\mathbf h$$

where $T$ is that matrix. You can think of this derivative as a map from the same domain to the same range, but mathematicians realized, when dealing with curved spaces, that it is really best to see it as a map from the "tangent space" at $\mathbf v$ to the tangent space at $f(\mathbf v)$. It's just that $\mathbb R^n$ is "flat", so the tangent space is the same as the space for all $\mathbf v$.
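As a quick numerical illustration of that approximation (the example map below is my own, purely for demonstration), one can check that a finite-difference Jacobian $T$ really does satisfy $f(\mathbf v+\mathbf h)\approx f(\mathbf v)+T\mathbf h$:

```python
# Check f(v + h) ≈ f(v) + T h for a small example map R^2 -> R^3.
import numpy as np

def f(v):
    x, y = v
    return np.array([x * y, np.sin(x), x**2 + y**2])

def jacobian(f, v, eps=1e-6):
    """Forward-difference approximation of the m-by-n derivative matrix at v."""
    fv = f(v)
    T = np.zeros((len(fv), len(v)))
    for j in range(len(v)):
        e = np.zeros(len(v))
        e[j] = eps
        T[:, j] = (f(v + e) - fv) / eps
    return T

v = np.array([1.0, 2.0])
h = np.array([1e-4, -2e-4])
T = jacobian(f, v)                                  # a 3-by-2 matrix here
print(np.linalg.norm(f(v + h) - (f(v) + T @ h)))    # tiny: the affine model matches to first order in h
```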

The "tangent space" at a point is then a codification of the notion of "direction" at the point. All of that nonsense in the definitions is an attempt to make sure that the tangent spaces of nearby points "agree" in some sense. And yes, it is ugly. As others have mentioned, this becomes easier to follow with the notion of sheaves, and with how they arise from most things that are "local" properties.