Misleading Definitions and Unanswerable Problems in Spivak's Calculus

Spivak's Calculus 4th edition, Chapter 3 Functions, Page 47:

DEFINITION: A function is a collection of pairs of numbers with the following property: if (a,b) and (a,c) are both in the collection, then b = c; in other words, the collection must not contain two different pairs with the same first element.

He continues:

This is our first full-fledged definition, and illustrates the format we shall always use to define significant new concepts. These definitions are so important (at least as important as theorems) that it is essential to know when one is actually at hand, and to distinguish them from comments, motivating remarks, and casual explanations. They will be preceded by the word DEFINITION, contain the term being defined in boldface letters, and constitute a paragraph unto themselves.

Ok got it. Fast forward to the problem set, I'm up to problem 25)

25) Find a function f(x) such that g(f(x)) = x for some g(x), but such that there is no function h(x) with f(h(x)) = x

I asked about this on MSE (see Spivak's Calculus Chapter 3 Problem 25), and after much discussion came to the resolution that the problem 25 is impossible given the above definition of a function.

In summary, the problem is that in order to find f(x), you need to choose f(x) to be a non-surjective function, which requires the concept of codomain. However the above definition does not include the concept of codomain, only domain and image, hence it is impossible.

My question is: What was Spivak thinking (and I mean literally)? How did Spivak intend for us to solve this problem? Is it an error on his part? Or is there a way to intuitively infer from the above definition that functions must be specified a codomain (codomain, not image) in order to be defined?

If I had the solutions book to Calculus 4th Edition, this would greatly help answer my question, but I can't find it anywhere for free.

Spivak's definition of a function is limited to a very special case. In fact, at the beginning of Chapter 3 Spivak gives the following "provisional definition":

A function is a rule which assigns, to each of certain real numbers, some other real number.

This describes the scope of his book. He does not deal with the general set-theoretic concept of a function, his naive concept only requires two explicit components:

A subset $A \subset \mathbb R$ of "certain real numbers".
A rule assigning to each $x \in A$ an element $f(x) \in \mathbb R$.

On p. 40 he explicitly defines the domain of a function as the set of numbers to which it does apply. In other words, $\operatorname{domain}(f) = A$. He does not introduce the modern concept of codomain, but he says that all functions take values in $\mathbb R$ which may be regarded as an antiquated way to express that they have codomain $\mathbb R$. In modern terms we would therefore write a function in the sense of Spivak as $f : A \to \mathbb R$ with $A \subset \mathbb R$. However, note that Spivak does not need an explicit concept of codomain to introduce functions. All he needs are 1. and 2. above.

On p. 44 Spivak defines the composition $f \circ g$ of two functions $f$ and $g$ as follows: The domain of $f \circ g$ is the set $\{ x \in \operatorname{domain}(g) \mid g(x) \in \operatorname{domain}(f) \}$, the assigment rule is given by the formula $(f \circ g)(x) = f(g(x))$.

Now let us analyze Spivak's formal definition of a function as a certain collection of pairs of numbers. Thus, more precisely, a function $f$ is a subset of $\mathbb R \times \mathbb R$ with the uniqueness property $(a,b), (a,c) \in f \implies b = c$. The next definition introduces the domain of $f$ as the set $\operatorname{domain}(f) = \{a \in \mathbb R \mid \exists b \in \mathbb R : (a,b) \in f \}$ and introduces the rule of assignment $a \mapsto f(a)$ occurring in the provisional definition by simply observing that for each $a \in \operatorname{domain}(f)$ there exists a unique $f(a) \in \mathbb R$ such that $(a,f(a)) \in f$.

Since functions are defined as subsets of $\mathbb R \times \mathbb R$, it is automatically clear what it means that two functions are equal: This is simply equality of sets. If we want, we can write Spivak's functions in the usual form $f : A = \operatorname{domain}(f) \to \mathbb R$, but we can also avoid this notation.

For later use let us add the definition of the image of $f$ as $$\operatorname{image}(f) = \{f(x) \mid x \in A = \operatorname{domain}(f) \} \subset \mathbb R$$ which is also written as $f(A)$.

The composition $f \circ g$ of two functions $f$ and $g$ is now formally defined as follows (Spivak does not do that, probably because it is obvious in the light of the above informal definition): $$f \circ g = \{ (x,f(g(x))) \in \mathbb R \times \mathbb R \mid x \in \operatorname{domain}(g) \text{ such that } g(x) \in \operatorname{domain}(f) \} .$$

Note that always $$\operatorname{domain}(f \circ g) \subset \operatorname{domain}(g) \tag{1}$$ $$\operatorname{image}(f \circ g) \subset \operatorname{image}(f) \tag{2}$$

Now problem 25 makes perfectly sense, but you did not state it properly in your question. It uses the identity function $I$ introduced on p. 43 which has domain $\mathbb R$. Formally $I = \{(x,x) \mid x \in \mathbb R \}$. Concerning the domain of $I$ recall the convention on the bottom of p. 41: "Unless the domain is restricted further, it is understood to consist of all numbers for which the definition makes sense at all."

The problem asks to find a function $f$ such that

$f$ has a left inverse $g$ which means that $g \circ f = I$.
$f$ does not have a right inverse $h$ which would mean that $f \circ h = I$.

Note that a necessary condition for $f$ having a left inverse is that $\operatorname{domain}(f) = \mathbb R$. In fact, if $g$ is a left inverse for $f$, then $\operatorname{domain}(g \circ f) = \operatorname{domain}(I) = \mathbb R$, hence $\operatorname{domain}(f) = \mathbb R$ by $(1)$.

Similarly a necessary condition for $f$ having a right inverse is that $\operatorname{image}(f) = \mathbb R$, i.e. that $f$ is surjective. In fact, if $h$ is a right inverse for $f$, then $\operatorname{image}(f \circ h) = \operatorname{image}(I) = \mathbb R$, hence $\operatorname{image}(f) = \mathbb R$ by $(2)$.

The functions $f$ satisfying the above conditions are precisely the injective but non-surjective functions $f : \mathbb R \to \mathbb R$, or if you want to avoid this notation, the injective functions $f$ such that $\operatorname{domain}(f) = \mathbb R$ and $\operatorname{image}(f) \ne \mathbb R$.

However, I must admit that Spivak is not really precise concerning $I$. It seems that in one case he also uses the symbol $I$ for a restricted function $I_A = \{(x,x) \mid x \in A \}$ with a general $A \subset \mathbb R$. In problem 24 (a) he claims that each injective function $g$ has a left inverse $f$ such that $f \circ g = I$. But this is only true if $\operatorname{domain}(g) = \mathbb R$. In case $\operatorname{domain}(g) = A \subsetneqq \mathbb R$ we can only achieve that $f \circ g = I_A$. In all other parts of the problems 23 - 26 the standard $I$ fits.

Remark:

I do not claim that Spivak's approach is a particular good one. It is, however, consistent in itself. I would prefer to introduce a function as a triple $(X,Y,f)$ consisting of a set $X$ called domain, a set $Y$ called codomain and a subset $f \subset X \times Y$ such that for each $x \in X$ there exists a unique $y \in Y$ such that $(x,y) \in f$. This is much more flexible and allows a real understanding of surjectivity. In fact, Spivak's approach does not allow us to consider functions $f : A \to B$ where $A, B \subset \mathbb R$ are arbitrary. He only covers the case $B = \mathbb R$. Moreover, his functions are always subsets of $\mathbb R \times \mathbb R$ and not of $A \times \mathbb R$ or $A \times B$.

You don't need the concept of a codomain; you just need the much more basic assumption that all of the functions in the problem have the same domain. And the domain is implicit in the definition of a function as a set of ordered pairs: it's just the set of first components of the members of $f$. (By the way, the definition of a function as a set of ordered pairs is still the most useful one in some contexts.)

For instance, let the common domain be $\Bbb N$, the set of non-negative integers, and let $f(n)=n+1$; i.e., $f=\{\langle n,n+1\rangle:n\in\Bbb N\}$. Let $g=\{\langle n,n-1\rangle:n\in\Bbb N\}$; clearly $g\big(f(n)\big)=n$ for all $n\in\Bbb N$. But there is no function $h$ with domain $\Bbb N$ such that $f\big(h(n)\big)=n$ for all $n\in\Bbb N$, because $0$ is not in the range of $f$.

Without access to the book I can only guess, but I shouldn't be at all surprised if he had such an assumption and such an example in mind.

I never would've understood what was wrong with my thinking without the answers and explanations given from everyone below (particular thanks to Paul Frost). Nevertheless, I am answering my own question not because my answer will be more correct/precise/better in anyway than anyone else's, but because I do know why I originally thought the incorrect way I did, and if you've stumbled upon this question, there's a chance your way of thinking is the same as mine was. If not, try the other answers.

Q25) Find a function f(x) such that g(f(x)) = x for some g(x), but such that there is no function h(x) with f(h(x)) = x

When I read g(f(x)) = x, to me that meant if you choose a specific number x, such that g(f(x)) is defined, then when you compute g(f(x)) you get back x.

Likewise, f(h(x)) = x meant if you choose a specific number x, such that f(h(x)) is defined, then when you compute it you get back x.

With this way of thinking, the question is indeed impossible. Or it is possible but extra assumptions are needed or the concept of codomain needs to be embedded into the definition of a function, etc.

However, this way of thinking is wrong. The trick is to realise that g(f(x)) is not a number, it is a function. That means if g(f(x)) = x, x is also a function, not a number. Likewise, x is a function in: f(h(x)) = x. And here's the crux, IT IS THE SAME FUNCTION. You could write g(f(x)) = f(h(x)) if you want to. And 2 functions can only be equal if they have the same domain. Spivak does not state this explicitly, but you can infer it from his definition like so:

What is a function? Recall Spivak's definition: A function is a Set of Ordered Pairs. So what does g(f(x)) = x mean? It means the Set of Ordered Pairs "g(f(x))" is equal to the Set of Ordered Pairs "x". And when are 2 Sets equal to each other? Precisely when every element in the first Set is also in the second Set, and vice versa. By Definition, Domain is the Set of every first element in each Ordered Pair. So if 2 functions are equal, they must have the same first element in each Ordered Pair, hence their domains are equal.

Now take f(h(x)) = x. Same logic applies. "f(h(x))" is a Set of Ordered Pairs, and it's equal to the Set of Ordered Pairs "x", which means every element in "f(h(x))" must be in "x", and vice versa.

VICE VERSA Those 2 words are what make Q25 possible. My old naive way of thinking was the equivalent to "every element in "f(h(x))" must be in "x", but NOT vice versa.

Q25) Let I(x) = x. Find a function f(x) such that g(f(x)) = I(x) for some g(x), but such that there is no function h(x) with f(h(x)) = I(x).

That is the actual question written in the book. I skipped the I(x) part because I thought it was unnecessary, but it does emphasise the fact that g(f(x)) and f(h(x)) are equal to the same identity function, and therefore their Set of Ordered Pairs are equal, or equivalently their domains are equal.

My Apologies Michael.

Misleading Definitions and Unanswerable Problems in Spivak's Calculus

Related

Recent Posts