natural deduction: introduction of universal quantifier and elimination of existential quantifier explained

Example

Let $\Gamma$ the set of first-order Peano axioms: no variables free.

1) $\Gamma \vdash \exists x (x = 0)$ --- easily provable

2) $\Gamma, x=0 \vdash x=0$ --- obvious

3) $\Gamma \vdash x=0$ --- from 1) and 2) by $\exists$-elim : wrong !

4) $\Gamma \vdash \forall x (x=0)$ --- from 3) by $\forall$-intro,


1) $\Gamma, x=0 \vdash x = 0$

2) $\Gamma, x=0 \vdash \forall x (x=0)$ --- by $\forall$-intro: wrong !

3) $\Gamma \vdash x=0 \to \forall x (x=0)$ --- from 2) by $\to$-intro

4) $\Gamma \vdash \forall x [x=0 \to \forall x (x=0)]$ --- from 3) by $\forall$-intro

5) $\Gamma \vdash 0=0 \to \forall x (x=0)$ --- from 4) by $\forall$-elim.



The ground for the restriction on $\forall$-intro are linked to the "generalization principle":

what holds for any, holds for all.

Thus, in order to formalize this principle with a rule of inference, we read it as:

if something holds for an "arbitrary object", then it holds for all objects.

We have to capture the informal concept of “arbitrary object” by way of a syntactic criterion.

Consider now a variable $x$ in the context of a derivation: we shall call $x$ arbitrary if nothing has been assumed concerning $x$. In other words, $x$ is arbitrary at its particular occurrence in a derivation if the part of the derivation above it contains no hypotheses containing $x$ free.


My intuition is that propositions without free variables are fully general, while a proposition with a free variable $x$ is a statement about a specific thing named $x$. For example, $\forall x : \text{IsRed}(x)$ means "everything is red", but $\text{IsRed(x)}$ means "the thing called $x$ is red".

So suppose we already derived $\text{IsRound}(x) \vdash \text{IsRed}(x)$. This means that we know that if the thing called $x$ is round, it has to be red. Now, if we disregarded the restriction, we could wrongly conclude $\text{IsRound}(x) \vdash \forall x : \text{IsRed}(x)$. This means that if the thing called $x$ is round, then everything is red. Note that even though the two $x$s are the same letter, they represent different things: The first one refers to the object named $x$, the gets all its meaning from the quantifier where it was bound.

Now, if the $\forall$-introduction rule is used correctly, it captures the intuition that if I say "Paul has a nose" without assuming (/observing) anything about Paul, then I know that everything has a nose.


Now, let's look at the rule for $\exists$-elimination. To extend my methaphor, it says that if I know that something has a nose, and if I also know that if Paul (some "generic" person in the sense that we don't assume anything else about him) has a nose, then grass must be green, then we also know that grass is green.

Now, let's see what could go wrong if we violate the two restrictions:

  • Let's say we already derived that $\text{isRed}(x) \vdash \exists y : \text{isRound}(y)$, and we also derived $\text{isRed}(x), \text{isRound}(x) \vdash 1 = 0$. Then, we might incorrectly conclude that $\text{isRed}(x) \vdash 1 = 0$ if we ignore the restriction that $x$ can't be free in $\Gamma$.

    To abuse my metaphor a bit, in this case Paul (or $x$) stopped being a "generic" person because we assumed some properties about him that might make him contribute to grass being green.

  • Suppose we already derived that $\vdash \exists y : \text{isRound}(y)$ and also that $\text{isRound}(x) \vdash \text{isRed}(x)$. Then, we might incorrectly conclude that $\vdash \text{isRed}(x)$, if we ignore the rule that $x$ can't be free in $B$.

    To torture my metaphor a little bit more, in this case instead of concluding something general like "grass is green", we concluded some statement about Paul. This is incorrect since we intended to use Paul as some kind of stand-in for the thing that has a nose (which we know exists), so we can't conclude any statements about the actual Paul!


$$\frac{\Gamma \vdash \exists x ~ A \quad \Gamma, A \vdash B}{\Gamma \vdash B}$$

This axiom is basically modus ponens, a little prenexing, and universal elimination combined into 1 statement. Suppose I gave you:

$$\frac{\Gamma \vdash \exists x ~ A \quad \Gamma \vdash (\exists x ~ A) \implies B}{\Gamma \vdash B}$$

That is just a trivial application of modus ponens. Suppose you are also familiar with the prenex derivation:

$$(\exists x ~ A) \implies B$$ $$\lnot (\exists x ~ A) \lor B$$ $$(\forall x ~ \lnot A) \lor B$$ $$\forall x ~ (\lnot A \lor B)$$ $$\forall x ~ (A \implies B)$$

$(\exists x ~ A) \implies B ~~\vdash^\equiv~~ \forall x ~ (A \implies B)$ provides a way of converting a $\exists$ to $\forall$ by pulling it out of the condition of an implication. But it does require that $x \not \in {\rm FV}(B)$.

And suppose you accept universal elimination, that is:

$$\frac{\Gamma \vdash \forall x~A}{\Gamma \vdash A} \quad \text{for } x \not \in {\rm FV}(\Gamma)$$

Then you can put those together to get the existential elimination axiom:

$$\frac{\Gamma \vdash \exists x ~ A \quad \Gamma \vdash (\exists x ~ A) \implies B}{\Gamma \vdash B}$$

$$\frac{\Gamma \vdash \exists x ~ A \quad \Gamma \vdash \forall x ~ (A \implies B)}{\Gamma \vdash B}$$

$$\frac{\Gamma \vdash \exists x ~ A \quad \Gamma \vdash (A \implies B)}{\Gamma \vdash B}$$

$$\frac{\Gamma \vdash \exists x ~ A \quad \Gamma,~A \vdash B}{\Gamma \vdash B}$$

To correctly apply the $\exists$ conversion, you need $x \not \in {\rm FV}(B)$. To correct apply the $\forall$ elimination you need $x \not \in {\rm FV}(\Gamma)$. That doesn't establish that those conditions are necessary, but hopefully can offer some confidence about the correctness of the axiom.


If you do the foundations of logic so that variant statements are identified (in my terminology I say they are the same as formulas because they in some sense have the same form), then all the generalization rules you need are those guaranteeing that $\forall x A$ is the infimum of the set of formulas $[t/x]A$, where $t$ ranges over the set of terms. I.e., thinking of $\vdash$ as a (non-strict) smaller than relation, you need

  1. $\forall x A \vdash [t/x]A$ for all terms $t$.
  2. If $C \vdash [t/x] A$ for all terms $t$, then $C \vdash \forall xA$.

This seems a more natural way of looking at things, in my opinion. It takes me about a page or two of (pretty) proof to show that one gets the same inferences if (2) above is replaced with the standard rule

2*. If $C\vdash A$ where $x$ is not free in $C$, then $C \vdash \forall x A$.

Thus, we can still define proofs by (2*) rather than (2) (and not need to check infinitely many things to prove things).

Similarly, the standard instantiation rules can be replaced by the rules directly guaranteeing that $\exists x A$ is the supremum of the set of formulas $[t/x]A$.

In my terminology, for a statement $R$, $[t/x]R$ I take to be the formula obtained by replacing $x$ in the formula corresponding to $R$ with $t$. So by just looking at what I call formulas one can also avoid in the deepest foundation the bother of having to worry about substitutability.

I know it's not as standard, but I think it strange that anyone would want to think of $\vdash$ as anything other than a preorder on the set of formulas (fortunately one of my undergraduate professors at UNC-CH, Johann Sonner, taught me the preorder approach). It seems to me that allowing on the fundamental-definition level sets of formulas (or more than one formula) to the left of $\vdash$ introduces a high degree of awkwardness not worth whatever higher generality such might be seen as giving. Part of the problem is that $\vdash$ is pointy on the wrong side, the right-hand side, akin to most symbols suggesting a larger than relation, obscuring that it should represent a smaller than relation.

NEW: I am adding proof that my quantification rules give standard inferences. (Don't know how original this is.) I use $<$ to denote $\vdash$. My quantification rules are

IA (Infer any): $\mathbf{\underset x \wedge A < [t/x]A}$ for all terms $\mathbf{t}$.

GEN (Generalization): If $\mathbf{B < [t/x]A}$ for all terms $t$, then $\mathbf{B < \underset x \wedge A}$.

IE (Infer existence): $\mathbf{[t/x]A < \underset x \vee A}$ for all terms $\mathbf{t}$.

INS (Instantiation): If $\mathbf{[t/x]A < B}$ for all terms $\mathbf{t}$, then $\mathbf{\underset x \vee A < B}$.

I say (making up the definition) that a variable $\mathbf{x}$ is abstract in an inference system if given any $\mathbf{A < B}$ valid in the inference system and any variable $\mathbf{y}$ not in $\mathbf{A < B}$, $\mathbf{[y/x]A < [y/x]B}$ is valid in the inference system. I say an inference system is abstract if it possesses infinitely many abstract variables.

Next, I wish to demonstrate the substitution law, namely that for formulas $\mathbf{A}$ and $\mathbf{B}$, if $\mathbf{A} < \mathbf{B}$, then $[\mathbf{t}/\mathbf{y}] \mathbf{A }< [\mathbf{t}/\mathbf{y}] \mathbf{B}$ (with no restrictions on $\mathbf{t}$ other than that it be a term). If we prove this for $\mathbf{t}$ not containing $\mathbf{y}$, we shall prove it in general, for from $\mathbf{A} < \mathbf{B}$ we will have $[\mathbf{w}/\mathbf{y}] \mathbf{A }< [\mathbf{w}/\mathbf{y}] \mathbf{B}$ for $\mathbf{w}$ not in $\mathbf{t}$ or free in $\mathbf{A}$ or $\mathbf{B}$, and then the result follows from $[\mathbf{t}/\mathbf{w}] [\mathbf{w}/\mathbf{y}] \mathbf{A} < [\mathbf{t}/\mathbf{w}] [\mathbf{w}/\mathbf{y}] \mathbf{B}$. Therefore, we assume $\mathbf{t}$ does not contain $\mathbf{y}$.

Define the relation "$<_{\mathbf{t},\mathbf{y}}$"such that $\mathbf{A} <_{\mathbf{t},\mathbf{y}} \mathbf{B}$ if and only if $[\mathbf{t}/\mathbf{y}] \mathbf{A }< [\mathbf{t}/\mathbf{y}] \mathbf{B}$. We need to show that "$<_{\mathbf{t},\mathbf{y}}$" is a larger relation than "$<$", i.e., that it satisfies the inference rules. In fact, I prove a stronger result, namely that in any abstract inference system containing one of the official rules we have given, the rule obtained by replacing in an official rule the relation "$<$" with "$<_{\mathbf{t},\mathbf{y}}$" holds merely as a result of the official rule holding and that the inference system is abstract. Again, the only difficulties are with those inference rules involving generalization and instantiation. I shall deal here only with generalization, leaving the entirely analogous proof in the case of instantiation to you.

As for IA, we need to show that for any formula $\mathbf{A}$ that $[\mathbf{t}/\mathbf{y}] \underset{\mathbf{x}} \wedge \mathbf{A} < [\mathbf{t}/\mathbf{y}] [\mathbf{s}/\mathbf{x}] \mathbf{A}$ for all terms $\mathbf{s}$. The result is easy if $\mathbf{x}$ is the same variable as $\mathbf{y}$, so we assume they are different. Letting $\mathbf{z}$ be a variable distinct from $\mathbf{y}$ and not occurring in $\mathbf{t}$ or free in $\mathbf{A}$, it is clear from the dummy variable rule that our desired inference is identical to $[\mathbf{t}/\mathbf{y}] \underset{\mathbf{z}}\wedge [\mathbf{z}/\mathbf{x}] \mathbf{A} < [\mathbf{t}/\mathbf{y}] [\mathbf{s}/\mathbf{z}] [\mathbf{z}/\mathbf{x}] \mathbf{A}\,.$ In other words, without loss of generality we may assume also that $\mathbf{x}$ does not occur in $\mathbf{t}$. Making this assumption, $[\mathbf{t}/\mathbf{y}] \underset{\mathbf{x}}\wedge \mathbf{A}$ is the same formula as $\underset{\mathbf{x}} \wedge [\mathbf{t}/\mathbf{y}] \mathbf{A}$, and $[\mathbf{t}/\mathbf{y}] [\mathbf{s}/\mathbf{x}] \mathbf{A}$ is the same formula as $[[\mathbf{t}/\mathbf{y}] \mathbf{s}/\mathbf{x}] [\mathbf{t}/\mathbf{y}] \mathbf{A}$, and we make use of IA holding for "$<$".

The situation for GEN is more complex than that for the other cases (except for the entirely analogous INS). Apparently, one needs there not just that the inference rule holds, but also that the inference rules as a whole form an inference system which is abstract.

As for GEN, suppose that $[\mathbf{t}/\mathbf{y}] \mathbf{B} < [\mathbf{t}/\mathbf{y}] [\mathbf{s}/\mathbf{x}] \mathbf{A}$ for all terms $\mathbf{s}$. Choose $\mathbf{z}$ such that $\mathbf{z}$ is distinct from $\mathbf{y}$, and such that $\mathbf{z}$ is not free in $\mathbf{A}$,$\mathbf{B}$ or $\mathbf{t}$. Let $\mathbf{D}$ be $[\mathbf{z}/\mathbf{x}] \mathbf{A}$. Then $[\mathbf{t}/\mathbf{y}] \mathbf{B} < [\mathbf{t}/\mathbf{y}] [\mathbf{s}/\mathbf{z}] \mathbf{D}$ for all terms $\mathbf{s}$. Fix $\mathbf{s}$ and let $\mathbf{w}$ be an abstract variable distinct from both $\mathbf{y}$ and $\mathbf{z}$ and not occurring in $\mathbf{B}$, $\mathbf{t}$, $\mathbf{D}$, or $\mathbf{s}$. Then in particular, $[\mathbf{t}/\mathbf{y}] \mathbf{B} < [\mathbf{t}/\mathbf{y}] \left[\left[\mathbf{w}/\mathbf{y}\right]\mathbf{s}/\mathbf{z}\right] \mathbf{D}$. By a substitution lemma, this is the same inference as $\tag*{$(\dagger)$}\label{eq:sub}[\mathbf{t/y}] \mathbf{B} < [[\mathbf{w}/\mathbf{y}]\mathbf{s}/\mathbf{z}][\mathbf{t}/\mathbf{y}] \mathbf{D}.$ Since $\mathbf{w}$ (like all variables in the inference system under consideration) is an abstract variable, the above inference, which does not contain $\mathbf{y}$, remains valid when $\mathbf{w}$ is replaced by $\mathbf{y}$. We have \begin{align*}[\mathbf{t}/\mathbf{y}] \mathbf{B} = [\mathbf{y}/\mathbf{w}][\mathbf{t}/\mathbf{y}]\mathbf{B} < [\mathbf{y}/\mathbf{w}][[\mathbf{w}/\mathbf{y}]\mathbf{s}/\mathbf{z}][\mathbf{t}/\mathbf{y}] \mathbf{D} &= [[\mathbf{y}/\mathbf{w}][\mathbf{w}/\mathbf{y}]\mathbf{s}/\mathbf{z}][\mathbf{t}/\mathbf{y}] \mathbf{D}\\ &= [\mathbf{s}/\mathbf{z}][\mathbf{t}/\mathbf{y}]\mathbf{D}. \end{align*} Since $\mathbf{s}$ was arbitrary, $[\mathbf{t/y}] \mathbf{B} < [\mathbf{s}/\mathbf{z}][\mathbf{t}/\mathbf{y}] \mathbf{D}$ for all terms $\mathbf{s}$. From GEN applied to "$<$" it follows that $[\mathbf{t/y}] \mathbf{B} < \underset{\mathbf{z}}\wedge [\mathbf{t}/\mathbf{y}] \mathbf{D}$, i.e., $[\mathbf{t/y}] \mathbf{B} < \underset{\mathbf{z}}\wedge [\mathbf{t}/\mathbf{y}] [\mathbf{z}/\mathbf{x}] \mathbf{A}$. Thus it suffices to show that $\underset{\mathbf{z}}\wedge [\mathbf{t}/\mathbf{y}] [\mathbf{z}/\mathbf{x}] \mathbf{A}$ is verily equivalent to $[\mathbf{t}/\mathbf{y}] \underset{\mathbf{x}}\wedge \mathbf{A}$. In fact, they are the same formulas. Indeed, by the dummy variable rule, $[\mathbf{t}/\mathbf{y}] \underset{\mathbf{x}}\wedge \mathbf{A}$ is the same formula as $[\mathbf{t}/\mathbf{y}] \underset{\mathbf{z}}\wedge [\mathbf{z}/\mathbf{x}] \mathbf{A}$, which is the same formula as $\underset{\mathbf{z}}\wedge [\mathbf{t}/\mathbf{y}] [\mathbf{z}/\mathbf{x}] \mathbf{A}$ since $\mathbf{z}$ does not occur in $\mathbf{t}$ and is distinct from $\mathbf{y}$.

To reiterate, we have shown the following.

The Substitution Law. For all formulas $\mathbf{A}$ and $\mathbf{B}$, terms $\mathbf{t}$ and variables $\mathbf{x}$, if $\mathbf{A} < \mathbf{B}$ then $[\mathbf{t}/\mathbf{x}] \mathbf{A} < [\mathbf{t}/\mathbf{x}] \mathbf{B}$.

Note that the substitution law immediately implies

GEN*. If $\mathbf{B} < \mathbf{A}$ and $\mathbf{x}$ is not free in $\mathbf{B}$, then $\mathbf{B} < \underset{\mathbf{x}}\wedge \mathbf{A}$.

INS*. If $\mathbf{A} < \mathbf{B}$ and $\mathbf{x}$ is not free in $\mathbf{B}$, then $\underset{\mathbf{x}} \vee \mathbf{A} < \mathbf{B}$.

Moreover, if we drop GEN and INS from our list of rules of inference, and replace them by GEN* and INS*, then GEN and INS will follow, and so replacing GEN and INS by their starred counterparts gives an equivalent inference relation "$<<$" identical to "$<$". Indeed, looking at generalization (instantiation is similar), suppose $\mathbf{B} << \mathbf{A}[\mathbf{t}/\mathbf{x}]$ for all terms $\mathbf{t}$. Then $\mathbf{B} << \mathbf{A}[\mathbf{z}/\mathbf{x}]$, where $\mathbf{z}$ is a variable not free in $\mathbf{A}$ or $\mathbf{B}$. Hence, $\mathbf{B} << \underset{\mathbf{z}}\wedge \mathbf{A}[\mathbf{z}/\mathbf{x}]$, and so $\mathbf{B} << \underset{\mathbf{x}}\wedge \mathbf{A}$ by the dummy variable rule. This alternative set of inference rules is interesting because it is easy to show using them that the set of valid inferences are just those inferences that can be obtained by (finite) proofs, i.e., lists of inferences of the form $\mathbf{A} < \mathbf{B}$ in which each entry of the list can be obtained from prior entries using an inference rule (with GEN and INS replaced by their starred counterparts). Thus, although at first glance our original inference rules may appear to (undesirably) involve infinite methods (e.g., checking whether a formula $\mathbf{B}$ is such that $\mathbf{B} < \mathbf{A}[\mathbf{t}/\mathbf{x}]$ holds for all terms $\mathbf{t}$), which could make the notion of proof obscure and not finitary, they really do not in any unavoidable way.