Proof of $x=y \rightarrow [P(x) \rightarrow P(y)]$

I'm having a hard time with the proof $x=y \rightarrow [P(x) \rightarrow P(y)]$. I know that the generic steps are: (1) Given $x=y$. (2) Assume $P(x)$. (3) Since $x=y$ and $P(x)$, then, $P(y)$. Nevertheless, I would like a more detailed explanation on step 3. How can we infer that $P(y)$ derives from $x=y$ and $P(x)$, if this is exactly what we're trying to prove beforehand? As it is formulated, it seems like a self-refered proof.

All my attempts to overcome this apparent self-reference have failed. If, for example, we interpret $P(x)$ as logically equivalent to some truth set $X=\{x_0, x_1... x_n\}$, then $P(x)$ would mean $x\in\{x_0, x_1... x_n\}$, which is the same as $x=x_0 \lor x=x_1 \lor \ldots x=x_n$. Since $x=y$, then $y=x_0 \lor y=x_1 \lor \ldots y=x_n$, which is $P(y)$. However, this last inference also applies the implication $x=y \rightarrow [P(x) \rightarrow P(y)]$.


Solution 1:

As an example, I use the natural deduction system for first-order logic defined here: https://leanprover.github.io/logic_and_proof/index.html

See Section 3 "Natural Deduction for Propositional Logic", and Section 8 "Natural Deduction for First Order Logic".

Here is the proof: $$\dfrac{\dfrac{\dfrac{\dfrac{}{x=y}1 \quad \dfrac{}{P(x)}2}{P(y)}\mathsf{sub}}{P(x)\to P(y)}2\to I}{(x=y)\to (P(x)\to P(y))}1\to I$$

Solution 2:

How can we infer that $P(y)$ derives from $x=y$ and $P(x)$, if this is exactly what we're trying to prove beforehand? As it is formulated, it seems like a self-refered proof.

You are asking why it is that if $P(x)$ and $x=y$, then $P(y)$. Well, let's think about it: Suppose I tell you that Bob is 6 feet tall. I am also telling you that Bob is the brother of Jim. What can you now infer? That the brother of Jim is 6 feet tall.

In general: if I know that $x$ has some property, and I also know that $x$ and $y$ are the same object, then $y$ has that same property. Really, that's all there is to it. If you question that, I really don't know what to say. But any rational person would agree to the truth of that, and the rule simply formalizes it.

In fact, there is an even deeper 'circularity' here: How do you prove $P \to Q$? Well, you assume $P$, and then infer $Q$. So, you effectively show that 'if $P$ then $Q$' in order to show that $P \to Q$

However, despite appearances, this is not circular. Remember that a proof system is really just a purely syntactical system: it derives certain symbols strings from other symbol strings. Of course, the symbol strings can be given a semantics, and it is in that sense that the proofs are actually proofs about something. The basic idea of a formal proof, then, is to purely formalize a proof that, in some sense, already semantically exists. Indeed, most of the times that you make a formal proof, you already have a semantical/informal proof: the formal proof is to put that proof in a special language. That's the 'circularity' that you are perceiving.

You should really read 'What the Tortoise said to Achilles'. In it, the Tortoise is, like you, asking why some elementary inference would be the case. And, yes, trying to argue for it requires logic itself, so that seems circular, and if you insist to spell out the logic in terms of more logic, you do indeed end up in an infinite circular regression. But again, all that formal logic does is to formalize an elementary principle that reflects how we think and talk about the world.