When do free variables occur? Why allow them? What is the intuition behind them?
Solution 1:
Let's be clear. You don't need to countenance free variables to set up standard first-order logic. [Thus it in fact isn't right to say e.g. "free variables are needed in any case to cope with some trivial syntactic matters, namely building up quantified formulas in an inductive fashion".]
As far as the syntax goes, we needn't allow wffs with free variables. We can have the rule that if $\varphi(a)$ is a wff containing one or more occurrences of the constant $a$, then so are $\forall \xi\varphi(\xi)$ and $\exists \xi \varphi(\xi)$, where $\xi$ is any variable and $\varphi(\xi)$ is of course the result of replacing each occurrence of $a$ in $\varphi(a)$ with $\xi$. Or to look at it another way, we can think of quantification not as a matter of just putting $\forall \xi$ or $\exists \xi$ in front of a wff with free variables, but rather as applying the single operator $\forall \xi \ldots \xi \ldots$ to the predicate $\varphi(\cdot)$.
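To make the inductive clause concrete, here is a minimal sketch in Python (my own illustration, not part of the original answer): formulas are nested tuples, and a variable only ever enters a formula at the moment a quantifier binds it, by replacing a constant.

```python
# Sketch of the constant-replacement rule for forming quantified wffs without
# ever building a wff that contains a free variable. The representation (nested
# tuples, string constants/variables, a predicate name like 'Lt') is assumed
# purely for illustration.

def replace_constant(phi, a, xi):
    """Return phi with every occurrence of the constant a replaced by the variable xi."""
    if isinstance(phi, tuple):
        return tuple(replace_constant(part, a, xi) for part in phi)
    return xi if phi == a else phi

def forall(xi, a, phi_a):
    """From the wff phi(a) containing the constant a, form  forall xi. phi(xi)."""
    return ('forall', xi, replace_constant(phi_a, a, xi))

def exists(xi, a, phi_a):
    return ('exists', xi, replace_constant(phi_a, a, xi))

# phi(a): Lt(a, s(a)), i.e. "a is less than its successor"
phi_a = ('Lt', 'a', ('s', 'a'))
print(forall('x', 'a', phi_a))   # ('forall', 'x', ('Lt', 'x', ('s', 'x')))
```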
As far as the semantics goes, the rule is that $\forall \xi\varphi(\xi)$ is true on an interpretation $\mathcal{I}$ just so long as every way of extending $\mathcal{I}$ to assign an interpretation to the new constant $a$ makes $\varphi(a)$ true -- and likewise $\exists \xi\varphi(\xi)$ is true on an interpretation $\mathcal{I}$ just so long as there is some way of extending $\mathcal{I}$ to assign an interpretation to the new constant $a$ which makes $\varphi(a)$ true. (Careful! Note this is NOT a substitutional account of quantification: it is NOT the account that says $\forall \xi\varphi(\xi)$ is true on $\mathcal{I}$ just if $\varphi(a)$ is true on $\mathcal{I}$ for every constant $a$. Our account of the semantics talks about extensions of $\mathcal{I}$ by assigning references to new constants, which is in the same ballpark as talking about extending $\mathcal{I}$ with assignments to variables.)
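As a toy illustration of this clause (again my own sketch, restricted to finite domains and atomic matrices), one can evaluate a quantified formula by extending the interpretation with a fresh constant and running through every possible value for it. The constants already named in the interpretation play no special role, which is exactly what separates this from the substitutional account.

```python
# Toy evaluator for the "extend the interpretation with a fresh constant" semantics.
# Formulas are nested tuples as in the sketch above; interp maps constant names to
# domain elements and predicate names to Python functions. Illustration only.

def substitute(phi, var, const):
    if isinstance(phi, tuple):
        return tuple(substitute(p, var, const) for p in phi)
    return const if phi == var else phi

def evaluate(phi, domain, interp):
    op = phi[0]
    if op in ('forall', 'exists'):
        _, xi, body = phi
        fresh = '_c%d' % len(interp)          # a constant name not used so far
        truth_values = (evaluate(substitute(body, xi, fresh), domain,
                                 {**interp, fresh: d})
                        for d in domain)      # every way of interpreting the new constant
        return all(truth_values) if op == 'forall' else any(truth_values)
    # atomic formula: a predicate applied to constants
    pred, *args = phi
    return interp[pred](*(interp[a] for a in args))

domain = {0, 1, 2}
interp = {'Le': lambda a, b: a <= b, 'zero': 0}
# forall x. Le(zero, x)
print(evaluate(('forall', 'x', ('Le', 'zero', 'x')), domain, interp))  # True
```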
So we can (and some authors do) treat both syntax and semantics without allowing wffs with free variables. Instead we can use "parameters", i.e. additional constants (whose interpretation is not initially fixed). I think there are principled reasons for preferring this approach, but don't need to argue that here: the present point is that it is certainly possible to do things this way.
And seeing that this is so should help us understand what is going on when -- rather unfortunately perhaps -- logic texts (always tempted by Bauhaus austerity!) decide not to use new constants or parameters which are typographically distinguished from variables, but prefer (on the grounds of economy) to re-use the same letters both for 'real' variables tying quantifiers to slots in predicates and for 'free' variables serving the role of parameters. This doubling up of roles for your $x$s and $y$s isn't necessary, as we have seen, even if it is probably the majority practice (and also reflects some informal mathematical practice). And it can help -- if this doubling seems confusing -- to first tell the syntactic and semantic stories the other way, distinguishing variables and parameters, and then see the doubled-up role for your $x$s and $y$s as just a typographically economical trick.
Solution 2:
First, some variables might not be free even though they appear free; e.g. they might be bound by the context or by implicit quantifiers introduced by convention. For example:
- Let $\lambda$ be the constant from the question's formulation. Then, $$\exists x.\ f(x,\lambda) = 0.$$ In this formula $\lambda$ might appear free, but it actually comes from the context (which is quite short here, so the example is a bit silly, but never mind).
- The commutativity axiom $$x + y = y + x.$$ In the usual formulation, $x$ and $y$ might appear free, but the convention in algebra is that free variables are quantified universally, i.e. the real formal statement is $$\forall x.\ \forall y.\ x + y = y + x.$$ Of course this is long and tedious, so no one bothers to write those quantifiers.
- [During some proof...] Suppose that we got the solution $x$ by magic, so we know that $$\forall y.\ F(y,x) = 0, \text{ now observe that}\ldots$$ Here, $x$ might appear free, but it is not; the proof is just presented in an unconventional order (to make it easier to follow, e.g. to give more intuition about what is happening), and the proper quantification of $x$ is introduced later. If we were to make the proof formal (and perhaps less understandable), then $x$ would be just another bound variable.
- The derivative of a square: $$\Big(x^2\Big)' = 2x.$$ Here $x$ appears free, but this is a notational issue: $x^2$ denotes a function, namely $(x\mapsto x^2)$, and the convention merely leaves implicit that $x$ is in fact a parameter. In other words this could be written $(x\mapsto x^2)' = (x\mapsto 2x)$, only that would make things much less clear.
- In logic, Skolem constants may look like variables, but they are not: they are constants or functions or other such objects, but not free variables.
On the other hand, sometimes there really are free variables; to name some examples:
- In expression rewriting, i.e. systems which explicitly manipulate expressions. For example see denotational semantics.
- In logic programming systems such as Prolog. In fact a free variable has a slightly different meaning there, but it is still related.
- In lambda calculus.
The most important point is that when real free variables do occur, they have some specific meaning defined in the context or system. Without such a meaning (whether explicitly defined or implied by the context), free variables make the expression meaningless – it is an error, and they should not appear. In fact, this gives a basic technique for checking whether your expression is correct (e.g. no quantified variable should leak outside the scope of its quantifier).
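As a concrete (hypothetical) illustration of that last point, here is a small Python sketch that computes the free variables of a lambda-calculus term; a term meant to stand on its own should come out closed, i.e. with nothing leaking out.

```python
# Free-variable check for lambda-calculus terms. The Var/Lam/App representation
# is assumed purely for illustration.
from dataclasses import dataclass

@dataclass
class Var:
    name: str

@dataclass
class Lam:
    param: str
    body: object

@dataclass
class App:
    fun: object
    arg: object

def free_vars(term):
    """Set of variable names occurring free in the term."""
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, Lam):
        return free_vars(term.body) - {term.param}
    return free_vars(term.fun) | free_vars(term.arg)

print(free_vars(Lam('x', App(Var('x'), Var('y')))))  # {'y'} -- an error unless the context gives y a meaning
print(free_vars(Lam('x', Var('x'))))                 # set() -- a closed term
```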
I hope this helps $\ddot\smile$
Solution 3:
Irrespective of how the semantics (of some first-order language) is spelled out, free variables are needed in any case to cope with some trivial syntactic matters, namely building up quantified formulas in an inductive fashion (as long as quantifiers are treated as variable binding operators).
More interesting is the question: Why allow the semantics to interpret formulas containing free variables (open formulas)? (Call a semantics allowing such interpretations Tarski style semantics.) I think that a preference for Tarski style semantics rests on intuitions about how to treat quantifications. If you want something like $\forall x \phi$ to be true just in case everything satisfies $\phi$ and at the same time allow that not every object has a name, then a Tarski style semantics should be your first choice.
If, on the other hand, you think that $\forall x \phi$ is true precisely when $\phi[a, x]$ is true for every name $a$ (where $\phi[a, x]$ results from $\phi$ by substituting the free occurrences of $x$ in $\phi$ with $a$), you may opt against such a semantics.
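A quick toy contrast (my own illustration, not from the answer) of how the two readings can come apart when not every object has a name: take the domain $\{0, 1, 2\}$ but suppose the language only has names for 0 and 1.

```python
# Contrast between the substitutional and the Tarski-style (objectual) reading
# of the universal quantifier on a domain where the element 2 has no name.
# All particulars here are invented for illustration.

domain = {0, 1, 2}
names = {'zero': 0, 'one': 1}            # no name denotes 2

def phi(d):                              # phi(x): "x is 0 or 1"
    return d in (0, 1)

# Substitutional reading: forall x. phi(x) holds iff phi(a) holds for every name a.
substitutional = all(phi(names[a]) for a in names)   # True

# Tarski-style reading: forall x. phi(x) holds iff every object in the domain satisfies phi.
objectual = all(phi(d) for d in domain)              # False: the unnamed 2 is a counterexample

print(substitutional, objectual)  # True False
```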
So which of these intuitions is correct? Both treatments are expressively equivalent. Some say that the latter is more elegant since it avoids machinery the former needs (interpretations of variables and the coincidence lemma). On the other hand, Tarski style semantics invites a natural generalization of the first-order quantifiers, resulting in so-called Lindström quantifiers (higher-order relations), which are useful for questions of axiomatizability. See also the thread initiated by Peter Smith's post: Two styles of semantics for a first-order language: what's to choose?