How do proof verifiers work?

I'm currently trying to understand the concepts and theory behind some of the common proof verifiers out there, but am not quite sure about the exact nature and construction of the sorts of systems/proof calculi they use. Are they essentially based on higher-order logics that use Henkin semantics, or is there more to it? As I understand it, extending Henkin semantics to higher-order logic does not render the formal system any less sound, though I am not too clear on that.

Though I'm mainly looking for a general answer with useful examples, here are a few specific questions:

  • What exactly is the role of type theory in creating higher-order logics? The same goes for category theory/model theory, which I believe are alternatives.
  • Is extending a) natural deduction, b) sequent calculus, or c) some other formal system the best way to go for creating higher-order logics?
  • Where does typed lambda calculus come into proof verification?
  • Are there any other approaches than higher-order logic to proof verification?
  • What are the limitations/shortcomings of existing proof verification systems (see below)?

The Wikipedia pages on proof verification programs such as HOL Light, Coq, and Metamath give some idea, but these pages contain limited/unclear information, and there are rather few specific high-level resources elsewhere. There are so many variations on the formal logics/systems used in proof theory that I'm not sure what the base ideas of these systems are - what is required or optimal, and what is open to experimentation.

Perhaps a good way of answering this, certainly one I would appreciate, would be a brief guide (albeit with some technical detail/specifics) on how one might go about building a complete proof calculus (proof verification system) from scratch. Any other information in the form of explanations and examples would be great too, however.


I'll answer just part of your question: I think the other parts will become clearer based on this.

A proof verifier is essentially a program that takes one argument, a representation of a proof, and checks that it is properly constructed: it says OK if it is, and otherwise either fails or points out which step is invalid.

In principle, the proof representation could just be a sequence of formulae in a Hilbert system: all logics (at least, first-orderisable logics) can be represented in such a way. You don't even need to say which rule is applied at each step, since it is decidable whether a given formula follows from earlier formulae by some rule application.
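As a minimal illustration of this idea, here is a sketch of such a checker in Python (my own toy encoding, not the format of any existing verifier). To keep it short, lines justified as axioms are taken on trust rather than matched against axiom schemas, and each step names its justification explicitly:

```python
# Toy Hilbert-style proof checker (a sketch, not any real system's format).
# A formula is an atom (a string) or a tuple ("->", antecedent, consequent).
# A proof is a list of steps, each one of:
#   ("axiom", formula)  -- accepted on trust in this sketch
#   ("mp", i, j)        -- modus ponens from earlier steps i and j

def check(proof):
    """Return the list of formulas proved, or raise ValueError."""
    proved = []
    for n, step in enumerate(proof):
        if step[0] == "axiom":
            proved.append(step[1])
        elif step[0] == "mp":
            i, j = step[1], step[2]
            if not (i < n and j < n):
                raise ValueError(f"step {n}: forward reference")
            a, b = proved[i], proved[j]
            # b must be an implication whose antecedent is exactly a
            if b[0] != "->" or b[1] != a:
                raise ValueError(f"step {n}: modus ponens does not apply")
            proved.append(b[2])
        else:
            raise ValueError(f"step {n}: unknown justification")
    return proved

# Deriving q from the premises p and p -> q (premises entered as axiom steps):
steps = [("axiom", "p"),
         ("axiom", ("->", "p", "q")),
         ("mp", 0, 1)]
print(check(steps)[-1])  # q
```

A full checker would additionally pattern-match "axiom" lines against the axiom schemas of the calculus; the point of the sketch is only that checking a fully explicit proof is a simple mechanical traversal.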

In practice, though, the proof representations are more complex. Metamath is rather close to Hilbert systems, but has a rich set of rules. Coq and LF use (different) typed lambda calculi with definitions to represent the proof steps, which are computationally quite expensive to check (IIRC, both are PSPACE-hard). And the proof verifier can do much more: Coq, for example, allows ML programs to be extracted from proofs.


I don't think the people working in higher-order theorem proving really care about Henkin semantics, or models in general; they mostly work with their proof calculi. As long as there are no contradictions or other counterintuitive theorems, they are happy. The most important and most difficult theorem they prove is usually that their proof terms terminate, which IIRC can be viewed as a form of soundness.

Henkin semantics is most interesting for people trying to extend their first-order methods to higher-order logic, because it behaves essentially like the model theory of first-order logic. Henkin semantics is somewhat weaker than what you would get with standard set-theoretic semantics, which by Gödel's incompleteness theorem cannot have a complete proof calculus. I think type theories should lie somewhere in between Henkin and standard semantics.

Where does typed lambda calculus come into proof verification?

To prove some implication P(x) --> Q(x) with some free variables x, you need to map any proof of P(x) to a proof of Q(x). Syntactically, such a map (a function) can be represented as a lambda term.
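This is the Curry-Howard correspondence in miniature. As a sketch (the pair encoding of conjunction proofs is my own illustration, not any verifier's format), a proof of (P and Q) --> (Q and P) is literally a function:

```python
# Curry-Howard sketch: a proof of an implication is a function mapping
# proofs of the antecedent to proofs of the consequent. Here a proof of a
# conjunction P and Q is encoded (illustratively) as a pair of proofs.

def swap(proof_of_p_and_q):
    """A proof of (P and Q) --> (Q and P): the function that swaps the pair."""
    p, q = proof_of_p_and_q   # a proof of a conjunction is a pair of proofs
    return (q, p)             # the swapped pair proves Q and P

# In lambda notation this is the term \(p, q). (q, p), and its type
# P * Q -> Q * P is exactly the theorem it proves.
print(swap(("proof of P", "proof of Q")))  # ('proof of Q', 'proof of P')
```

In a system like Coq the type checker plays the role of the proof verifier: checking that a lambda term has the claimed type is checking that the proof proves the claimed theorem.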

Are there any other approaches than higher-order logic to proof verification?

You can also verify proofs in first-order or any other logic, but then you would lose much of the power of the logic. First-order logic is mostly interesting because it is possible to automatically find proofs, as long as they are not too complicated. The same applies even more to propositional logic.
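For illustration: propositional logic is decidable, so in the simplest case "finding a proof" can be replaced by a brute-force semantic check. Here is a truth-table validity checker (my own encoding as a sketch; real provers use SAT/resolution techniques instead):

```python
# Truth-table tautology checker for propositional formulas, encoded as
# atoms (strings) or tuples ("not", f), ("and", f, g), ("or", f, g),
# ("->", f, g). A sketch; exponential in the number of atoms.
from itertools import product

def atoms(f):
    """Collect the atom names occurring in a formula."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def evalf(f, v):
    """Evaluate formula f under the valuation v (a dict atom -> bool)."""
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == "not":
        return not evalf(f[1], v)
    if op == "and":
        return evalf(f[1], v) and evalf(f[2], v)
    if op == "or":
        return evalf(f[1], v) or evalf(f[2], v)
    if op == "->":
        return (not evalf(f[1], v)) or evalf(f[2], v)
    raise ValueError(f"unknown connective {op}")

def tautology(f):
    """True iff f holds under every valuation of its atoms."""
    names = sorted(atoms(f))
    return all(evalf(f, dict(zip(names, vals)))
               for vals in product([False, True], repeat=len(names)))

print(tautology(("->", "p", ("or", "p", "q"))))  # True
print(tautology(("->", ("or", "p", "q"), "p")))  # False
```

This is a decision procedure rather than a proof verifier: it answers yes/no without producing a checkable proof object, which is exactly the trade-off the paragraph above alludes to.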

What are the limitations/shortcomings of existing proof verification systems (see below)?

The more powerful the logic becomes, the harder it becomes to construct proofs.

Since the systems are freely available I suggest you play with them, e.g. Isabelle and Coq for a start.


ML and the like were designed as (meta) programming languages to support the creation of backward-chaining proof search and tactics. But proof verification is usually forward chaining, and a rather trivial task once the proof has already been formally presented, since formalization and mechanical verifiability usually go hand in hand.

In the simplest case you can do it with Prolog. Here is an example. Assume you are working with a deduction system based on single-consequent sequents, and that you have the following two inference rules:

 -------- (id)
 A |- A

 G |- forall x: A(x)
 ------------------- (inst)
 G |- A(t)

Then you can code these rules as Prolog predicates as follows:

% id(+Prop, -Seq)
id(A,(A :- A)).

% inst(+Seq, +Term, -Seq)
inst((G :- forall(X^A)), T, (G :- B)) :- call(X^(A=B),T).

Now assume you want to verify the following proof:

 --------------------------------- (id)
forall x: p(x) |- forall x : p(x)
--------------------------------- (inst)
forall x: p(x) |- p(a)

Then you simply run the following Prolog query:

?- id(forall(X^p(X)),S), inst(S,a,R).
R = (forall(X^p(X)) :- p(a)).

A reason to use types here would be their capability to model the proven theorems: the construction of a typed term mimics the construction of the proof.

Here we didn't use types, so we had to supply the parameters of the proof manually, i.e. forall(X^p(X)) for the id predicate and a for the inst predicate in the example above.

Types and their construction would give you an integral representation, which would also be very amenable to Prolog, but I won't show that here.

Bye

P.S.: The above assumes some higher-order system predicates, namely call/2 and ^/3. Nowadays many Prolog systems have call/2, but you might need to define ^/3 yourself.

You can do it as follows:

^(X,A,Y) :- term_variables(A,L), remove_variable(L,X,R),
    copy_term(R-X-A,R-Y-B), call(B).

remove_variable([],_,[]).
remove_variable([X|Y],Z,Y) :- X==Z, !.
remove_variable([X|Y],Z,[X|T]) :- remove_variable(Y,Z,T).

The above is sound as long as T is not higher-order, and as long as you use distinct bound variables which are also distinct from globally free variables.

It also makes use of term_variables/2, which is more or less standard now in many Prolog systems. For more information about call/2 and term_variables/2 see, for example: http://www.complang.tuwien.ac.at/ulrich/iso-prolog/dtc2