Is there a connection between length of sentence and length of proof?

My basic question is: "Do longer tautologies take longer to prove?" But obviously this is underdetermined. If you are allowed an inference rule "Tautological Implication" then any tautology has a 1 line proof.

But let's say we're working in natural deduction, is it true that longer tautologies (tautologies with more letters and connectives in them) take more proof steps?

But this is still not quite a good enough question, since we can add superfluous steps to a proof and still prove the result. So the question is:

In Natural Deduction, is the shortest proof of a tautology related to its length? Or perhaps: does a shorter tautology always admit a shorter proof?

And how robust is this result across different sets of connectives/inference rules?


Solution 1:

This question is one of the main topics studied in proof complexity. The measure you are talking about is called the number of steps in the proof or proof-length.

The size of a tautology is related to the number of steps of its shortest proof, but this does not mean that the number of steps is a monotone function of size of tautology. As mentioned by others, the length of a long tautology can be very short. Coming up with arbitrary long tautologies having proofs with constant (i.e. independent from the size of tautology) number of steps is not difficult.

On the other hand, the task of coming up with tautologies with large number of steps is not obvious and is open for many propositional proof systems. Note that it follows from the (constructive) proof of completeness theorem that every tautology of size $s$ has a proof of height $O(s)$ with $O(2^s)$ steps and size $O(s2^s)$.

We have some very weak lowerbounds for maximum number of steps needed for a tautology of size $s$ and the question that this upperbound is tight is wide open for sequent calculus and also for natural deduction which are essentially equivalent to Hilbert style proof systems. The bound is tight if we remove cut rule from sequent calculus, i.e. there are tautologies which require exponential number of steps.

You can find out more about the relation between different measure studied in proof complexity and the relation between different propositional proof systems by checking surveys/books on the topic.

For first order logic, note that the first order logic is undecidable, i.e. given a formula there is no algorithm to check if it is a valid formula. Any computable upperbound on the proof size would give an algorithm for deciding first order logic, check all possible proofs up to that size. (Note that an upper bound on proof length in natural deduction/sequent calculus will give an upper bound on the size because of normalization/cut elimination). Since there is no such algorithm, there cannot be any computable upperbound.

Solution 2:

The size of proofs is almost totally unrelated to the size of the statements they prove (in the worst case).

Lower Bound: constant size

The size of the smallest proof for statements of size N is constant. Just make your statements of the form "True or S". (Unless you count the input statement as part of the size, in which case it is linear.)

Upper Bound: incomputable size

The size of the largest proof proving a statement of size less than N grows so fast it is not computable from N.

Proof

Consider the size of a proof that a Turing Machine does not halt within S steps. Some turing machines run forever but can't be proven to run forever. Suppose you have such a machine M. Proving M doesn't halt before S steps must take more steps as S grows. Otherwise we have a proof for all S and have contradicted ourselves.

Lets call that number-of-steps to proof-size function G. I am going to make one assumption, which is that there is a computable function G' such that G(G'(X)) >= X.

Suppose you have some computable function F you claim is an asymptotic upper bound on proof sizes given the statement size. Then I just generate a sequence of statements where the Nth statement is "M does not halt within G'(N*F(N)) steps". The statements are all provable (worst case: simulate M in the proof; that's why I assumed G' is computable) and grow in size logarithmically. But the proof sizes grow asymptotically faster than F, which is exponentially faster than F with respect to the statement sizes. So F is not actually an asymptotic upper bound.

Therefore no computable function is an asymptotic upper bound on proof size given the statement size, which means the upper bound is incomputable.

Solution 3:

does a shorter tautology always admit a shorter proof?

it seems like there should be a tautology whose statement is longer than fermats last theorem but whose shortest proof is shorter