A subset interpretation (with context morphisms) of the sequent calculus for predicate logic

The previous two posts used the sequent calculus with a subset interpretation:

We work with sequents A_{1},\ldots,A_{r}\vdash B_{1},\ldots,B_{s}, and interpret the propositions A_i (and B_j) as subsets of some universe set X. We interpret the sequent itself as [A_{1}]\cap\ldots\cap [A_{r}]\ \subseteq\ [B_{1}]\cup\ldots\cup [B_{s}].
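This interpretation can be checked mechanically on finite sets. Here is a minimal sketch (the helper `sequent_holds` and the concrete sets are my own, not from the posts):

```python
# Subset interpretation of a sequent A_1, ..., A_r |- B_1, ..., B_s:
# the sequent holds iff the intersection of the [A_i] is contained
# in the union of the [B_j] (all subsets of a universe set X).

def sequent_holds(antecedents, succedents, universe):
    """antecedents, succedents: lists of sets (subsets of universe)."""
    left = set(universe)
    for a in antecedents:
        left &= a          # intersection of the left side
    right = set()
    for b in succedents:
        right |= b         # union of the right side
    return left <= right

X = {1, 2, 3, 4}
A = {1, 2}   # [A]
B = {2, 3}   # [B]
print(sequent_holds([A, B], [{2}], X))   # {1,2} ∩ {2,3} = {2} ⊆ {2}: True
print(sequent_holds([A], [B], X))        # {1,2} ⊄ {2,3}: False
```

Note that an empty left side makes `left` the whole universe, matching the convention that the empty intersection is X.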

While writing the previous post, there was a temptation to introduce the sequent calculus rules for existential and universal quantification. But the subset interpretation with a fixed universe set X didn’t seem helpful for free variables and quantification. What if we allow different universe sets for different sequents? One way would be to augment the sequent rules so that they change the universe sets in a well-defined way. However, the cut rule and two of the four quantification rules (those that replace an arbitrary term by a quantified variable) are a bit problematic: for them, the universe can sometimes be shrunk, since a formula or a term was eliminated. Therefore, we instead add two separate rules for changing the universe, and apply the sequent rules themselves only with respect to a fixed universe set.

This is the last planned post on the sequent calculus for the moment. (There might be an additional post in this series, extending the subset interpretation to natural deduction.) Since this post will remain vague and unsatisfactory in many ways, a list of related online materials on the sequent calculus seems appropriate: I often consulted the Wikipedia article while writing these posts. (I never consulted Sequent Calculus Primer, but it looks nice and has exercises.) This MathOverflow question on introductions to sequent calculus suggests the chapter on Sequent Calculus from Frank Pfenning’s manuscript on Automated Theorem Proving, which I really liked. It also suggests Proofs and Types by Jean-Yves Girard, which was my first introduction to the sequent calculus. It left me unsatisfied, and I agree with Charles Stewart:

A warning: Girard’s style is a little slippery, and it is common for students to say they have read it, who turn out to have absorbed the opinions but little of the results.

I am still slightly ashamed that one of my first questions – when I met a group of academic logicians (with a focus on proof assistants) – was whether they found Girard easy to understand. Girard claims that the sequent calculus is generally ignored by computer scientists, an exception being Logic for Computer Science: Foundations of Automatic Theorem Proving by Jean H. Gallier. A nice text, but who has time to read those 500+ pages? This is one reason why I like An Introduction to Proof Theory by Samuel R. Buss.

The sequent calculus rules for predicate logic

We are interested in normal first-order predicate logic with predicates and functions. A signature with a finite number of predicate and function symbols is given, which includes the arity of each symbol. The sequent calculus rules for quantification are:

Left quantification rules Right quantification rules
\begin{array}{c} \Gamma,A[t/x]\vdash\Delta\\ \hline \Gamma,\forall xA\vdash \Delta \end{array}(\forall L) \begin{array}{c} \Gamma\vdash A[t/x],\Delta\\ \hline \Gamma\vdash \exists xA,\Delta \end{array}(\exists R)
\begin{array}{c} \Gamma,A[y/x]\vdash\Delta\\ \hline \Gamma,\exists xA\vdash \Delta \end{array}(\exists L) \begin{array}{c} \Gamma\vdash A[y/x],\Delta\\ \hline \Gamma\vdash \forall xA,\Delta \end{array}(\forall R)

Those rules look simple, but what do they mean? What is the difference between t and y? And how do we interpret a sequent whose formulas contain free variables? First, y is a free variable not occurring in \Gamma, \Delta, or A. And A[y/x] means that each free occurrence of x in A is replaced by y. Next, t is an arbitrary term, with no restrictions at all on the variables occurring in it. A term is just any reasonable combination of function symbols (from the signature) and variables, like f(x,g(f(y,x),y)), where both f and g are binary function symbols.

One could answer that there are many possible interpretations of predicate logic, but that answer just avoids the question. The interpretation of a formula or sequent with free variables can be controversial. My previous effort at a subset-based interpretation of predicate logic compatible with non-classical logic left out equality, functions, and constants. Even though we are talking only about classical logic here, the goal of the following interpretation is to also be applicable more generally. That interpretation will follow naturally once we clarify which models we want to allow.

Models and interpretations of predicate calculus

For the typical interpretations, a model consists of a set U (which is the universe where the variables live), a subset of U^k for each k-ary predicate symbol, and a function f:U^k \to U for each k-ary function symbol. A straightforward interpretation in that context is that a formula with k free variables is a subset of U^k, and a formula without free variables (i.e. a sentence) is either true or false. And for any set of k variables, we can interpret any formula whose free variables are included in that set as a subset of U^k. A sequent in this interpretation is true iff, when the formulas are interpreted relative to the set of all free variables occurring in the sequent, the intersection of the subsets on the left side is contained in the union of the subsets on the right side.
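In a finite model this straightforward interpretation can be computed directly. A small sketch (the model and variable names are mine, chosen for illustration): U = \{0,1,2\} with one binary predicate, "less or equal".

```python
from itertools import product

# Tiny finite model: U = {0, 1, 2}, one binary predicate "le".
U = [0, 1, 2]
le = {(a, b) for a in U for b in U if a <= b}

# A formula with free variables (x, y) is a subset of U^2:
le_xy = {(x, y) for (x, y) in product(U, U) if (x, y) in le}

# Quantifying over x leaves a formula in the single free variable y,
# i.e. a subset of U^1:
forall_x_le = {(y,) for y in U if all((x, y) in le for x in U)}  # ∀x. x ≤ y

# A sentence is a subset of U^0 = {()}: {()} means true, ∅ means false.
exists_y_all = {()} if any(all((x, y) in le for x in U) for y in U) else set()

print(forall_x_le)   # only y = 2 satisfies ∀x. x ≤ y
print(exists_y_all)  # the sentence ∃y∀x. x ≤ y is true
```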

This straightforward interpretation above is already a subset interpretation, but we want a slightly more general subset interpretation. For propositional logic, the possible models were given by a set X and a subset of X for each propositional variable A_i. In predicate logic, the 0-ary predicates take the role of the propositional variables, so we want them to be subsets of X. For our generalised interpretation, a model consists of a set X (whose subsets are the truth values) and a set U (which is the universe where the variables live), a subset of X \times U^k for each k-ary predicate symbol, and a function f:X \times U^k \to U for each k-ary function symbol.

The way X will enter into the model is to have some implicit context \xi\in X such that formulas like P(x, f(y, z), z) in the formal syntax become P(\xi, x, f(\xi, y, z), z) in the model. So for any set of k free variables, we can interpret any formula whose free variables are included in that set as a subset of X\times U^k. For a given set of variables and a sequent whose formulas have their free variables contained in that set, the sequent is true iff, when the formulas are interpreted relative to that set of variables, the intersection of the subsets on the left side is contained in the union of the subsets on the right side.

(If we have a formula like x+y = y+x, we would like to interpret it as a subset of U^2, so we normally use X=\{()\} and then just omit it. Here the element () is called unit, the set \{()\} represents truth, and the empty set \emptyset represents falsehood.)

Context morphisms give life to the subset interpretation

If we have (a sequent of) subsets A_i over X, then a (partial) function p : Y \to X gives us (a sequent of) subsets over Y by taking inverse images p^{-1}(A_i) of the subsets. We have

  • A \subseteq B \Rightarrow p^{-1}(A) \subseteq p^{-1}(B)
  • p^{-1}(A\cap B)=p^{-1}(A)\cap p^{-1}(B)
  • p^{-1}(A\cup B)=p^{-1}(A)\cup p^{-1}(B)

and hence

[A_{1}]\cap\ldots\cap [A_{r}]\ \subseteq\ [B_{1}]\cup\ldots\cup [B_{s}]
\Downarrow
p^{-1}([A_{1}])\cap\ldots\cap p^{-1}([A_{r}])\ \subseteq\ p^{-1}([B_{1}])\cup\ldots\cup p^{-1}([B_{s}])
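These facts are easy to check on a small example (a sketch; the `preimage` helper and the concrete sets are made up for illustration):

```python
def preimage(p, A, Y):
    """Inverse image of A under a (possibly partial) function p: Y -> X,
    given as a dict; elements of Y where p is undefined are never included."""
    return {y for y in Y if y in p and p[y] in A}

Y = {10, 11, 12, 13}
p = {10: 1, 11: 1, 12: 2, 13: 3}   # total here; drop keys to make it partial
A, B = {1, 2}, {2, 3}

# Inverse images preserve intersection and union exactly:
assert preimage(p, A & B, Y) == preimage(p, A, Y) & preimage(p, B, Y)
assert preimage(p, A | B, Y) == preimage(p, A, Y) | preimage(p, B, Y)
# ... and are monotone: A ⊆ A∪B implies p⁻¹(A) ⊆ p⁻¹(A∪B).
assert preimage(p, A, Y) <= preimage(p, A | B, Y)
print("inverse images preserve ∩, ∪ and ⊆")
```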

The direct translation into the language of sequents reads

\begin{array}{c} A_{1}(\xi),\ldots,A_{r}(\xi)\vdash B_{1}(\xi),\ldots,B_{s}(\xi)\\ \hline A_{1}(p(\nu)),\ldots,A_{r}(p(\nu))\vdash B_{1}(p(\nu)),\ldots,B_{s}(p(\nu)) \end{array}

This just tells us that global substitution is unproblematic. (But r\geq 1 is required in case p is a partial function, which is easily missed.) However, the more typical function p for our purpose is similar to p(u,v,w)=(u,v), i.e. a function from Y=U^3 to X=U^2. It is typical, since the expressions grow in complexity for most rules of the sequent calculus. Here is an example:

\begin{array}{c} \begin{array}{l}A\vdash A:\{a\}\\ \hline A\vdash A:\{(a,b)\} \end{array} \qquad \begin{array}{l}B\vdash B:\{b\}\\ \hline B\vdash B:\{(a,b)\}\end{array}\\ \hline A,B\vdash A\land B:\{(a,b)\}\end{array}

This example also shows how we can change the universe set separately, and apply the sequent rules only with respect to a fixed universe set. After applying the (\exists L) or (\forall R) rule, we can shrink the universe set. To illustrate, let us first apply the (\forall L) and (\exists R) rules, where we cannot always shrink the universe set:

\begin{array}{r} P(x) \vdash P(x) : X\times U\\ \hline \forall xP(x) \vdash P(x) : X\times U \end{array}

Here we cannot shrink the universe set.

\begin{array}{l} \forall xP(x) \vdash P(x) : X\times U\\ \hline \forall xP(x) \vdash \exists x P(x) : X\times U \end{array}

Now we can shrink the universe set, but only after checking that the sequent no longer depends on x. But when we apply (\forall R), we can always shrink the universe set:

\begin{array}{l} \forall xP(x) \vdash P(x) : X\times U\\ \hline \forall xP(x) \vdash \forall x P(x) : X\times U\\ \hline \forall xP(x) \vdash \forall x P(x) : X \end{array}

How can we justify this shrinking in terms of images or inverse images? The inverse image under p:X \to X\times U, \xi \to (\xi, c(\xi)) for an arbitrary function (constant) c(\xi) could be a justification. But we don’t really write [\forall xP(x)](c) \vdash [\forall x P(x)](c), and c is arbitrary and irrelevant. So the image under F:X\times U \to X, (\xi, u) \to \xi might be a nicer justification. But we only have

  • A \subseteq B \Rightarrow F(A) \subseteq F(B)
  • F(A\cap B)\subseteq F(A)\cap F(B)
  • F(A\cup B)=F(A)\cup F(B)

However, if we assume F^{-1}(F([A_i]))=[A_i] for each i, then F([A_{1}]\cap\dots\cap [A_{r}])=F([A_{1}])\cap\dots\cap F([A_{r}]) and hence

[A_{1}]\cap\ldots\cap [A_{r}]\ \subseteq\ [B_{1}]\cup\ldots\cup [B_{s}]
\Downarrow
F([A_{1}])\cap\ldots\cap F([A_{r}])\ \subseteq\ F([B_{1}])\cup\ldots\cup F([B_{s}])
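A small sketch (my own example) of why the image direction needs the extra assumption F^{-1}(F([A_i]))=[A_i]: for the projection F:X\times U \to X, images only preserve intersections laxly, unless the subsets are unions of full fibers.

```python
def image(F, A):
    return {F[a] for a in A}

# F : X×U -> X, (xi, u) -> xi, on a tiny example X = {0,1}, U = {'a','b'}.
F = {(x, u): x for x in (0, 1) for u in ('a', 'b')}

A = {(0, 'a')}
B = {(0, 'b')}
# A ∩ B is empty, yet F(A) ∩ F(B) = {0}: in general only F(A∩B) ⊆ F(A)∩F(B).
assert image(F, A & B) == set()
assert image(F, A) & image(F, B) == {0}

# If A is a union of fibers, i.e. F⁻¹(F(A)) = A, the inclusion is an equality.
A2 = {(0, 'a'), (0, 'b')}                          # full fiber over 0
B2 = {(0, 'a'), (0, 'b'), (1, 'a'), (1, 'b')}      # all fibers
assert image(F, A2 & B2) == image(F, A2) & image(F, B2) == {0}
print("equality holds for fiber-saturated subsets")
```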

Doubts about context morphisms and substitutions

Let us apply the above to propositional logic. Surely, from A,B\vdash A\land B we can deduce C\lor D,C\to D\vdash (C\lor D)\land (C\to D). But we cannot justify it from A,B\vdash A\land B:\{(a,b)\} in terms of inverse images. Can we justify it from A,B\vdash A\land B:X_a\times X_b? Yes, take the map p:X_c\times X_d\to X_a\times X_b with (c,d) \to ([C\lor D](c,d), [C\to D](c,d)). Here X_c and X_d can be arbitrary and X_a=X_b=\{0,1\}. Note that we only need |X_a|, |X_b| \geq 2.
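This map p can be written out on truth values (a sketch; here 1 plays the role of membership, and X_a=X_b=\{0,1\} as above):

```python
# Interpret propositional formulas as subsets of X_c × X_d via the map
# p(c, d) = ([C∨D](c,d), [C→D](c,d)), with truth values in {0, 1}.
Xc, Xd = (0, 1), (0, 1)

def p(c, d):
    return (c | d, (1 - c) | d)   # ([C∨D], [C→D])

# [A∧B] over X_a × X_b is {(1,1)}; its inverse image under p:
pullback = {(c, d) for c in Xc for d in Xd if p(c, d) == (1, 1)}

# Direct interpretation of (C∨D) ∧ (C→D) over X_c × X_d:
direct = {(c, d) for c in Xc for d in Xd if (c | d) and ((1 - c) | d)}

assert pullback == direct
print(sorted(pullback))   # the pairs where both C∨D and C→D hold
```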

How justified are those substitutions into the expressions? Here, we used

[A\land B]([C\lor D](c,d), [C\to D](c,d))=[(C\lor D)\land (C\to D)](c,d)

Can we justify this? Can such substitutions go wrong? Yes, if p is a partial function:

\begin{array}{c} \top\vdash B(x)\\ \hline [\top](p(y))\vdash [B(\cdot)](p(y))\\ \hline [\top](y)\vdash [B(p(\cdot))](y)\\ \hline \top\vdash B(p(y)) \end{array}

Here [\top](p(y)) = [\top](y) fails, but it would work if p were a total function. We could have omitted \top, but then it would fail because of r=0. That was actually the point where I noticed that the restriction r\geq 1 is required for partial functions.
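The failure for a partial p can also be seen concretely (a sketch; the inverse image is taken along a dict that is undefined on part of Y):

```python
def preimage(p, A, Y):
    """Inverse image along a partial function p (a dict): points of Y
    where p is undefined are dropped."""
    return {y for y in Y if y in p and p[y] in A}

X = {0, 1}
Y = {10, 11, 12}
p = {10: 0, 11: 1}   # partial: undefined at 12

top_X = X            # [⊤] over X is all of X
top_Y = Y            # [⊤] over Y is all of Y

# [⊤](p(y)) is the preimage of [⊤] under p, which misses 12:
assert preimage(p, top_X, Y) == {10, 11}
assert preimage(p, top_X, Y) != top_Y   # so [⊤](p(y)) ≠ [⊤](y) for partial p
print("a partial p shrinks the preimage of truth")
```

With at least one antecedent (r \geq 1), both sides of the sequent get restricted to the domain of p, so the inclusion still transfers.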

The substitutions are justified, because they work for each operation individually:

  • [\bot](p(y)) = [\bot](y)
  • [A\land B](p(y)) = [A](p(y))\land [B](p(y))
  • [A\lor B](p(y)) = [A](p(y))\lor [B](p(y))
  • [A-B](p(y)) = [A](p(y))-[B](p(y))
  • [A\oplus B](p(y)) = [A](p(y))\oplus [B](p(y))
  • [\forall z A](p(y)) = \forall z [A](p(y))
  • [\exists z A](p(y)) = \exists z [A](p(y))
  • [A(\cdot)](p(y)) = [A(p(\cdot))](y)

Treating equality without special rules

Equality can be given by a binary predicate which is a congruence relation for the functions, and compatible with the other predicates. There are different ways to treat equality, and it is important for most interesting applications of predicate logic. A canonical approach is to write the axioms on the left side of the sequent, and use normal deductions. Concrete examples of the required axioms were already given in the previous post, in the section “Axiomatic theories in predicate logic without falsehood”, for the theory of groups and for Robinson arithmetic.

Part of the reason for presenting a subset interpretation for predicate logic was to better understand how predicate logic could work in case of non-classical logics. There are theories like Heyting arithmetic (and Gentzen’s own work), which seem to fit nicely with the interpretation above. But there are also intuitionistic dependent type theories with special rules and equality types, whose possible interpretations remain unclear to me, despite the nice subset interpretation explained above. Maybe Per Martin-Löf is right: “In the end, everybody must understand for himself” (p. 166).

Conclusions?

This post was hard for me to write. This is the classical problem when explaining something obvious (the predicate calculus): you have to use a lot of words for something which seems obvious, while the language to properly write down what is going on behind the scenes is missing. I didn’t solve that language problem, but I came up with a natural example of how the obvious thing can fail to be true. That might help the reader to understand for himself.

The content above already became clear to me while writing the previous post. But only after this current post was nearly finished did I notice the restriction r\geq 1 in case p is a partial function. Which is funny: after I clarified the notion of context morphism, which makes sense of the logic without truth from the first post (in this series), I found out that there is an additional restriction, namely that the antecedent should never be empty. My first reaction was that I don’t like this. Maybe I should look for a transformation which turns this logic about partial functions into the logic without negation or falsehood. But my second reaction was that it could make sense; perhaps any logical deduction needs assumptions.

Still the question remains whether there is a reasonable transformation from the logic without truth (and partial functions) to the logic without negation and falsehood. Probably that question will be easy to answer. Somehow one has to take complements, and hope that the structure of the predicate calculus remains intact. It seems to work, but I still have trouble understanding it intuitively. And I want to finish this post now.


About gentzen

Logic, Logic, and Logic
This entry was posted in logic, partial functions. Bookmark the permalink.
