Goal-Oriented Conjecturing for Isabelle/HOL

We present PGT, a Proof Goal Transformer for Isabelle/HOL. Given a proof goal and its background context, PGT attempts to generate conjectures from the original goal by transforming the original proof goal. These conjectures should be weak enough to be provable by automation but sufficiently strong to identify and prove the original goal. By incorporating PGT into the pre-existing PSL framework, we exploit Isabelle's strong automation to identify and prove such conjectures.




1 Introduction

Consider the following two reverse functions defined in the literature [tutorial]:

  primrec itrev :: "'a list => 'a list => 'a list" where
   "itrev [] ys = ys" | "itrev (x#xs) ys = itrev xs (x#ys)"
  primrec rev :: "'a list => 'a list" where
   "rev [] = []" | "rev (x # xs) = rev xs @ [x]"

How would you prove their equivalence "itrev xs [] = rev xs"? Induction comes to mind. However, it turns out that Isabelle’s default proof methods, induct and induct_tac, are unable to handle this proof goal effectively.
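To make the two definitions concrete, here is a minimal Python sketch that mirrors them on Python lists. The Python names and the bounded exhaustive check are assumptions made for illustration and are not part of Isabelle or PSL. The check only tests small inputs and proves nothing.

```python
from itertools import product

def itrev(xs, ys):
    """Mirror of itrev: move the elements of xs onto the accumulator ys."""
    if not xs:
        return ys
    return itrev(xs[1:], [xs[0]] + ys)

def rev(xs):
    """Mirror of rev: reverse the tail, then append the head."""
    if not xs:
        return []
    return rev(xs[1:]) + [xs[0]]

def small_lists(max_len=3, alphabet=(0, 1)):
    """All lists over a tiny alphabet up to max_len (illustrative bound)."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield list(tup)

# The goal from the text, checked on small inputs only.
goal_holds = all(itrev(xs, []) == rev(xs) for xs in small_lists())

# The same statement with the accumulator left arbitrary.
generalized_holds = all(
    itrev(xs, ys) == rev(xs) + ys
    for xs in small_lists()
    for ys in small_lists()
)
```

The accumulator of itrev changes in every recursive call, so an induction hypothesis stated only for the empty accumulator does not apply to the recursive call. A statement that leaves the accumulator arbitrary does not have this problem.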

Previously, we developed PSL [psl], a programmable, meta-tool framework for Isabelle/HOL. With PSL one can write the following strategy for induction:

  strategy DInd = Thens [Dynamic (Induct), Auto, IsSolved]

PSL’s Dynamic keyword creates variations of the induct method by specifying different combinations of promising arguments found in the proof goal and its background proof context. Then, DInd combines these induction methods with the general purpose proof method, auto, and is_solved, which checks if there is any proof goal left after applying auto. As shown in Fig. 1(a), PSL keeps applying the combination of a specialization of the induct method and auto until either auto discharges all remaining sub-goals or DInd runs out of variations of the induct method.
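The search behaviour of such a strategy can be sketched with a toy model in Python. Everything here is an assumption made for illustration: proof states are tuples of goal labels, dynamic_induct and auto are stand-ins that know nothing about real proofs, and the set easy is a hypothetical oracle that says which sub-goals the toy auto can discharge. None of this is PSL's actual implementation.

```python
from typing import Callable, Iterator, NamedTuple, Tuple

class State(NamedTuple):
    goals: Tuple[str, ...]    # open sub-goals (toy: just labels)
    script: Tuple[str, ...]   # proof methods applied so far

Strategy = Callable[[State], Iterator[State]]

def thens(*strategies: Strategy) -> Strategy:
    """Sequential composition that backtracks over all intermediate results."""
    def run(state: State) -> Iterator[State]:
        if not strategies:
            yield state
            return
        rest = thens(*strategies[1:])
        for mid in strategies[0](state):
            yield from rest(mid)
    return run

def dynamic_induct(arguments) -> Strategy:
    """Toy Dynamic (Induct): one variation per candidate argument."""
    def run(state: State) -> Iterator[State]:
        for arg in arguments:
            goals = (f"base[{arg}]", f"step[{arg}]")
            yield State(goals, state.script + (f"induct {arg}",))
    return run

def auto(easy_goals) -> Strategy:
    """Toy auto: drops exactly the sub-goals listed in easy_goals."""
    def run(state: State) -> Iterator[State]:
        left = tuple(g for g in state.goals if g not in easy_goals)
        yield State(left, state.script + ("auto",))
    return run

def is_solved(state: State) -> Iterator[State]:
    """Succeeds only if no sub-goal is left."""
    if not state.goals:
        yield state

easy = {"base[ys]", "base[xs]", "step[xs]"}   # hypothetical oracle
d_ind = thens(dynamic_induct(["ys", "xs"]), auto(easy), is_solved)
found = next(d_ind(State(("goal",), ())), None)
```

In this toy run the first variation leaves a sub-goal open, so the search backtracks and succeeds with the second variation. When no variation works, the iterator is empty and found is None.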

This approach works well only if the resulting sub-goals after applying some induct are easy enough for Isabelle’s automated tools (such as auto in DInd) to prove. When proof goals are presented in an automation-unfriendly way, however, it is not enough to set a certain combination of arguments to the induct method. In such cases engineers have to investigate the original goal and come up with auxiliary lemmas, from which they can derive the original goal.

In this paper, we present PGT, a novel design and prototype implementation of a conjecturing tool for Isabelle/HOL (available on GitHub at https://github.com/data61/PSL/releases/tag/v0.1.1; the example of this paper appears in PSL/PGT/Example.thy). We provide PGT as an extension to PSL to facilitate the seamless integration with other Isabelle sub-tools. Given a proof goal, PGT produces a series of conjectures that might be useful in discharging the original goal, and PSL attempts to identify the right one while searching for a proof of the original goal using those conjectures.

2 System Description

2.1 Identifying Valuable Conjectures via Proof Search

To automate conjecturing, we added a new language primitive, Conjecture, to PSL. Given a proof goal, Conjecture first produces a series of conjectures that might be useful in proving the original theorem, following the process described in Section 2.2. For each conjecture, PGT creates a subgoal_tac method and inserts the conjecture as a premise of the original goal. When applied to "itrev xs [] = rev xs", for example, Conjecture generates the following proof method along with 130 other variations of the subgoal_tac method:

  apply (subgoal_tac "!!Nil. itrev xs Nil = rev xs @ Nil")

where !! stands for the universal quantifier in Isabelle’s meta-logic. Namely, Conjecture introduced a variable of name Nil for the constant []. Applying this method to the goal results in the following two new sub-goals:

  1. (!!Nil. itrev xs Nil = rev xs @ Nil) ==> itrev xs [] = rev xs
  2. !!Nil. itrev xs Nil = rev xs @ Nil

Conjecture alone cannot determine which conjecture is useful for the original goal. In fact, some of the generated statements are not even true or provable. To discard these non-theorems and to reduce the size of PSL’s search space, we combine Conjecture with Fastforce (corresponding to the fastforce method) and Quickcheck (corresponding to Isabelle’s sub-tool quickcheck [quickcheck]) sequentially as well as DInd as follows:

  strategy CDInd = Thens [Conjecture, Fastforce, Quickcheck, DInd]

Importantly, fastforce does not return an intermediate proof goal: it either discharges the first sub-goal completely or fails by returning an empty sequence. Therefore, whenever fastforce succeeds on the first sub-goal resulting from subgoal_tac, it guarantees that the conjecture inserted as a premise is strong enough for Isabelle to prove the original goal. In our example, the application of fastforce to the aforementioned first sub-goal succeeds, changing the remaining sub-goals to the following:

  1. !!Nil. itrev xs Nil = rev xs @ Nil

However, PSL still has to deal with many non-theorems: non-theorems are often strong enough to imply the original goal due to the principle of explosion. Therefore, CDInd applies Quickcheck to discard easily refutable non-theorems. The atomic strategy Quickcheck returns the same sub-goal only if Isabelle’s sub-tool quickcheck does not find a counterexample, and returns an empty sequence otherwise.
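The idea of discarding refutable candidates by testing can be sketched in Python as follows. This is a bounded exhaustive search over small lists, written only for illustration. It is not how Isabelle's quickcheck is implemented, and the helper names are hypothetical. The three candidates are statements that appear in this paper, with Nil read as a universally quantified list.

```python
from itertools import product

def itrev(xs, ys):
    return ys if not xs else itrev(xs[1:], [xs[0]] + ys)

def rev(xs):
    return [] if not xs else rev(xs[1:]) + [xs[0]]

# Candidate conjectures as executable predicates over (xs, Nil).
candidates = {
    "itrev xs Nil = rev xs @ Nil": lambda xs, nil: itrev(xs, nil) == rev(xs) + nil,
    "itrev xs Nil = Nil @ rev xs": lambda xs, nil: itrev(xs, nil) == nil + rev(xs),
    "itrev xs Nil = rev xs": lambda xs, nil: itrev(xs, nil) == rev(xs),
}

def find_counterexample(prop, max_len=2, alphabet=(0, 1)):
    """Return the first (xs, Nil) falsifying prop, or None within the bound."""
    lists = [list(t) for n in range(max_len + 1)
             for t in product(alphabet, repeat=n)]
    for xs in lists:
        for nil in lists:
            if not prop(xs, nil):
                return xs, nil
    return None

# Keep only the candidates that survive testing.
survivors = [name for name, prop in candidates.items()
             if find_counterexample(prop) is None]
```

Instantiating Nil with the empty list turns each candidate into the original goal up to simplification, so all three would be strong enough to imply it. Only testing separates the plausible candidate from the two non-theorems here. Surviving such a test is of course no proof, which is why a proof attempt still follows.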

Now we know that the remaining conjectured goals are strong enough to imply the original goal and that they are not easily refutable. Therefore, CDInd applies its sub-strategy DInd to the remaining sub-goals and it stops its proof search as soon as it finds the following proof script, which will be printed in Isabelle/jEdit’s output panel.

    apply (subgoal_tac "!!Nil. itrev xs Nil = rev xs @ Nil")
    apply fastforce apply (induct xs) apply auto done

Fig. 1(b) shows how CDInd narrows its search space in a top-down manner. Note that PSL lets you use other Isabelle sub-tools to prune conjectures. For example, you can use both nitpick [nitpick] and quickcheck: Thens [Quickcheck, Nitpick] in CDInd. It also lets you combine DInd and CDInd into one: Ors [DInd, CDInd].

(a) Search tree of DInd
(b) Search tree of CDInd
Figure 1: PSL’s proof search with/without PGT.

2.2 Conjecturing

Section 2.1 has described how we identify useful conjectures. Now, we will focus on how PGT creates conjectures in the first place. PGT introduced both automatic conjecturing (Conjecture) and automatic generalization (Generalize). Since the conjecturing functionality uses generalization, we will only describe the former. We now walk through the main steps that lead from a user defined goal to a set of potentially useful conjectures, as illustrated in Fig. 2.

Extract constants and common sub-terms from the original goal

Generalize the original goal to produce its generalized versions

Call conjecture for goal oriented conjecturing (Fig. 3) for the original goal and for each generalized version

Clean & return

Figure 2: The overall workflow of Conjecture.

We start with the extraction of constants and sub-terms, continue with generalization, goal oriented conjecturing, and finally describe how the resulting terms are sanitized.

Extraction of Constants and Common Sub-terms.

Given a term representation of the original goal, PGT extracts the constants and sub-terms that appear multiple times in that term. In the example from Section 1, PGT collects the constants rev, itrev, and [].


Generalization.

Now, PGT tries to generalize the goal. Here, PGT alone cannot determine over which constant or sub-term it should generalize the goal. Hence, it creates a generalized version of the goal for each constant and sub-term collected in the previous step. For [] in the running example, PGT creates the following generalized version of the goal: !!Nil. itrev xs Nil = rev xs.
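As a rough illustration of this step, the following Python sketch generalizes a goal over a chosen constant. The nested-tuple term representation, the helper names, and the choice of the variable name are assumptions made for this example. Isabelle's actual term datatype and PGT's implementation differ.

```python
def replace(term, target, new):
    """Replace every occurrence of target in a nested-tuple term."""
    if term == target:
        return new
    if isinstance(term, tuple):
        return tuple(replace(t, target, new) for t in term)
    return term

def generalize(goal, target, var):
    """Abstract over target: replace it by var and bind var with !!."""
    return ("!!", var, replace(goal, target, var))

def show(term, nested=False):
    """Print a toy term: '=' is infix, everything else is application."""
    if isinstance(term, str):
        return term
    head, *args = term
    if head == "!!":
        return f"!!{args[0]}. {show(args[1])}"
    if head == "=":
        return f"{show(args[0])} = {show(args[1])}"
    text = " ".join([head] + [show(a, True) for a in args])
    return f"({text})" if nested else text

# The running example: itrev xs [] = rev xs
goal = ("=", ("itrev", "xs", "[]"), ("rev", "xs"))
generalized = generalize(goal, "[]", "Nil")
```

Calling show(generalized) yields the generalized statement quoted in the text. Repeating the call for every collected constant and sub-term would give one generalized version per candidate.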

Goal Oriented Conjecturing.

This step calls the function conjecture, illustrated in Fig. 3, with the original goal and each of the generalized versions of the goal from the previous step.

Input: the original goal and its generalized versions

Extract constants in the original goal and in its generalized versions

For each constant extracted above, find related constants from the corresponding simp rules

Traverse generalized conjectures and mutate their sub-terms in a top-down manner

Figure 3: The workflow of the conjecture function.

The following code snippet shows part of conjecture:

  fun cnjcts t = flat (map (get_cnjct generalisedT t) consts)
  fun conj (trm as Abs (_,_,subtrm)) = cnjcts trm @ conj subtrm
   |  conj (trm as App (t1,t2)) = cnjcts trm @ conj t1 @ conj t2
   |  conj trm = cnjcts trm

For the original goal and for each of its generalized versions, conjecture first calls conj, which traverses the term structure in a top-down manner. In the running example, PGT takes one of the generalized versions, say !!Nil. itrev xs Nil = rev xs, as an input and applies conj to it.

For each sub-term, the function get_cnjct in cnjcts creates new conjectures by replacing the sub-term (t in cnjcts) in the original goal or its generalized version (generalisedT) with a new term. This term is generated from the sub-term (t) and the constants (consts). These constants are obtained from simplification rules that are automatically derived from the definition of a constant that appears in the corresponding goal or generalized version.

In the example, PGT first finds the constant rev within the generalized goal. Then, PGT finds the simp-rule (rev.simps(2)) relevant to rev, which states rev (?x # ?xs) = rev ?xs @ [?x], in the background context. Since rev.simps(2) uses the constant @, PGT attempts to create new sub-terms using @ while traversing the syntax tree of !!Nil. itrev xs Nil = rev xs in a top-down manner.

When conj reaches the sub-term rev xs, get_cnjct creates new sub-terms using this sub-term, @ (an element in consts), and the universally quantified variable Nil. One of these new sub-terms would be rev xs @ Nil (note that Nil is a universally quantified variable here). Finally, get_cnjct replaces the original sub-term rev xs with this new sub-term in the generalized goal, producing the conjecture: !!Nil. itrev xs Nil = rev xs @ Nil.

Note that this conjecture is not the only conjecture produced in this step: PGT, for example, also produces !!Nil. itrev xs Nil = Nil @ rev xs, by replacing rev xs with Nil @ rev xs, even though this conjecture is a non-theorem. Fig. 4 illustrates the sequential application of generalization in the previous paragraph and goal oriented conjecturing described in this paragraph.

Figure 4: PSL’s sequential generalization and goal oriented conjecturing.
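The traversal and mutation described above can be approximated by the following Python sketch. It is a loose toy model and not PGT's get_cnjct. Terms are nested tuples, the only mutation is to combine a sub-term with the bound variable through a binary constant in both argument orders, and all helper names are hypothetical.

```python
PREC = {"=": 1, "@": 2}   # infix operators; application binds tightest

def show(term, ctx=0):
    """Print a toy term with just enough parentheses to be unambiguous."""
    if isinstance(term, str):
        return term
    head, *args = term
    if head == "!!":
        return f"!!{args[0]}. {show(args[1])}"
    if head in PREC:
        p = PREC[head]
        text = f"{show(args[0], p)} {head} {show(args[1], p)}"
    else:
        p = 3
        text = " ".join([head] + [show(a, p) for a in args])
    return f"({text})" if p <= ctx else text

def subterm_paths(term, path=()):
    """Yield (path, sub-term) pairs top-down, visiting argument positions."""
    yield path, term
    if isinstance(term, tuple):
        for i, arg in enumerate(term[1:], start=1):
            yield from subterm_paths(arg, path + (i,))

def replace_at(term, path, new):
    """Replace the sub-term at the given path, leaving the rest untouched."""
    if not path:
        return new
    i = path[0]
    return term[:i] + (replace_at(term[i], path[1:], new),) + term[i + 1:]

def mutations(sub, consts, var):
    """Toy mutation: combine sub with var via each binary constant."""
    for c in consts:
        yield (c, sub, var)
        yield (c, var, sub)

def conjectures(generalized, consts):
    binder, var, body = generalized
    return [(binder, var, replace_at(body, path, new))
            for path, sub in subterm_paths(body)
            for new in mutations(sub, consts, var)]

generalized = ("!!", "Nil", ("=", ("itrev", "xs", "Nil"), ("rev", "xs")))
results = [show(c) for c in conjectures(generalized, ["@"])]
```

In this toy run, results contains both conjectures discussed in the text along with several candidates that make no sense, such as appending Nil to the whole equation. This blind generation is why a cleaning step is needed afterwards.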

Clean & Return

Most produced conjectures do not even type check. This step removes them as well as duplicates before passing the results to the following sub-strategy (Thens [Fastforce, Quickcheck, DInd] in the example).
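A minimal Python sketch of such a cleaning pass is shown below. The monomorphic toy signature, the type names, and the helpers are assumptions made for illustration. Isabelle's type checking handles polymorphism and much more, and PGT relies on Isabelle rather than on a checker like this one.

```python
# Toy signature: argument types and result type of each constant.
SIG = {
    "rev": (["list"], "list"),
    "itrev": (["list", "list"], "list"),
    "@": (["list", "list"], "list"),
}
VARS = {"xs": "list", "Nil": "list"}

def type_of(term):
    """Return the toy type of a nested-tuple term, or None if ill-typed."""
    if isinstance(term, str):
        return VARS.get(term)
    head, *args = term
    arg_types = [type_of(a) for a in args]
    if head == "=":
        ok = (len(args) == 2 and arg_types[0] is not None
              and arg_types[0] == arg_types[1])
        return "bool" if ok else None
    if head in SIG and SIG[head][0] == arg_types:
        return SIG[head][1]
    return None

def clean(candidates):
    """Drop ill-typed candidates and duplicates, keeping the first of each."""
    seen, kept = set(), []
    for c in candidates:
        if c not in seen and type_of(c) == "bool":
            seen.add(c)
            kept.append(c)
    return kept

good = ("=", ("itrev", "xs", "Nil"), ("@", ("rev", "xs"), "Nil"))
ill_typed = ("@", ("=", ("itrev", "xs", "Nil"), ("rev", "xs")), "Nil")
other = ("=", ("itrev", ("@", "xs", "Nil"), "Nil"), ("rev", "xs"))
cleaned = clean([good, ill_typed, good, other])
```

Here the candidate that appends Nil to an equation is rejected because an equation is not a list, and the repeated candidate is kept only once.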

3 Conclusion

We presented an automatic conjecturing tool PGT and its integration into PSL. Currently, PGT tries to generate conjectures using previously derived simplification rules as hints. We plan to include more heuristics to prioritize conjectures before passing them to subsequent strategies.

Most conjecturing tools for Isabelle, such as IsaCoSy [isacosy] and Hipster [hipster], are based on the bottom-up approach called theory exploration [theoryexploration]. The drawback is that they tend to produce uninteresting conjectures. In the case of IsaCoSy, the user is tasked with pruning these by hand. Hipster uses the difficulty of a conjecture’s proof to measure its usefulness. In contrast, PGT produces conjectures by mutating original goals. Even though PGT also produces unusable conjectures internally, the integration with PSL’s search framework ensures that PGT only presents conjectures that are indeed useful in proving the original goal. Unlike Hipster, which is based on a Haskell code base, PGT and PSL are provided as an Isabelle theory file, which can easily be imported into any Isabelle theory. Finally, unlike Hipster, PGT is not limited to equational conjectures.

Gauthier et al. described conjecturing across proof corpora [tgck-lpar15]. While PGT creates conjectures by mutating the original goal, Gauthier et al. produced conjectures by using statistical analogies extracted from large formal libraries [conjecture].



Figure 5: Screenshot of Isabelle/HOL with PGT.