
Interactive Visualization of Saturation Attempts in Vampire

01/13/2020
by   Bernhard Gleiss, et al.

Many applications of formal methods require automated reasoning about system properties, such as system safety and security. To improve the performance of automated reasoning engines, such as SAT/SMT solvers and first-order theorem provers, it is necessary to understand both the successful and failing attempts of these engines towards producing formal certificates, such as logical proofs and/or models. Such an analysis is challenging due to the large number of logical formulas generated during proof/model search. In this paper we focus on saturation-based first-order theorem proving and introduce the SATVIS tool for interactively visualizing saturation-based proof attempts. We build SATVIS on top of the world-leading theorem prover VAMPIRE, by interactively visualizing the saturation attempts of VAMPIRE in SATVIS. Our work combines the automatic layout and visualization of the derivation graph induced by the saturation attempt with interactive transformations and search functionality. As a result, we are able to analyze and debug (failed) proof attempts of VAMPIRE. Thanks to its interactive visualisation, we believe SATVIS helps both experts and non-experts in theorem proving to understand first-order proofs and analyze/refine failing proof attempts of first-order provers.

1 Introduction

Many applications of formal methods, such as program analysis and verification, require automated reasoning about system properties, such as program safety, security and reliability. Automated reasoners, such as SAT/SMT solvers [1, 5] and first-order theorem provers [9, 13], have therefore become a key backbone of rigorous system engineering. For example, proving properties over the computer memory relies on first-order reasoning with both quantifiers and integer arithmetic.

Saturation-based theorem proving is the leading approach for automating reasoning in full first-order logic. In a nutshell, this approach negates a given goal and saturates its given set of input formulas (including the negated goal), by deriving logical consequences of the input using a logical inference system, such as binary resolution or superposition. Whenever a contradiction (false) is derived, the saturation process terminates reporting validity of the input goal. State-of-the-art theorem provers, such as Vampire [9] and E [13], implement saturation-based proof search using the (ordered) superposition calculus [11]. These provers rely on powerful indexing algorithms, selection functions and term orderings for making saturation-based theorem proving efficient and scalable to a large set of first-order formulas, as evidenced in the yearly CASC system competition of first-order provers [14].

Over the past years, saturation-based theorem proving has been extended to first-order logic with theories, such as arithmetic, theory of arrays and algebraic datatypes [8]. Further, first-class boolean sorts and if-then-else and let-in constructs have also been introduced as extensions to the input syntax of first-order theorem provers [7]. Thanks to these recent developments, first-order theorem provers have become better suited to applications of formal methods, being for example a competitive alternative to SMT solvers [1, 5] in software verification and program analysis. Recent editions of the SMT-COMP (https://smt-comp.github.io/) and CASC system competitions show, for example, that Vampire successfully competes against the leading SMT solvers Z3 [5] and CVC4 [1] and vice-versa.

By leveraging the best practices in first-order theorem proving in combination with SMT solving, in our recent work [3] we showed that correctness of a software program can be reduced to a validity problem in first-order logic. We use Vampire to prove the resulting encodings, outperforming SMT solvers. Our initial results demonstrate that first-order theorem proving is well-suited for applications of (relational) verification, such as safety and non-interference. Yet, our results also show that the performance of the prover crucially depends on the logical representation of its input problem and the deployed reasoning strategies during proof search. As such, users and developers of first-order provers, and automated reasoners in general, typically face the burden of analysing (failed) proof attempts produced by the prover, with the ultimate goal to refine the input and/or proof strategies making the prover succeed in proving its input. Understanding (some of) the reasons why the prover failed is however very hard and requires a considerable amount of work by highly qualified experts in theorem proving, hindering thus the use of theorem provers in many application domains.

In this paper we address this challenge and introduce the SatVis tool to ease the task of analysing failed proof attempts in saturation-based reasoning. We designed SatVis to support interactive visualization of the saturation algorithm used in Vampire, with the goal to ease the manual analysis of Vampire proofs as well as failed proof attempts in Vampire. Inputs to SatVis are proof (attempts) produced by Vampire. Our tool consists of (i) an explicit visualization of the DAG-structure of the saturation proof (attempt) of Vampire and (ii) interactive transformations of the DAG for pruning and reformatting the proof (attempt). In its current setting, SatVis can be used only in the context of Vampire. Yet, by parsing/translating proofs (or proof attempts) of other provers into the Vampire proof format, SatVis can be used in conjunction with other provers as well.

When feeding Vampire proofs to SatVis, SatVis supports both users and developers of Vampire in understanding and refactoring Vampire proofs, and in manually checking the soundness of Vampire proofs. When using SatVis on failed proof attempts of Vampire, SatVis supports users and developers of Vampire in analysing how Vampire explored its search space during proof search, that is, in understanding which clauses were derived and why certain clauses have not been derived at various steps during saturation. By doing so, the SatVis proof visualisation framework gives valuable insights on how to revise the input problem encoding of Vampire and/or implement domain-specific optimizations in Vampire. We therefore believe that SatVis improves the state-of-the-art in the use and applications of theorem proving at least in the following scenarios: (i) helping Vampire developers to debug and further improve Vampire; (ii) helping Vampire users to tune Vampire to their applications, by not treating Vampire as a black-box but by understanding and using its appropriate proof search options; and (iii) helping inexperienced users in saturation-based theorem proving to learn using Vampire and first-order proving in general.

Contributions.

The contribution of this paper comes with the design of the SatVis tool for analysing proofs, as well as proof attempts of the Vampire theorem prover. SatVis is available at:

We overview proof search steps in Vampire specific to SatVis (Section 2), discuss the challenges we faced for analysing proof attempts of Vampire (Section 3), and describe implementation-level details of SatVis 1.0 (Section 4).

Related work.

While standardizing the input format of automated reasoners is an active research topic, see e.g. the SMT-LIB [2] and TPTP [14] standards, coming up with an input standard for representing and analysing proofs and proof attempts of automated reasoners has received so far very little attention. The TSTP library [14] provides input/output standards for automated theorem proving systems. Yet, unlike SatVis, TSTP does not analyse proof attempts but only supports the examination of first-order proofs. We note that Vampire proofs (and proof attempts) contain first-order formulas with theories, which is not fully supported by TSTP.

Using a graph-layout framework, for instance Graphviz [6], it is relatively straightforward to visualize the DAG derivation graph induced by a saturation attempt of a first-order prover. For example, the theorem prover E [13] is able to directly output its saturation attempt as an input file for Graphviz. The visualizations generated in this way are useful however only for analyzing small derivations with at most 100 inferences, but cannot practically be used to analyse and manipulate larger proof attempts. We note that it is quite common to have first-order proofs and proof attempts with more than 1,000 or even 10,000 inferences, especially in applications of theorem proving in software verification, see e.g. [3]. In our SatVis framework, the interactive features of our tool allow one to analyze such large(r) proof attempts.

The framework [12] eases the manual analysis of proof attempts in Z3 [5] by visualizing quantifier instantiations, case splits and conflicts. While both [12] and SatVis are built for analyzing (failed) proof attempts, they target different architectures (SMT-solving resp. superposition-based proving) and therefore differ in their input format and in the information they visualize. The frameworks [4, 10] visualize proofs derived in a natural deduction/sequent calculus. Unlike these approaches, SatVis targets clausal derivations generated by saturation-based provers using the superposition inference system. As a consequence, our tool can be used to focus only on the clauses that have been actively used during proof search, instead of having to visualize the entire set of clauses, including unused clauses during proof search. We finally note that proof checkers, such as DRAT-trim [15], support the soundness analysis of each inference step of a proof, and do not focus on failing proof attempts nor do they visualize proofs.

2 Proof Search in Vampire

We first present the key ingredients for proof search in Vampire, relevant to analysing saturation attempts.

Derivations and proofs.

An inference $I$ is a tuple $(F_1, \dots, F_n, F)$, where $F_1, \dots, F_n, F$ are formulas. The formulas $F_1, \dots, F_n$ are called the premises of $I$ and $F$ is called the conclusion of $I$. In our setting, an inference system $\mathcal{I}$ is a set of inferences and we rely on the superposition inference systems [11]. An axiom of an inference system is any inference with $n = 0$. Given an inference system $\mathcal{I}$, a derivation from axioms $A$ is an acyclic directed graph (DAG), where (i) each node is a formula and (ii) each node either is an axiom in $A$ and does not have any incoming edges, or is a formula $F \notin A$, such that the incoming edges of $F$ are exactly $(F_1, F), \dots, (F_n, F)$ and there exists an inference $(F_1, \dots, F_n, F) \in \mathcal{I}$. A refutation of axioms $A$ is a derivation which contains the empty clause $\bot$ as a node. A derivation of a formula $F$ is called a proof of $F$ if it is finite and all leaves in the derivation are axioms.
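To make these definitions concrete, the following minimal sketch stores a derivation as a map from each derived formula to the premises of the inference that produced it, and checks the two conditions above. Encoding formulas as strings, writing the empty clause as `"$false"`, and the function names are illustrative assumptions, not part of Vampire or SatVis.

```python
EMPTY_CLAUSE = "$false"  # assumed spelling of the empty clause


def is_refutation(premises, axioms):
    """A derivation is a refutation if the empty clause occurs as a node."""
    return EMPTY_CLAUSE in premises or EMPTY_CLAUSE in axioms


def leaves_are_axioms(premises, axioms):
    """Check that the graph is acyclic and that every node without an
    entry in `premises` (a leaf of the derivation) is an axiom."""
    on_path, done = set(), set()

    def visit(node):
        if node in done:
            return True
        if node in on_path:
            return False              # cycle: not a DAG
        if node not in premises:
            return node in axioms     # leaf of the derivation
        on_path.add(node)
        ok = all(visit(p) for p in premises[node])
        on_path.discard(node)
        if ok:
            done.add(node)
        return ok

    return all(visit(node) for node in premises)


# A two-step refutation: resolve p(a) with ~p(X) | q(X), then with ~q(a).
premises = {
    "q(a)": ("p(a)", "~p(X) | q(X)"),
    "$false": ("q(a)", "~q(a)"),
}
axioms = {"p(a)", "~p(X) | q(X)", "~q(a)"}
```

The sketch does not check that each step is an instance of a rule of the inference system; it only validates the graph shape required by the definition.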

Proof search in Vampire.

Given an input set of axioms $A$ and a conjecture $G$, Vampire searches for a refutation of $A \cup \{\neg G\}$, by using a preprocessing phase followed by a saturation phase. In the preprocessing phase, Vampire generates a derivation from $A \cup \{\neg G\}$ such that each sink-node of the DAG (a sink-node is a node such that no edge emerges out of it) is a clause. Then, Vampire enters the saturation phase, where it extends the existing derivation by applying its saturation algorithm using the sink-nodes from the preprocessing phase as the input clauses to saturation. The saturation phase of Vampire terminates in one of the following three cases: (i) the empty clause is derived (hence, a proof of $G$ was found), (ii) no more clauses are derived and the empty clause was not derived (hence, the input is saturated and $A \cup \{\neg G\}$ is satisfiable), or (iii) an a priori given time/memory limit on the Vampire run is reached (hence, it is unknown whether $G$ follows from $A$).

The saturation phase of Vampire can be described at a high level as follows. The saturation algorithm divides the set of clauses from the proof space of Vampire into a set of Active and a set of Passive clauses, and iteratively refines these sets using its superposition inference system: the Active set keeps the clauses between which all possible inferences have been performed, whereas the Passive set stores the clauses which have not been added to Active yet and are candidates for being used in future steps of the saturation algorithm. During saturation, Vampire distinguishes between so-called simplifying and generating inferences. Intuitively, simplifying inferences delete clauses from the search space and hence are crucial for keeping the search space small. A generating inference is a non-simplifying one, and hence adds new clauses to the search space. As such, at every iteration of the saturation algorithm, a new clause from Passive is selected and added to Active, after which all generating inferences between the selected clause and the clauses in Active are applied. Conclusions of these inferences yield new clauses which are added to Passive to be selected in future iterations of saturation. Additionally, at any step of the saturation algorithm, simplifying inferences and deletion of clauses are allowed.
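The loop described above can be illustrated on a toy scale. The sketch below is not Vampire's algorithm: it swaps superposition for propositional binary resolution, uses "smallest clause first" as an assumed clause selection heuristic, and restricts simplification to dropping tautologies and duplicates. Clauses are sets of non-zero integers, where a negative integer denotes a negated atom; all names are hypothetical.

```python
import heapq
from itertools import count


def resolvents(c1, c2):
    """All binary resolvents of two propositional clauses."""
    for lit in c1:
        if -lit in c2:
            yield frozenset((c1 - {lit}) | (c2 - {-lit}))


def is_tautology(clause):
    return any(-lit in clause for lit in clause)


def saturate(input_clauses, max_iterations=10_000):
    """Return 'refutation', 'saturated' or 'unknown' (limit reached)."""
    active = set()
    passive = []      # heap ordered by clause size (assumed heuristic)
    seen = set()      # every clause ever added to Passive
    tie = count()     # tie-breaker so the heap never compares clauses

    def add_passive(clause):
        if clause not in seen and not is_tautology(clause):
            seen.add(clause)
            heapq.heappush(passive, (len(clause), next(tie), clause))

    for clause in input_clauses:
        add_passive(frozenset(clause))

    for _ in range(max_iterations):
        if not passive:
            return "saturated"        # case (ii): nothing left to select
        _, _, given = heapq.heappop(passive)
        if not given:
            return "refutation"       # case (i): empty clause selected
        active.add(given)
        # Generating inferences between the given clause and Active.
        for other in active:
            for resolvent in resolvents(given, other):
                add_passive(resolvent)
    return "unknown"                  # case (iii): limit reached
```

For instance, `saturate([[1], [-1, 2], [-2]])` encodes p, ¬p ∨ q and ¬q. Unlike a real prover, this sketch reports the refutation only once the empty clause is selected, not at the moment it is derived.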

3 Analysis of Saturation Attempts of Vampire

We now discuss how to efficiently analyze saturation attempts of Vampire in SatVis.

Analyzing saturation attempts.

To understand saturation (attempts), we have to analyze the generating inferences performed during saturation (attempts).

On the one hand, we are interested in the useful clauses: that is, the derived and activated clauses that are part of the proof we expect Vampire to find. In particular, we check whether these clauses occur in Active. (i) If this is the case for a given useful clause (or a simplified variant of it), we are done with processing this useful clause and optionally check the derivation of that clause against the expected derivation. (ii) If not, we have to identify the reason why the clause was not added to Active, which can either be the case because (ii.a) the clause (or a simplified version of it) was never chosen from Passive to be activated or (ii.b) the clause was not even added to Passive. In case (ii.a), we investigate why the clause was not activated. This involves checking which simplified version of the clause was added to Passive and checking the value of clause selection in Vampire on that clause. In case (ii.b), we need to understand why the clause was not added to Passive, that is, why no generating inference between suitable premise clauses was performed. This could for instance be the case because one of the premises was not added to Active, in which case we recurse with the analysis on that premise, or because clause selection in Vampire prevented the inference.
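The case distinction for a useful clause can be phrased as a small lookup over the reconstructed clause sets. The sketch below is only an illustration: the function name, the id-to-text map, and in particular the exact textual match are assumptions, and a real analysis would also have to recognise simplified variants of the expected clause.

```python
def diagnose(expected, clauses, passive, active):
    """Classify an expected clause following cases (i), (ii.a), (ii.b).

    clauses: dict mapping clause id -> clause text (all created clauses)
    passive: ids currently in Passive (added but never activated)
    active:  ids in Active
    """
    ids = {cid for cid, text in clauses.items() if text == expected}
    if ids & active:
        return "i"      # activated: compare its derivation if desired
    if ids & passive:
        return "ii.a"   # waiting in Passive: inspect clause selection
    return "ii.b"       # never added to Passive: recurse on its premises


clauses = {1: "p(a)", 2: "~p(X) | q(X)", 3: "q(a)"}
active = {1, 2}
passive = {3}
```

Here `diagnose("q(a)", clauses, passive, active)` falls into case (ii.a), whereas a clause that was never generated, such as `"r(a)"`, falls into case (ii.b).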

On the other hand, we are interested in the useless clauses: that is, the clauses which were generated or even activated but are unrelated to the proof Vampire will find. These clauses often slow down the proof search by several orders of magnitude. It is therefore crucial to limit their generation or at least their activation. To identify the useless clauses that are activated, we need to analyze the set Active, whereas to identify the useless clauses that are generated but never activated, we have to investigate the set Passive.

… [SA] passive: 160. v = a(l11(s(nl8)),

Figure 1: Screenshot of a saturation attempt of Vampire.

Saturation output.

We now discuss how SatVis reconstructs the clause sets Active and Passive from a Vampire saturation (attempt). Vampire is able to log a list of events, where each event is classified as either (i) new $C$, (ii) passive $C$ or (iii) active $C$, for a given clause $C$. The list of events produced by Vampire satisfies the following properties: (a) any clause is at most once newly created, added to Passive and added to Active; (b) if a clause is added to Passive, it was newly created in the same iteration, and (c) if a clause is added to Active, it was newly created and added to Passive at some point. Figure 1 shows a part of the output logged by Vampire while performing a saturation attempt (SA).

Starting from an empty derivation and two empty sets, the derivation graph and the sets Active and Passive corresponding to a given saturation attempt of Vampire are computed in SatVis by traversing the list of events produced by Vampire and iteratively changing the derivation and the sets Active and Passive, as follows:

  1. new $C$: add the new node $C$ to the derivation and construct the edges $(C_i, C)$, for any premise $C_i$ of the inference deriving $C$. The sets Active and Passive remain unchanged;

  2. passive $C$: add the node $C$ to Passive. The derivation and Active remain unchanged;

  3. active $C$: remove the node $C$ from Passive and add it to Active. The derivation remains unchanged.
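The three update rules above can be sketched as a replay over the event log. The line format used here is an assumption modelled on the `[SA] passive: 160. …` fragment in the figure: an event kind, a clause id, the clause text, and a bracketed justification whose trailing comma-separated numbers are taken to be premise ids. The names `parse_event`, `replay` and `Saturation` are hypothetical and do not describe the actual SatVis code.

```python
import re
from dataclasses import dataclass, field

# Assumed shape: [SA] <kind>: <id>. <clause> [<rule> <premise ids>]
EVENT_RE = re.compile(r"^\[SA\] (new|passive|active): (\d+)\. (.*) \[([^\]]*)\]$")


def parse_event(line):
    """Return (kind, id, clause, rule, premises), or None for other lines."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    kind, cid, clause, justification = match.groups()
    rule, _, tail = justification.rpartition(" ")
    if re.fullmatch(r"\d+(,\d+)*", tail):
        premises = [int(p) for p in tail.split(",")]
    else:                      # e.g. "[input]": no premise ids
        rule, premises = justification, []
    return kind, int(cid), clause, rule, premises


@dataclass
class Saturation:
    clauses: dict = field(default_factory=dict)  # id -> clause text
    edges: set = field(default_factory=set)      # (premise id, clause id)
    passive: set = field(default_factory=set)
    active: set = field(default_factory=set)


def replay(lines):
    state = Saturation()
    for line in lines:
        event = parse_event(line)
        if event is None:
            continue
        kind, cid, clause, _rule, premises = event
        if kind == "new":        # extend the derivation only
            state.clauses[cid] = clause
            state.edges.update((p, cid) for p in premises)
        elif kind == "passive":  # add to Passive
            state.passive.add(cid)
        else:                    # "active": move from Passive to Active
            state.passive.discard(cid)
            state.active.add(cid)
    return state


log = [
    "[SA] new: 1. p(a) [input]",
    "[SA] passive: 1. p(a) [input]",
    "[SA] active: 1. p(a) [input]",
    "[SA] new: 2. ~p(X0) | q(X0) [input]",
    "[SA] passive: 2. ~p(X0) | q(X0) [input]",
    "[SA] active: 2. ~p(X0) | q(X0) [input]",
    "[SA] new: 3. q(a) [resolution 2,1]",
    "[SA] passive: 3. q(a) [resolution 2,1]",
]
state = replay(log)
```

After replaying this hand-written log, clauses 1 and 2 are in Active, clause 3 is still waiting in Passive, and the derivation contains the edges (2, 3) and (1, 3).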

Interactive Visualization.

The large number of inferences during saturation in Vampire makes the direct analysis of saturation attempts of Vampire impossible within a reasonable amount of time. In order to overcome this problem, in SatVis we interactively visualize the derivation graph of the Vampire saturation. The graph-based visualization of SatVis brings the following benefits:

Navigating through the graph visualization of a Vampire derivation is easier for users than working with the Vampire derivation encoded as a list of hyper-edges. In particular, both (i) navigating to the premises of a selected node/clause and (ii) searching for inferences having a selected node/clause as premise are fast in SatVis.

SatVis visualizes only the nodes/clauses that are part of a derivation of an activated clause, and in this way ignores uninteresting inferences.

SatVis merges the preprocessing inferences, such that each clause resulting from preprocessing has as direct premise the input formula it is derived from.
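Restricting the view to clauses that take part in the derivation of some activated clause amounts to a backward reachability computation from the set Active. The following sketch illustrates the idea under the assumption that the derivation is given as a map from clause id to premise ids; the function name is hypothetical.

```python
def prune_to_active(premises, active):
    """Return the ids of all clauses that occur in the derivation of at
    least one activated clause (the activated clauses included)."""
    keep = set()
    stack = list(active)
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        stack.extend(premises.get(node, ()))  # walk towards the inputs
    return keep


# Clauses 4 and 5 were generated but never activated and feed nothing
# that was activated, so they are dropped from the view.
premises = {3: [1, 2], 4: [3], 5: [1]}
kept = prune_to_active(premises, active={1, 2, 3})
```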

Yet, a straightforward graph-based visualization of Vampire saturations in SatVis would bring the following practical limitations on using SatVis:

(i) displaying additional meta-information on graph nodes, such as the inference rule used to derive a node, is computationally very expensive, due to the large number of inferences used during saturation;

(ii) manual search for particular/already processed nodes in relatively large derivations would take too much time;

(iii) subderivations are often interleaved with other subderivations due to an imperfect automatic layout of the graph.

SatVis addresses the above challenges using its following interactive features:

  • SatVis displays meta-information only for a selected node/clause;

  • SatVis supports different ways to locate and select clauses, such as full-text search, search for direct children and premises of the currently selected clauses, and search for clauses whose derivation contains all currently selected nodes;

  • SatVis supports transformations/fragmentations of derivations. In particular, it is possible to restrict and visualize the derivation containing only the clauses that form the derivation of a selected clause, or visualize only clauses whose derivation contains a selected clause.

  • SatVis allows users to (permanently) highlight one or more clauses in the derivation.

Figure 2 illustrates some of the above features of SatVis, using output from Vampire similar to Figure 1 as input to SatVis.

Figure 2: Screenshot of SatVis showing visualized derivation and interaction menu.

4 Implementation of SatVis 1.0

We implemented SatVis as a web application, allowing SatVis to be easily used on any platform. Written in Python3, SatVis contains about 2,200 lines of code. For the generation of graph layouts, we rely on pygraphviz (https://pygraphviz.github.io), whereas graph/derivation visualizations are created with vis.js (https://visjs.org/). We experimented with SatVis on the verification examples of [3], using an Intel Core i5 3.1 GHz machine with 16 GB of RAM, allowing us to refine and successfully generate Vampire proofs for non-interference and information-flow examples of [3].

SatVis workflow.

SatVis takes as input a text file containing the output of a Vampire saturation attempt. An example of a partial input to SatVis is given in Figure 1. SatVis then generates a DAG representing the derivation of the considered Vampire saturation output, as presented in Section 3 and discussed later. Next, SatVis generates the graph layout for the generated DAG, enriched with configured style information. Finally, SatVis renders and visualizes the Vampire derivation corresponding to its input, and allows interactive visualisations of its output, as discussed in Section 3 and detailed below.

DAG generation of saturation outputs.

SatVis parses its input line by line using regex pattern matching in order to generate the nodes of the graph. Next, SatVis uses a post-order traversal algorithm to sanitize nodes and remove redundant ones. The result is then passed to pygraphviz to generate a graph layout. While pygraphviz finds layouts for thousands of nodes within less than three seconds, we would like to improve the scalability of the tool further. It would be beneficial to preprocess and render nodes incrementally, while ensuring stable layouts for SatVis graph transformations. We leave this engineering task for future work.
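As an illustration of the hand-off to a Graphviz-based layout engine, the sketch below serialises a derivation into Graphviz DOT text using only the standard library. It does not reproduce how SatVis calls pygraphviz; the function name `to_dot` and the grey fill used to mark activated clauses are assumptions standing in for the configured style information.

```python
def to_dot(clauses, edges, active=frozenset()):
    """Serialise a derivation as Graphviz DOT text.

    clauses: dict mapping clause id (int) -> clause text
    edges:   iterable of (premise id, conclusion id) pairs
    active:  ids to be drawn with a filled background (assumed styling)
    """
    def quote(text):
        # Escape backslashes and double quotes for a DOT string literal.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = ["digraph derivation {", "  node [shape=box];"]
    for cid in sorted(clauses):
        style = ' style=filled fillcolor="lightgrey"' if cid in active else ""
        lines.append(f"  {cid} [label={quote(clauses[cid])}{style}];")
    for premise, conclusion in sorted(edges):
        lines.append(f"  {premise} -> {conclusion};")
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot({1: "p(a)", 2: 'q("x")'}, {(1, 2)}, active={1})
```

Written to a file, such text can be laid out by any Graphviz front end, for example with `dot -Tsvg derivation.dot -o derivation.svg` on the command line.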

Interactive visualization.

The interactive features of SatVis support (i) various node searching mechanisms, (ii) graph transformations, and (iii) the display of meta-information about a specific node. We can efficiently search for nodes by (partial) clause, find parents or children of a node, and find common consequences of a number of nodes. Graph transformations in SatVis allow rendering only a certain subset of nodes from the SatVis DAG, for example, displaying only transitive parents or children of a certain node.
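Two of these search mechanisms can be sketched over a derivation stored as a map from clause id to premise ids: partial-clause search is taken here to be a plain substring match, and the common consequences of a selection are computed as the intersection of the transitive children of each selected node. Both readings, and all function names, are assumptions made for illustration.

```python
from collections import defaultdict


def search(clauses, fragment):
    """Ids of all clauses whose text contains the given fragment."""
    return {cid for cid, text in clauses.items() if fragment in text}


def children_map(premises):
    """Invert the premise map: clause id -> ids of its direct children."""
    children = defaultdict(list)
    for conclusion, parents in premises.items():
        for parent in parents:
            children[parent].append(conclusion)
    return children


def consequences(node, children):
    """All clauses whose derivation contains `node` (node included)."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def common_consequences(selected, premises):
    """Clauses whose derivation contains every selected node."""
    children = children_map(premises)
    sets = [consequences(node, children) for node in selected]
    return set.intersection(*sets) if sets else set()


clauses = {1: "p(a)", 2: "q(a) | p(b)", 3: "r(c)"}
premises = {3: [1, 2], 4: [3], 5: [2], 6: [1]}
```

With this data, `common_consequences([1, 2], premises)` yields clauses 3 and 4, since only these depend on both selected clauses.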

5 Conclusion

We described the SatVis tool for interactively visualizing proofs and proof attempts of the first-order theorem prover Vampire. Our work analyses proof search in Vampire and reconstructs first-order derivations corresponding to Vampire proofs/proof attempts. The interactive features of SatVis ease the task of understanding both successful and failing proof attempts in Vampire and hence can be used to further develop and use Vampire both by experts and non-experts in first-order theorem proving.

Acknowledgements. This work was funded by the ERC Starting Grant 2014 SYMCAR 639270, the ERC Proof of Concept Grant 2018 SYMELS 842066, the Wallenberg Academy Fellowship 2014 TheProSE and the Austrian FWF project W1255-N23.

References

  • [1] C. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanović, T. King, A. Reynolds, and C. Tinelli. CVC4. In CAV, pages 171–177, 2011.
  • [2] C. Barrett, P. Fontaine, and C. Tinelli. The SMT-LIB standard: Version 2.6. Technical report, Department of Computer Science, The University of Iowa, 2017.
  • [3] G. Barthe, R. Eilers, P. Georgiou, B. Gleiss, L. Kovács, and M. Maffei. Verifying Relational Properties using Trace Logic. In FMCAD, 2019. To appear.
  • [4] J. Byrnes, M. Buchanan, M. Ernst, P. Miller, C. Roberts, and R. Keller. Visualizing proof search for theorem prover development. ENTCS, 226:23–38, 2009.
  • [5] L. De Moura and N. Bjørner. Z3: An efficient SMT solver. In TACAS, pages 337–340, 2008.
  • [6] E. R. Gansner and S. C. North. An Open Graph Visualization System and its Applications to Software Engineering. Software: Practice and Experience, 30(11):1203–1233, 2000.
  • [7] E. Kotelnikov, L. Kovács, and A. Voronkov. A FOOLish Encoding of the Next State Relations of Imperative Programs. In IJCAR, pages 405–421, 2018.
  • [8] L. Kovács, S. Robillard, and A. Voronkov. Coming to terms with quantified reasoning. In POPL, pages 260–270. ACM, 2017.
  • [9] L. Kovács and A. Voronkov. First-Order Theorem Proving and Vampire. In CAV, pages 1–35, 2013.
  • [10] T. Libal, M. Riener, and M. Rukhaia. Advanced Proof Viewing in ProofTool. In UITP, pages 35–47, 2014.
  • [11] R. Nieuwenhuis and A. Rubio. Paramodulation-Based Theorem Proving. In Handbook of Automated Reasoning, pages 371–443. 2001.
  • [12] F. Rothenberger. Integration and analysis of alternative SMT solvers for software verification. Master’s thesis, ETH Zürich, 2016.
  • [13] S. Schulz. E - a Brainiac Theorem Prover. AI Communications, 15(2-3):111–126, 2002.
  • [14] G. Sutcliffe. TPTP, TSTP, CASC, etc. In CSR, pages 7–23, 2007.
  • [15] N. Wetzler, M. J. H. Heule, and W. A. Hunt. DRAT-trim: Efficient checking and trimming using expressive clausal proofs. In SAT, pages 422–429, 2014.
