1 Introduction
Many applications of formal methods, such as program analysis and verification,
require automated reasoning about system properties, such as program safety, security and reliability.
Automated reasoners, such as SAT/SMT solvers [1, 5] and first-order theorem provers [9, 13],
have therefore become a key backbone of rigorous system engineering.
For example, proving properties over the computer memory relies on first-order reasoning with both quantifiers and integer arithmetic.
Saturation-based theorem proving is the leading approach for automating reasoning in full first-order logic.
In a nutshell, this approach negates a given goal and saturates its given set of input formulas (including the negated goal),
by deriving logical consequences of the input using a logical inference system, such as binary resolution or superposition.
Whenever a contradiction (false) is derived, the saturation process terminates reporting validity of the input goal.
State-of-the-art theorem provers, such as Vampire [9] and E [13], implement saturation-based proof search using the (ordered) superposition calculus [11].
These provers rely on powerful indexing algorithms, selection functions and term orderings for making saturation-based theorem proving efficient and scalable to a large set of first-order formulas, as evidenced in the yearly CASC system competition of first-order provers [14].
Over the past years, saturation-based theorem proving has been extended to first-order logic with theories, such as arithmetic, theory of arrays and algebraic datatypes [8]. Further, first-class boolean sorts and if-then-else and let-in constructs have also been introduced as extensions to the input syntax of first-order theorem provers [7]. Thanks to these recent developments, first-order theorem provers have become better suited to applications of formal methods, being for example a competitive alternative to SMT solvers [1, 5] in software verification and program analysis. Recent editions of the SMT-COMP (https://smtcomp.github.io/) and CASC system competitions show, for example, that Vampire successfully competes against the leading SMT solvers Z3 [5] and CVC4 [1] and vice versa.
By leveraging the best practices in first-order theorem proving in combination with SMT solving, in our recent work [3] we showed that correctness of a software program can be reduced to a validity problem in first-order logic. We use Vampire to prove the resulting encodings, outperforming SMT solvers. Our initial results demonstrate that first-order theorem proving is well-suited for applications of (relational) verification, such as safety and non-interference. Yet, our results also show that the performance of the prover crucially depends on the logical representation of its input problem and the deployed reasoning strategies during proof search.
As such, users and developers of first-order provers, and automated reasoners in general, typically face the burden of analysing (failed) proof attempts produced by the prover, with the ultimate goal of refining the input and/or proof strategies so that the prover succeeds in proving its input.
Understanding (some of) the reasons why the prover failed is, however, very hard and requires a considerable amount of work by highly qualified experts in theorem proving, thus hindering the use of theorem provers in many application domains.
In this paper we address this challenge and introduce the SatVis tool to ease the task of analysing failed proof attempts in saturation-based reasoning. We designed SatVis to support interactive visualization of the saturation algorithm used in Vampire, with the goal of easing the manual analysis of Vampire proofs as well as failed proof attempts in Vampire.
Inputs to SatVis are proofs (or proof attempts) produced by Vampire.
Our tool consists of (i) an explicit visualization of the DAG structure of the saturation proof (attempt) of Vampire and (ii) interactive transformations of the DAG for pruning and reformatting the proof (attempt).
In its current setting, SatVis can be used only in the context of Vampire.
Yet, by parsing/translating proofs (or proof attempts) of other provers into the Vampire proof format,
SatVis can be used in conjunction with other provers as well.
When feeding Vampire proofs to SatVis, SatVis supports both users and developers of Vampire in understanding and refactoring Vampire proofs, and in manually checking the soundness of Vampire proofs. When using SatVis on failed proof attempts of Vampire, SatVis supports users and developers of Vampire in analysing how Vampire explored its search space during proof search, that is, in understanding which clauses were derived and why certain clauses were not derived at various steps during saturation. By doing so, the SatVis proof visualisation framework gives valuable insights on how to revise the input problem encoding of Vampire and/or implement domain-specific optimizations in Vampire. We therefore believe that SatVis improves the state of the art in the use and applications of theorem proving at least in the following scenarios: (i) helping Vampire developers to debug and further improve Vampire; (ii) helping Vampire users to tune Vampire to their applications, by not treating Vampire as a black box but by understanding and using its appropriate proof search options; and (iii) helping inexperienced users of saturation-based theorem proving to learn to use Vampire and first-order proving in general.
Contributions.
The contribution of this paper comes with the design of the SatVis tool for analysing proofs, as well as proof attempts of the Vampire theorem prover. SatVis is available at:
We overview proof search steps in Vampire specific to SatVis (Section 2), discuss the challenges we faced for analysing proof attempts of Vampire (Section 3), and describe implementation-level details of SatVis 1.0 (Section 4).
Related work.
While standardizing the input format of automated reasoners is an active research topic, see e.g. the SMT-LIB [2] and TPTP [14] standards, coming up with an input standard for representing and analysing proofs and proof attempts of automated reasoners has so far received very little attention. The TSTP library [14] provides input/output standards for automated theorem proving systems. Yet, unlike SatVis, TSTP does not analyse proof attempts but only supports the examination of first-order proofs. We note that Vampire proofs (and proof attempts) contain first-order formulas with theories, which are not fully supported by TSTP.
Using a graph-layout framework, for instance Graphviz [6], it is relatively straightforward to visualize the derivation graph (a DAG) induced by a saturation attempt of a first-order prover. For example, the theorem prover E [13] is able to directly output its saturation attempt as an input file for Graphviz. The visualizations generated in this way are, however, useful only for analyzing small derivations with at most 100 inferences, and cannot practically be used to analyse and manipulate larger proof attempts. We note that it is quite common to have first-order proofs and proof attempts with more than 1,000 or even 10,000 inferences, especially in applications of theorem proving in software verification, see e.g. [3]. In our SatVis framework, the interactive features of our tool allow one to analyze such large(r) proof attempts.
The framework [12] eases the manual analysis of proof attempts in Z3 [5] by visualizing quantifier instantiations, case splits and conflicts. While both [12] and SatVis are built for analyzing (failed) proof attempts, they target different architectures (SMT solving and superposition-based proving, respectively) and therefore differ in their input format and in the information they visualize. The frameworks [4, 10] visualize proofs derived in a natural deduction/sequent calculus. Unlike these approaches, SatVis targets clausal derivations generated by saturation-based provers using the superposition inference system. As a consequence, our tool can be used to focus only on the clauses that have been actively used during proof search, instead of having to visualize the entire set of clauses, including clauses left unused during proof search. We finally note that proof checkers, such as DRAT-trim [15], support the soundness analysis of each inference step of a proof, but neither focus on failing proof attempts nor visualize proofs.
2 Proof Search in Vampire
We first present the key ingredients for proof search in Vampire, relevant to analysing saturation attempts.
Derivations and proofs.
An inference $I$ is a tuple $(F_1, \dots, F_n, F)$, where $F_1, \dots, F_n, F$ are formulas. The formulas $F_1, \dots, F_n$ are called the premises of $I$ and $F$ is called the conclusion of $I$. In our setting, an inference system $\mathcal{I}$ is a set of inferences and we rely on the superposition inference system [11]. An axiom of an inference system is any inference with $n = 0$. Given an inference system $\mathcal{I}$, a derivation from axioms $A$ is an acyclic directed graph (DAG), where (i) each node is a formula and (ii) each node either is an axiom in $A$ and does not have any incoming edges, or is a formula $F \notin A$, such that the incoming edges of $F$ are exactly $(F_1, F), \dots, (F_n, F)$ and there exists an inference $(F_1, \dots, F_n, F) \in \mathcal{I}$. A refutation of axioms $A$ is a derivation which contains the empty clause $\bot$ as a node. A derivation of a formula $F$ is called a proof of $F$ if it is finite and all leaves in the derivation are axioms.
Proof search in Vampire.
Given an input set of axioms $A$ and a conjecture $G$, Vampire searches for a refutation of $A \cup \{\neg G\}$, by using a preprocessing phase followed by a saturation phase.
In the preprocessing phase, Vampire generates a derivation from $A \cup \{\neg G\}$ such that each sink-node of the DAG (a sink-node is a node such that no edge emerges out of it) is a clause. Then, Vampire enters the saturation phase, where it extends the existing derivation by applying its saturation algorithm using the sink-nodes from the preprocessing phase as the input clauses to saturation. The saturation phase of Vampire terminates in one of the following three cases:
(i) the empty clause $\bot$ is derived (hence, a proof of $G$ was found), (ii) no more clauses are derived and the empty clause $\bot$ was not derived (hence, the input is saturated and $A \cup \{\neg G\}$ is satisfiable),
or (iii) an a priori given time/memory limit on the Vampire run is reached (hence, it is unknown whether $A \cup \{\neg G\}$ is satisfiable or $G$ is valid).
Saturation-based proving in Vampire follows the high-level description of the saturation phase given next. The saturation algorithm divides the set of clauses from the proof space of Vampire into a set of Active and a set of Passive clauses, and iteratively refines these sets using its superposition inference system: the Active set keeps the clauses between which all possible inferences have been performed, whereas the Passive set stores the clauses which have not been added to Active yet and are candidates for being used in future steps of the saturation algorithm. During saturation, Vampire distinguishes between so-called simplifying and generating inferences. Intuitively, simplifying inferences delete clauses from the search space and hence are crucial for keeping the search space small. A generating inference is a non-simplifying one, and hence adds new clauses to the search space. As such, at every iteration of the saturation algorithm, a new clause from Passive is selected and added to Active, after which all generating inferences between the selected clause and the clauses in Active are applied. Conclusions of these inferences yield new clauses which are added to Passive to be selected in future iterations of saturation. Additionally, at any step of the saturation algorithm, simplifying inferences and deletion of clauses are allowed.
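To make the interplay of the Active and Passive sets concrete, the following is a minimal sketch of such a saturation loop. It is not Vampire's algorithm: as simplifying assumptions, it uses propositional binary resolution instead of superposition, selects the smallest Passive clause, and only discards tautologies and already generated clauses in place of real simplifying inferences. The names `resolvents`, `is_tautology` and `saturate` are hypothetical.

```python
def resolvents(c1, c2):
    """All binary resolvents of two propositional clauses.

    Clauses are frozensets of non-zero integers; -n is the negation of n.
    """
    out = set()
    for lit in c1:
        if -lit in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {-lit})))
    return out


def is_tautology(clause):
    """A clause containing a literal and its negation is trivially true."""
    return any(-lit in clause for lit in clause)


def saturate(input_clauses, max_iterations=1000):
    """Toy saturation loop; returns (status, active_set)."""
    passive = []
    for clause in input_clauses:
        clause = frozenset(clause)
        if clause not in passive:
            passive.append(clause)
    active = set()
    seen = set(passive)
    if frozenset() in seen:
        return "refutation", active
    for _ in range(max_iterations):
        if not passive:
            # No candidates left and no empty clause: the input is saturated.
            return "saturated", active
        # Clause selection (assumption: prefer the smallest clause).
        given = min(passive, key=len)
        passive.remove(given)
        active.add(given)
        # Generating inferences between the given clause and Active.
        for other in list(active):
            for new in resolvents(given, other):
                if not new:
                    return "refutation", active
                # Crude stand-in for simplification and deletion.
                if is_tautology(new) or new in seen:
                    continue
                seen.add(new)
                passive.append(new)
    # Iteration limit reached: the result is unknown.
    return "unknown", active
```

The three return values mirror the three termination cases of the saturation phase: a refutation was found, the clause set is saturated, or a resource limit was hit.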
3 Analysis of Saturation Attempts of Vampire
We now discuss how to efficiently analyze saturation attempts of Vampire in SatVis.
Analyzing saturation attempts.
To understand saturation (attempts), we have to analyze the generating inferences performed during saturation (attempts).
On the one hand, we are interested in the useful clauses: that is, the derived and activated clauses that are part of the proof we expect Vampire to find. In particular, we check whether these clauses occur in Active. (i) If this is the case for a given useful clause (or a simplified variant of it), we are done with processing this useful clause and optionally check the derivation of that clause against the expected derivation. (ii) If not, we have to identify the reason why the clause was not added to Active, which can either be the case because (ii.a) the clause (or a simplified version of it) was never chosen from Passive to be activated or (ii.b) the clause was not even added to Passive. In case (ii.a), we investigate why the clause was not activated. This involves checking which simplified version of the clause was added to Passive and checking the value of clause selection in Vampire on that clause. In case (ii.b), we need to understand why the clause was not added to Passive, that is, why no generating inference between suitable premise clauses was performed. This could for instance be the case because one of the premises was not added to Active, in which case we recurse with the analysis on that premise, or because clause selection in Vampire prevented the inference.
On the other hand, we are interested in the useless clauses: that is, the clauses which were generated or even activated but are unrelated to the proof Vampire will find. These clauses often slow down the proof search by several orders of magnitude. It is therefore crucial to limit their generation or at least their activation. To identify the useless clauses that are activated, we need to analyze the set Active, whereas to identify the useless clauses which are generated but never activated, we have to investigate the set Passive.
Figure 3 (excerpt of Vampire output): … [SA] passive: 160. v = a(l11(s(nl8)),
Saturation output.
We now discuss how SatVis reconstructs the clause sets Active and Passive from a Vampire saturation (attempt). Vampire is able to log a list of events, where each event is classified as either (i) new $C$, (ii) passive $C$ or (iii) active $C$, for a given clause $C$. The list of events produced by Vampire satisfies the following properties: (a) any clause is at most once newly created, added to Passive and added to Active; (b) if a clause is added to Passive, it was newly created in the same iteration, and (c) if a clause is added to Active, it was newly created and added to Passive at some point. Figure 3 shows a part of the output logged by Vampire while performing a saturation attempt (SA). Starting from an empty derivation and two empty sets, the derivation graph and the sets Active and Passive corresponding to a given saturation attempt of Vampire are computed in SatVis by traversing the list of events produced by Vampire and iteratively changing the derivation and the sets Active and Passive, as follows:

(1) new $C$: add the new node $C$ to the derivation and construct the edges $(C_i, C)$, for any premise $C_i$ of the inference deriving $C$. The sets Active and Passive remain unchanged;
(2) passive $C$: add the node $C$ to Passive. The derivation and Active remain unchanged;
(3) active $C$: remove the node $C$ from Passive and add it to Active. The derivation remains unchanged.
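The three update rules above can be sketched as a single pass over the event list. This is an illustrative sketch, not the SatVis source: the event representation as `(kind, clause_id, premise_ids)` tuples and the function name `replay` are assumptions made for the example.

```python
def replay(events):
    """Rebuild the derivation and the Active/Passive sets from logged events.

    `events` is an iterable of (kind, clause_id, premise_ids) tuples, where
    kind is "new", "passive" or "active" (hypothetical representation).
    Returns (premises, active, passive); `premises` maps each clause id to
    the ids of the premises of the inference that derived it.
    """
    premises = {}
    active, passive = set(), set()
    for kind, clause_id, premise_ids in events:
        if kind == "new":
            # Property (a): a clause is newly created at most once.
            if clause_id in premises:
                raise ValueError(f"clause {clause_id} created twice")
            premises[clause_id] = tuple(premise_ids)
        elif kind == "passive":
            passive.add(clause_id)
        elif kind == "active":
            # discard() tolerates truncated logs that miss the passive event.
            passive.discard(clause_id)
            active.add(clause_id)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return premises, active, passive
```

Stopping the replay after a prefix of the event list yields the state of the derivation and of both sets at that point of the saturation attempt.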
Interactive Visualization.
The large number of inferences during saturation in Vampire makes the direct analysis of saturation attempts of Vampire impossible within a reasonable amount of time. In order to overcome this problem, in SatVis we interactively visualize the derivation graph of the Vampire saturation. The graph-based visualization of SatVis brings the following benefits:
Navigating through the graph visualization of a Vampire derivation is easier for users than working with the Vampire derivation encoded as a list of hyperedges. In particular, both (i) navigating to the premises of a selected node/clause and (ii) searching for inferences having a selected node/clause as premise are fast in SatVis.
SatVis visualizes only the nodes/clauses that are part of a derivation of an activated clause, and in this way ignores uninteresting inferences.
SatVis merges the preprocessing inferences, such that each clause resulting from preprocessing has as direct premise the input formula it is derived from.
Yet, a straightforward graph-based visualization of Vampire saturations in SatVis would bring the following practical limitations on using SatVis:
(i) displaying additional meta-information on graph nodes, such as the inference rule used to derive a node, is computationally very expensive, due to the large number of inferences used during saturation;
(ii) manual search for particular/already processed nodes in relatively large derivations would take too much time;
(iii) subderivations are often interleaved with other subderivations due to an imperfect automatic layout of the graph.
SatVis addresses the above challenges using the following interactive features:
SatVis displays meta-information only for a selected node/clause;
SatVis supports different ways to locate and select clauses, such as full-text search, search for direct children and premises of the currently selected clauses, and search for clauses whose derivation contains all currently selected nodes;
SatVis supports transformations/fragmentations of derivations. In particular, it is possible to restrict and visualize the derivation containing only the clauses that form the derivation of a selected clause, or visualize only clauses whose derivation contains a selected clause;
SatVis allows users to (permanently) highlight one or more clauses in the derivation.
Figure 2 illustrates some of the above features of SatVis, using output from Vampire similar to Figure 3 as input to SatVis.
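The selection and restriction features described above amount to reachability queries on the derivation DAG. The sketch below illustrates them under the assumption that the derivation is stored as a dictionary mapping each clause id to the ids of its premises; the function names are hypothetical and the code is not taken from SatVis.

```python
def ancestors(premises, node):
    """The node together with all its transitive premises."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(premises.get(current, ()))
    return seen


def descendants(premises, node):
    """All clauses whose derivation contains the node (including itself)."""
    return {n for n in premises if node in ancestors(premises, n)}


def common_consequences(premises, selected):
    """Clauses whose derivation contains every selected node."""
    wanted = set(selected)
    return {n for n in premises if wanted <= ancestors(premises, n)}


def restrict(premises, keep):
    """Sub-derivation induced by the node set `keep`."""
    return {
        n: tuple(p for p in ps if p in keep)
        for n, ps in premises.items()
        if n in keep
    }
```

For example, `restrict(premises, ancestors(premises, c))` keeps only the clauses that form the derivation of a clause `c`. The sketch recomputes ancestor sets for every node, which is quadratic in the worst case; a tool handling thousands of clauses would likely cache them or traverse child edges directly.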
4 Implementation of SatVis 1.0
We implemented SatVis as a web application, allowing SatVis to be easily used on any platform. Written in Python 3, SatVis contains about 2,200 lines of code. For the generation of graph layouts, we rely on pygraphviz (https://pygraphviz.github.io), whereas graph/derivation visualizations are created with vis.js (https://visjs.org/). We experimented with SatVis on the verification examples of [3], using an Intel Core i5 3.1 GHz machine with 16 GB of RAM, allowing us to refine and successfully generate Vampire proofs for non-interference and information-flow examples of [3].
SatVis workflow.
SatVis takes as input a text file containing the output of a Vampire saturation attempt. An example of a partial input to SatVis is given in Figure 3. SatVis then generates a DAG representing the derivation of the considered Vampire saturation output, as presented in Section 3 and discussed later. Next, SatVis generates the graph layout for the generated DAG, enriched with configured style information. Finally, SatVis renders and visualizes the Vampire derivation corresponding to its input, and allows interactive visualisations of its output, as discussed in Section 3 and detailed below.
DAG generation of saturation outputs.
SatVis parses its input line by line using regex pattern matching in order to generate the nodes of the graph. Next, SatVis uses a post-order traversal algorithm to sanitize nodes and remove redundant ones. The result is then passed to pygraphviz to generate a graph layout. While pygraphviz finds layouts for thousands of nodes in less than three seconds, we would like to improve the scalability of the tool further. It would be beneficial to preprocess and render nodes incrementally, while ensuring stable layouts for SatVis graph transformations. We leave this engineering task for future work.
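As an illustration of the line-by-line parsing step, the sketch below extracts one event per log line with a regular expression. The exact line format is an assumption modelled on the fragment shown in Figure 3: an `[SA]` tag, the event kind, the clause number, the clause, and a trailing bracket holding the inference rule and premise numbers. The pattern `LINE` and the function `parse_line` are hypothetical and do not reproduce the SatVis parser.

```python
import re

# Assumed shape: "[SA] <kind>: <id>. <clause> [<rule> <p1>,<p2>,...]"
LINE = re.compile(
    r"^\[SA\] (?P<kind>new|passive|active): "
    r"(?P<id>\d+)\. "
    r"(?P<clause>.*) "
    r"\[(?P<rule>[^\]\d]*?)\s*(?P<premises>\d+(?:,\d+)*)?\]$"
)


def parse_line(line):
    """Return (kind, id, clause, rule, premise_ids), or None if no match."""
    match = LINE.match(line.strip())
    if match is None:
        # Lines that are not saturation events are skipped by the caller.
        return None
    raw = match["premises"]
    premise_ids = tuple(int(p) for p in raw.split(",")) if raw else ()
    return (
        match["kind"],
        int(match["id"]),
        match["clause"],
        match["rule"],
        premise_ids,
    )
```

Under these assumptions, `parse_line("[SA] new: 3. q(a) [resolution 1,2]")` yields the event kind, clause number, clause text, rule name and the premise numbers 1 and 2, which is the information needed to add a node and its incoming edges to the derivation.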
Interactive visualization.
The interactive features of SatVis support (i) various node searching mechanisms, (ii) graph transformations, and (iii) the display of meta-information about a specific node. We can efficiently search for nodes by (partial) clause, find parents or children of a node, and find common consequences of a number of nodes. Graph transformations in SatVis allow rendering only a certain subset of nodes from the SatVis DAG, for example, displaying only transitive parents or children of a certain node.
5 Conclusion
We described the SatVis tool for interactively visualizing proofs and proof attempts of the first-order theorem prover Vampire. Our work analyses proof search in Vampire and reconstructs first-order derivations corresponding to Vampire proofs/proof attempts. The interactive features of SatVis ease the task of understanding both successful and failing proof attempts in Vampire and hence can be used to further develop and use Vampire both by experts and non-experts in first-order theorem proving.
Acknowledgements. This work was funded by the ERC Starting Grant 2014 SYMCAR 639270, the ERC Proof of Concept Grant 2018 SYMELS 842066, the Wallenberg Academy Fellowship 2014 TheProSE and the Austrian FWF project W1255-N23.
References
 [1] C. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanović, T. King, A. Reynolds, and C. Tinelli. CVC4. In CAV, pages 171–177, 2011.
 [2] C. Barrett, P. Fontaine, and C. Tinelli. The SMT-LIB standard: Version 2.6. Technical report, Department of Computer Science, The University of Iowa, 2017.
 [3] G. Barthe, R. Eilers, P. Georgiou, B. Gleiss, L. Kovács, and M. Maffei. Verifying Relational Properties using Trace Logic. In FMCAD, 2019. To appear.
 [4] J. Byrnes, M. Buchanan, M. Ernst, P. Miller, C. Roberts, and R. Keller. Visualizing proof search for theorem prover development. ENTCS, 226:23–38, 2009.
 [5] L. De Moura and N. Bjørner. Z3: An efficient SMT solver. In TACAS, pages 337–340, 2008.
 [6] E. R. Gansner and S. C. North. An Open Graph Visualization System and its Applications to Software Engineering. Software Practice and Experience, 30(11):1203–1233, 2000.
 [7] E. Kotelnikov, L. Kovács, and A. Voronkov. A FOOLish Encoding of the Next State Relations of Imperative Programs. In IJCAR, pages 405–421, 2018.
 [8] L. Kovács, S. Robillard, and A. Voronkov. Coming to terms with quantified reasoning. In POPL, pages 260–270. ACM, 2017.
 [9] L. Kovács and A. Voronkov. First-Order Theorem Proving and Vampire. In CAV, pages 1–35, 2013.
 [10] T. Libal, M. Riener, and M. Rukhaia. Advanced Proof Viewing in ProofTool. In UITP, pages 35–47, 2014.
 [11] R. Nieuwenhuis and A. Rubio. Paramodulation-Based Theorem Proving. In Handbook of Automated Reasoning, pages 371–443. 2001.
 [12] F. Rothenberger. Integration and analysis of alternative SMT solvers for software verification. Master’s thesis, ETH Zürich, 2016.
 [13] S. Schulz. E - a Brainiac Theorem Prover. AI Communications, 15(2-3):111–126, 2002.
 [14] G. Sutcliffe. TPTP, TSTP, CASC, etc. In CSR, pages 7–23, 2007.
 [15] N. Wetzler, M. J. H. Heule, and W. A. Hunt. DRAT-trim: Efficient checking and trimming using expressive clausal proofs. In SAT, pages 422–429, 2014.