1 Introduction
Constrained Horn Clauses (CHC) have over the last decade emerged as a uniform framework for reasoning about different aspects of software safety [4, 2]. Constrained Horn clauses form a fragment of first-order logic, modulo various background theories, in which models can be constructed effectively with the help of techniques including model checking, abstract interpretation, or clause transformation. Horn clauses can be used as an intermediate verification language that elegantly captures various classes of systems (e.g., sequential code, programs with functions and procedures, concurrent programs, or reactive systems) and various verification methodologies (e.g., the use of state invariants, verification with the help of contracts, Owicki-Gries-style invariants, or rely-guarantee methods). Horn solvers can be used as off-the-shelf backends in verification tools, and thus enable the construction of verification systems in a modular way.
CHC-COMP 21 is the fourth competition of solvers for Constrained Horn Clauses, a competition affiliated with the 8th Workshop on Horn Clauses for Verification and Synthesis (HCVS) at ETAPS 2021. The goal of CHC-COMP is to compare state-of-the-art tools for Horn solving with respect to performance and effectiveness on realistic, publicly available benchmarks. The deadline for submitting solvers to CHC-COMP 21 was March 18, 2021, resulting in 7 participating solvers, which were evaluated in the second half of March 2021. The 7 solvers were evaluated in 7 separate tracks on problems in linear integer arithmetic, linear real arithmetic, the theory of arrays, and theories of algebraic datatypes. The results of the competition can be found in Section 6 of this report, and were presented at the (virtual) HCVS workshop on March 28, 2021.
1.1 Acknowledgements
We would like to thank the HCVS chairs, Bishoksan Kafle and Hossein Hojjat, for hosting CHC-COMP again this year!
CHC-COMP 21 built heavily on the infrastructure developed for the previous instances of CHC-COMP, run by Arie Gurfinkel, Grigory Fedyukovich, and Philipp Rümmer, respectively. Contributors to the competition infrastructure also include Adrien Champion, Dejan Jovanovic, and Nikolaj Bjørner.
Like the first three competitions, CHC-COMP 21 was run on StarExec [21]. We are extremely grateful for the computing resources and evaluation environment provided by StarExec, and for the fast and competent support by Aaron Stump and his team whenever problems occurred. CHC-COMP 21 would not have been possible without this!
Philipp Rümmer is supported by the Swedish Research Council (VR) under grant 2018-04727, by the Swedish Foundation for Strategic Research (SSF) under the project WebSec (Ref. RIT17-0011), and by the Knut and Alice Wallenberg Foundation under the project UPDATE.
2 Brief Overview of the Competition Design
2.1 Competition Tracks
Three new tracks were introduced in CHC-COMP 21 (namely, LIA-nonlin-arrays, LRA-TS-par, and ADT-nonlin), leading to altogether 7 tracks:

LIA-nonlin: benchmarks with at least one non-linear clause, and linear integer arithmetic as background theory;

LIA-lin: benchmarks with only linear clauses, and linear integer arithmetic as background theory;

LIA-nonlin-arrays: benchmarks with at least one non-linear clause, and the combined theory of linear integer arithmetic and arrays as background theory;

LIA-lin-arrays: benchmarks with only linear clauses, and the combined theory of linear integer arithmetic and arrays as background theory;

LRA-TS: benchmarks encoding transition systems, with linear real arithmetic as background theory. Benchmarks in this track have exactly one uninterpreted relation symbol, and exactly three linear clauses encoding initial states, transitions, and error states;

LRA-TS-par: the same selection of benchmarks as in LRA-TS, but with 2x4 CPU cores reserved for each task, and evaluation done with a wall-clock time limit; this yields a setting benefiting parallel solvers;

ADT-nonlin: benchmarks with at least one non-linear clause, and algebraic datatypes as background theory.
2.2 Computing Nodes
Two separate queues on StarExec were used for the competition: one queue with 15 nodes for the track LRA-TS-par, and one with 20 nodes for all other tracks. Each node had two quad-core CPUs. In LRA-TS-par, each job was run on its own node during the competition runs, while in the other tracks each node was used to run two jobs in parallel. The machine specification is:
CPU: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz (2393 MHz), 10240 KB cache
Main memory: 263932744 kB
OS: CentOS Linux release 7.7.1908 (Core), kernel 3.10.0-1062.4.3.el7.x86_64
Software: glibc-2.17-292.el7.x86_64, gcc-4.8.5-39.el7.x86_64, glibc-2.17-292.el7.i686
2.3 Test and Competition Runs
The solvers submitted to CHC-COMP 21 were evaluated twice:

in a first set of test runs, in which (optional) pre-submissions of the solvers were evaluated to check their configurations and identify possible inconsistencies. For the test runs, a smaller set of randomly selected benchmarks was used. In the test runs, each solver-benchmark pair was limited to 600s CPU time, 600s wall-clock time, and 64GB memory.

in the competition runs, the results of which determined the outcome of CHC-COMP 21. The selection of the benchmarks for the competition runs is described in Section 4, and the evaluation of the competition runs in Section 2.4. In the competition run of LRA-TS-par, each job was limited to 1800s wall-clock time and 64GB memory. In the competition runs of all other tracks, each job was limited to 1800s CPU time, 1800s wall-clock time, and 64GB memory.
2.4 Evaluation of the Competition Runs
The evaluation of the competition runs was this year done using the summarize.py script available in the repository https://github.com/chc-comp/scripts, on the basis of the data provided by StarExec through the “job information” data export function. The ranking of solvers in each track was based on the Score reached by the solvers in the competition run for that track. In case two solvers had equal Score, the ranking of the two solvers was determined by CPU time (for LRA-TS-par, by Wall-clock time). It was assumed that the outcome of running one solver on one benchmark can only be sat, unsat, or unknown; the last outcome includes solvers giving up, running out of resources, or crashing.
The definitions of Score, CPU time, and Wall-clock time are:

Score: the number of sat or unsat results produced by a solver on the benchmarks of a track.

CPU time: the total CPU time needed by a solver to produce its answers in a track, including unknown answers.

Wall-clock time: the total wall-clock time needed by a solver to produce its answers in a track, including unknown answers.
In addition, the following feature is included in the results for each solver and each track:

#unique: The number of sat or unsat results produced by a solver for benchmarks for which all other solvers returned unknown.
We decided not to include the Space feature, specifying the total maximum virtual memory consumption, in the tables, since this number is less telling for solvers running in a JVM.
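As an illustration, the ranking scheme described above can be sketched in a few lines of Python. This is a simplified sketch, not the actual summarize.py script; all function and variable names here are ours.

```python
# Illustrative sketch of the CHC-COMP ranking scheme: Score is the
# number of sat/unsat answers, and ties are broken by total CPU time
# (lower is better). CPU time includes unknown answers, as described
# in the report.

def rank_solvers(results):
    """results: dict mapping solver name -> list of (outcome, cpu_time)
    pairs, where outcome is 'sat', 'unsat', or 'unknown'."""
    table = []
    for solver, runs in results.items():
        score = sum(1 for outcome, _ in runs if outcome in ("sat", "unsat"))
        cpu = sum(t for _, t in runs)  # total time, unknowns included
        table.append((solver, score, cpu))
    # Higher score first; on equal score, lower CPU time first.
    table.sort(key=lambda row: (-row[1], row[2]))
    return table
```

For the LRA-TS-par track, the same scheme would be applied with wall-clock times in place of CPU times.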
3 Competition Benchmarks
3.1 File Format
CHC-COMP represents benchmarks in a fragment of the SMT-LIB 2.6 format. The fragment is defined on https://chc-comp.github.io/format.html. The conformance of a well-typed SMT-LIB script with the CHC-COMP fragment can be checked using the format checker available on https://github.com/chc-comp/chc-tools.
3.2 Benchmark Processing in Tracks other than ADT-nonlin
All benchmarks used in CHC-COMP 21 were pre-processed using the format.py script available in the repository https://github.com/chc-comp/scripts, using the command line
> python3 format.py --out_dir <outdir> --merge_queries True <smtfile>
The script tries to translate arbitrary Horn-like problems in SMT-LIB format into problems within the CHC-COMP fragment. Only benchmarks processed in this way were used in the competition.
The option --merge_queries has the effect of merging multiple queries in a benchmark into a single query by introducing an auxiliary nullary predicate. This transformation was introduced in CHC-COMP 20, and is discussed in [19].
After processing with format.py, benchmarks were checked and categorised into the four tracks using the format-checker scripts available on https://github.com/chc-comp/chc-tools.
Benchmarks that could not be processed by format.py were rejected by the format checker. Benchmarks that did not conform to any of the competition tracks were not used in CHC-COMP 21.
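The effect of the query-merging step can be illustrated on a toy clause representation. This is a sketch of the idea only: the actual script operates on SMT-LIB terms, and the clause encoding and names used here are ours.

```python
# Toy illustration of merging several queries (clauses with head
# 'false') into a single query via a fresh nullary predicate AUX.
# Clauses are (body, head) pairs over predicate names.

def merge_queries(clauses, aux="AUX"):
    merged, n_queries = [], 0
    for body, head in clauses:
        if head == "false":
            # Redirect each original query to the auxiliary predicate.
            merged.append((body, aux))
            n_queries += 1
        else:
            merged.append((body, head))
    if n_queries > 1:
        merged.append(([aux], "false"))  # the single remaining query
        return merged
    return clauses  # nothing to do for at most one query
```

The resulting system is equisatisfiable with the original one: false is derivable exactly if AUX is derivable, i.e., exactly if one of the original queries fails.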
3.3 Benchmark Processing in ADT-nonlin
Benchmarks used in the ADT-nonlin track were pre-processed by eliminating all theory constraints and recursively-defined functions. The transformation was performed using the RInGen tool [14]. This way, we were able to satisfy the input-language constraints of all four tools entering the competition in this track. In the future, however, we plan to introduce further ADT-related tracks with benchmarks over ADTs and linear arithmetic and/or arrays.
3.4 Benchmark Inventory
Repository  LIA-nonlin  LIA-lin  LIA-nonlin-arrays  LIA-lin-arrays  LRA-TS  ADT-nonlin  
adtpurified  67 /  67  
aeval  54 /  54  
eldaricamisc  69 /  66  147 /  134  
extrasmalllia  55 /  55  
hcai  135 /  133  100 /  86  25 /  25  39 /  39  
hopv  68 /  67  49 /  48  
jayhorn  5138 /  5084  75 /  73  
kind2  851 /  738  
ldvantmed  79 /  79  10 /  10  
ldvarrays  821 /  546  3 /  2  
llreve  59 /  57  44 /  44  31 /  31  
quic3  43 /  43  
ringen  439 /  439  
sally  177 /  174  
seahorn  68 /  66  3396 /  2822  
synth/nayhorn  119 /  114  
synth/semgus  5371* /  4839*  
tricera  4 /  4  405 /  405  
vmt  905 /  802  99 /  98  
chccomp19  271 /  265  325 /  313  15 /  15  290 /  290  228 /  226  
svcomp  1643 /  1169  3150 /  2932  855 /  779  79 /  73  
Total  8425 /  7763  8705 /  7768  7166 /  6283  495 /  488  504 /  498  506 /  506 
In contrast to most other competitions, CHC-COMP stores benchmarks in a decentralised way, in multiple repositories managed by the contributors of the benchmarks themselves. Table 1 summarises the number of benchmarks that were obtained by collecting benchmarks from all available repositories using the processes in Section 3.2 and Section 3.3. Duplicate benchmarks were identified by computing a checksum for each (processed) benchmark, and were discarded.
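Checksum-based duplicate detection of this kind can be sketched as follows. This is an illustration under our own assumptions (SHA-256 over the file contents); the competition scripts may differ in detail.

```python
import hashlib

def deduplicate(paths):
    """Keep one representative per distinct (processed) benchmark file,
    identified by a checksum of its contents."""
    seen, unique = set(), []
    for path in paths:
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest not in seen:  # first occurrence wins
            seen.add(digest)
            unique.append(path)
    return unique
```

Since the checksum is computed on the processed benchmarks, two sources that normalise to the same CHC system are counted only once.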
The repository chc-comp19-benchmarks of benchmarks selected for CHC-COMP 19 was included in the collection, because this repository contains several unique families of benchmarks that are not available in other repositories under https://github.com/chc-comp. Such benchmarks include problems generated by the Ultimate tools in the LIA-lin-arrays track.
From jayhorn-benchmarks, only the problems generated for SV-COMP 2020 were considered, which subsume the problems for SV-COMP 2019.
For ADT-nonlin, benchmarks originate from the TIP suite (originally designed for theorem proving) and from verification of programs in functional languages.
4 Benchmark Rating and Selection
LIA-nonlin  LIA-lin  LIA-nonlin-arrays  
Repository  #A /  #B /  #C  #A /  #B /  #C  #A /  #B /  #C 
aeval  11 /  15 /  28  
eldaricamisc  35 /  4 /  27  105 /  20 /  9  
extrasmalllia  21 /  24 /  10  
hcai  74 /  44 /  15  73 /  8 /  5  14 /  6 /  5 
hopv  60 /  7 /  47 /  1 /  
jayhorn  2688 /  769 /  1627  73 /  /  
kind2  250 /  455 /  33  
ldvantmed  /  25 /  54  
ldvarrays  /  127 /  419  
llreve  35 /  13 /  9  37 /  5 /  2  
seahorn  38 /  19 /  9  977 /  985 /  860  
synth/nayhorn  46 /  30 /  38  
synth/semgus  282 /  768 /  1281  
tricera  4 /  /  28 /  14 /  363  
vmt  85 /  616 /  101  
chccomp19  144 /  80 /  41  80 /  101 /  132  /  7 /  8 
svcomp  1013 /  144 /  12  2801 /  17 /  114  258 /  268 /  253 
Total  4387 /  1565 /  1811  4338 /  1806 /  1624  554 /  1201 /  2020 
LIA-nonlin  LIA-lin  LIA-nonlin-arrays  LIA-lin-arrays  LRA-TS  ADT-nonlin  
Repository  #Sel  #Sel  #Sel  #Selected  #Selected  #Selected  
adtpurified  67  
aeval  10  30  
eldaricamisc  10  30  15  39  
extrasmalllia  10  30  
hcai  20  55  15  28  5  15  39  
hopv  10  17  10  11  
jayhorn  30  90  10  10  
kind2  30  90  
ldvantmed  20  60  10  
ldvarrays  30  90  2  
llreve  15  37  10  17  31  
quic3  43  
ringen  111  
sally  174  
seahorn  15  39  30  90  
synth/nayhorn  20  60  
synth/semgus  20  60  45  135  
tricera  1  1  20  60  
vmt  30  90  98  
chccomp19  30  90  30  90  5  15  290  226  
svcomp  30  72  30  90  45  135  73  
Total  581  585  450  488  498  178 
This section describes how the benchmarks for CHC-COMP 21 were selected among the unique benchmarks summarised in Table 1. For the competition tracks LIA-lin-arrays, LRA-TS, and ADT-nonlin, the benchmark library only contains 488, 498, and 506 unique benchmarks, respectively; these sets are small enough to use all benchmarks in the competition. For the tracks LIA-nonlin, LIA-lin, and LIA-nonlin-arrays, in contrast, too many benchmarks are available, so that a representative sample of the benchmarks had to be chosen.
To gauge the difficulty of the available problems in LIA-nonlin, LIA-lin, and LIA-nonlin-arrays, a simple rating based on the performance of the CHC-COMP 20 solvers was computed. The same approach was used in the last competition, CHC-COMP 20, using solvers from CHC-COMP 19. This year, the two top-ranked competing solvers from CHC-COMP 20 were run for a few seconds on each of the benchmarks (on an Intel Core i5-650 2-core machine with 3.2 GHz; all timeouts are in terms of wall-clock time):

For LIA-nonlin and LIA-lin: Spacer (timeout 5s) and Eldarica-abs (timeout 10s);

For LIA-nonlin-arrays: Spacer (timeout 5s) and Ultimate Unihorn (timeout 10s). Since LIA-nonlin-arrays was not evaluated at CHC-COMP 20, the top-ranked solvers from the track LIA-lin-arrays were chosen.
All solvers were run using the same binary and the same options as in CHC-COMP 20. For the JVM-based tools, Eldarica-abs and Ultimate Unihorn, the higher timeout was chosen to compensate for the JVM start-up delay.
The outcomes of those test runs gave rise to three possible ratings for each benchmark:

A: both tools were able to determine the benchmark status within the given time budget.

B: only one tool could determine the benchmark status.

C: both tools timed out.
The number of benchmarks per rating is shown in Table 2. As can be seen from the table, this simple rating method separates the benchmarks into partitions of comparable size, and provides some information about the relative hardness of the problems in the different repositories.
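The rating scheme itself is simple enough to sketch directly. The following is an illustration under our own naming; the actual rating was derived from the recorded StarExec outcomes.

```python
def rate(outcome1, outcome2):
    """Rate a benchmark from the outcomes of the two reference solvers;
    each outcome is 'solved' (sat/unsat within the time budget) or
    'timeout'. A: both solved, B: one solved, C: neither solved."""
    solved = [outcome1, outcome2].count("solved")
    return {2: "A", 1: "B", 0: "C"}[solved]
```

Benchmarks rated A are thus the easiest for the reference solvers, and C-rated benchmarks are the hardest.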
From each repository, a limited number of benchmarks was then selected randomly for each of the ratings A, B, and C. If a repository contained fewer benchmarks with some particular rating than this limit, benchmarks with the next-higher rating were chosen instead. As special cases, correspondingly larger limits were used for repositories containing only A-rated, only B-rated, or only C-rated benchmarks.
The limits were chosen individually for each repository, based on a manual inspection of the repository to judge the diversity of the contained benchmarks. The chosen limits, and the numbers of selected benchmarks for each repository, are given in Table 3.
For the actual selection of benchmarks with rating X, the following Unix command was used:
> cat <rating-X-benchmark-list> | sort -R | head -n <num>
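The same random selection could equally be expressed in Python; the following sketch (with names of our choosing) mirrors the shell pipeline above.

```python
import random

def select_benchmarks(benchmarks, num, seed=None):
    """Randomly pick up to `num` benchmarks from a list, analogous to
    `sort -R | head -n num` on a benchmark list."""
    rng = random.Random(seed)  # seed only for reproducible sampling
    return rng.sample(benchmarks, min(num, len(benchmarks)))
```

Unlike `sort -R`, passing an explicit seed makes the selection reproducible, which can be useful when a competition selection needs to be re-run.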
The final set of benchmarks selected for CHC-COMP 21 can be found in the GitHub repository https://github.com/chc-comp/chc-comp21-benchmarks, and on StarExec in the public space CHC/CHC-COMP/chc-comp21-benchmarks.
5 Solvers Entering CHC-COMP 21
Solver  LIA-nonlin  LIA-lin  LIA-nonlin-arrays  LIA-lin-arrays  LRA-TS  LRA-TS-par  ADT-nonlin 
Golem  —  LIA-Lin  —  —  LRA-TS  LRA-TS  — 
PCSat  pcsat_tb_ucore_ar  pcsat_tb_ucore_ar  —  —  —  —  pcsat_tb_ucore_reduce_quals 
Spacer  LIA-NONLIN  LIA-LIN  LIA-NONLIN-ARRAYS  LIA-LIN-ARRAYS  LRA-TS  LRA-TS  ADT-NONLIN 
Ultimate TreeAutomizer  default  default  default  default  default  default  — 
Ultimate Unihorn  default  default  default  default  default  default  — 
RInGen  —  —  —  —  —  —  default 
Eldarica (Hors Concours)  def  def  def  def  —  —  def 
In total, 7 solvers were submitted to CHC-COMP 21: 6 competing solvers, and one further solver (Eldarica, co-developed by one of the competition organisers) that entered outside of the competition. A summary of the participating solvers is given in Table 4.
More details about the participating solvers are provided in the solver descriptions in Section 8. The binaries of the solvers used for the competition runs can be found in the public StarExec space CHC/CHC-COMP/chc-comp21-benchmarks.
6 Competition Results
The winners and topranked solvers of the seven tracks are:
LIA-nonlin  LIA-lin  LIA-nonlin-arrays  LIA-lin-arrays  LRA-TS  LRA-TS-par  ADT-nonlin  
Winner  Spacer  Spacer  Spacer  Spacer  Spacer  Spacer  Spacer 
Place 2  Ultimate Unihorn  Golem  Ultimate Unihorn  Ultimate Unihorn  Golem  Golem  RInGen 
Place 3  PCSat  Ultimate Unihorn  Ultimate TreeAutomizer  Ultimate TreeAutomizer  Ultimate TreeAutomizer  Ultimate TreeAutomizer  PCSat 
Detailed results for the seven tracks are provided in the tables on page 11.
6.1 Observed Issues and Fixes during the Competition Runs
Fixes in Spacer:
During the competition runs, it was observed that Spacer, in the version submitted by March 18, did not run correctly on StarExec and did not produce output for any of the benchmarks. Since this issue was discovered soon after the start of the competition runs, the organisers decided to let the Spacer authors submit a corrected version. The problem turned out to be compilation/linking-related, and the results presented in this report were produced with the fixed version of Spacer. To ensure fairness of the competition, all teams were given time until March 20 to submit revised versions of their tools.
Fixes in Golem:
One case of inconsistent results was observed in the competition runs in the track LIA-lin. For the benchmark chc-LIA-Lin_502.smt2, the tool Golem reported unsat, while Spacer and Eldarica reported sat. The author of Golem confirmed that the inconsistency was due to a bug in the loop acceleration in Golem, and provided a corrected version in which loop acceleration was switched off. The results presented in this report were produced with this fixed version of Golem.
To ensure fairness of the competition, we provide the following table comparing the results of the two versions of Golem in the track LIA-lin. The table shows that the fix in Golem led to marginally worse performance, and therefore did not put the other solvers at an unfair disadvantage:
#sat  #unsat  
Golem (original)  185  133 
Golem (fixed)  179  133 
7 Conclusions
The organisers would like to congratulate the general winner of this year’s CHCCOMP, the solver Spacer, as well as all solvers and tool authors for their excellent performance! Thanks go to everybody who has been helping with infrastructure, scripts, benchmarks, or in other ways, see the acknowledgements in the introduction; and to the HCVS workshop for hosting CHCCOMP!
The organisers also identified several questions and issues that should be discussed and addressed in the next editions, in order to keep CHCCOMP an interesting and relevant competition:

Models and counterexamples (as already discussed in [19]). A concern brought up again at the HCVS workshop is the generation of models and/or counterexample certificates, highlighting the user demand for this functionality. Since at the moment many tools do not support certificates yet, this could initially happen in the scope of a new track, or by awarding a higher number of points for each produced and verified model/counterexample.

Multi-query benchmarks (as already discussed in [19]). We propose to extend the CHC-COMP fragment of SMT-LIB to also include problems with multiple queries. This would leave the decision of how to handle multi-query benchmarks to each solver. For solvers that can only solve problems with a single query, a script is available to transform multi-query problems to single-query problems.

The LRA-TS track. This restricted track was created to enable solvers that only support traditional transition systems to enter as well. However, no such solver was submitted to CHC-COMP 21 (in contrast to CHC-COMP 20), which means that the results presented in this report do not fully reflect the state of the art for such problems. For future instances of CHC-COMP, it can be considered to replace LRA-TS with a general LRA track, dropping the restriction to problems in transition-system form.

ADT-nonlin: As mentioned in Sect. 3.3, the syntactic restrictions on the ADT tasks were needed to let more solvers participate in the competition; as a result, we had only “pure ADT” problems. However, as the technology evolves, we expect more solvers to participate in the next editions of the competition, so that ADT tasks that also use constraints in other theories (if collected in sufficient numbers) could form new tracks.

A bigger set of benchmarks is needed, and all users and tool authors are encouraged to submit benchmarks! In particular, in the LIA-nonlin, LRA-TS, and ADT-nonlin tracks, the competition results indicate that more and/or harder benchmarks are required.
8 Solver Descriptions
The tool descriptions in this section were contributed by the tool submitters, and the copyright on the texts remains with the individual authors.
Golem
Martin Blicha 
Università della Svizzera italiana, Switzerland 
Algorithm.
Golem is a new CHC solver, still under active development. It can solve systems of linear clauses with linear real or integer arithmetic as the background theory, and it is able to provide witnesses for both satisfiable and unsatisfiable systems. Its current reasoning engine is a re-implementation of the Impact algorithm [16], and thus falls into the category of interpolation-based model-checking approaches.
Architecture and Implementation.
Golem is implemented in C++ and built on top of the interpolating SMT solver OpenSMT [10] which is used for both satisfiability solving and interpolation. The only dependencies are those inherited from OpenSMT: Flex, Bison and GMP libraries.
Configuration in CHC-COMP 21.
Golem was run with its default settings, except that its experimental loop acceleration module had to be disabled, because it contained a bug in the submitted version. Note that the SMT theory needs to be specified:
$ golem --logic QF_LRA --accelerate-loops=false
$ golem --logic QF_LIA --accelerate-loops=false
http://verify.inf.usi.ch/golem
MIT LICENSE
PCSat
Yu Gu 
University of Tsukuba, Japan 
Hiroshi Unno 
University of Tsukuba, Japan 
Algorithm.
PCSat is a solver for a general class of second-order constraints. Its applications include, but are not limited to, branching-time temporal verification, relational verification, dependent refinement type inference, program synthesis, and infinite-state game solving.
PCSat is based on CounterExample-Guided Inductive Synthesis (CEGIS), with support for multiple synthesis engines, including template-based [22], decision-tree-based [15], and graphical-model-based [20] ones.

Architecture and Implementation.
PCSat is designed and implemented as a highly configurable solver, allowing us to test various combinations of synthesis engines, example sampling methods, template refinement strategies, and qualifier generators. This design is enabled by the powerful module system and meta-programming features of the OCaml functional programming language. PCSat uses Z3 as the backend SMT solver.
News in 2021.
We added support for the theory of algebraic datatypes and implemented a pre-processor for eliminating irrelevant arguments of predicates.
Configuration in CHC-COMP 21.
PCSat was run with the solver configuration file “pcsat_tb_ucore_ar.json” in the LIA-nonlin and LIA-lin tracks, and “pcsat_tb_ucore_reduce_quals.json” in the ADT-nonlin track. Both configurations enable the template-based synthesis engine.
https://github.com/hiroshiunno/coar
Apache License 2.0
RInGen v1.1
Yurii Kostyukov 
Saint Petersburg State University, JetBrains Research, Russia 
Dmitry Mordvinov 
Saint Petersburg State University, JetBrains Research, Russia 
Algorithm.
RInGen stands for Regular Invariant Generator, where regular invariants [14] are represented by finite tree automata. While invariant representations based on first-order logic (FOL) can only access finitely many subterms, regular invariants are able to “scan” an ADT term to unbounded depth via automaton rules. Tree automata also enjoy useful decidability properties, and the corresponding regular tree languages are closed under all set operations, which makes regular invariants a promising alternative to FOL-based invariant representations.
RInGen rewrites a system of CHCs over ADTs into a formula over uninterpreted function symbols by eliminating all disequalities, testers, and selectors from the clause bodies. The satisfiability problem modulo the theory of ADTs is thereby reduced to satisfiability modulo the theory of uninterpreted functions with equality (EUF). After that, an off-the-shelf finite model finder is applied to build a finite model of the reduced verification conditions. Finally, using the correspondence between finite models and tree automata, the automaton representing the safe inductive invariant of the original system is obtained. Full algorithmic details of RInGen can be found in [14].
Architecture and Implementation.
RInGen accepts input in the SMT-LIB2 format and produces CHCs over pure ADT sorts in SMT-LIB2 and Prolog. It takes conditions with a property as input and checks whether the property holds, returning SAT and a safe inductive invariant, or terminating with UNSAT otherwise. We use CVC4 (with cvc4 --finite-model-find) as the backend to find regular models. Besides regular models, the finite model finding approach of CVC4 [17] v1.8, based on quantifier instantiation, provides us with sound satisfiability checking.
Configuration in CHC-COMP 21.
The tool is built and run with the following arguments:
solve --timelimit $tlimit --quiet --output-directory "$dir" --cvc4f "$input"
https://github.com/Columpio/RInGen/releases/tag/v1.1
BSD 3Clause License
Spacer
Hari Govind V K 
University of Waterloo, Canada 
Arie Gurfinkel 
University of Waterloo, Canada 
Algorithm.
Spacer [13] is an IC3/PDR-style algorithm for solving linear and non-linear CHCs. Given a set of CHCs, it iteratively proves the unreachability of false at larger and larger depths, until a model is found or the set of CHCs is proven unsatisfiable. To prove unreachability at a particular depth, Spacer recursively generates sets of predecessor states (called proof obligations, or pobs) from which false can be derived, and blocks them. Once a pob is blocked, Spacer generalizes the proof to learn a lemma that blocks multiple pobs. Spacer uses many heuristics to learn lemmas, including interpolation, inductive generalization, and quantifier generalization. The latest version of Spacer adds a new heuristic for learning lemmas [11, 12]. The current implementation of Spacer supports linear and non-linear CHCs in the theories of arrays, linear arithmetic, fixed-size bit-vectors, and algebraic datatypes. Spacer can generate both quantified and quantifier-free models, as well as resolution proofs of unsatisfiability.
Architecture and Implementation.
Spacer is implemented on top of the Z3 theorem prover. It uses many SMT solvers implemented in Z3. Additionally, it implements an interpolating SMT solver.
Configuration in CHC-COMP 21.
Spacer has several configurations. The following options are common to all configurations:
fp.xform.tail_simplifier_pve=false fp.validate=true fp.spacer.mbqi=false fp.spacer.use_iuc=true
To activate global guidance [11], we use the following options:
fp.spacer.global=true fp.spacer.concretize=true fp.spacer.conjecture=true fp.spacer.expand_bnd=true
To activate quantifier generalization [5], we use:
fp.spacer.q3.use_qgen=true fp.spacer.q3.instantiate=true fp.spacer.q3=true fp.spacer.ground_pobs=false
In the arithmetic tracks (LRA-TS, LIA-LIN, LIA-NONLIN), we ran two threads in parallel. The first thread ran Spacer with global guidance. The second thread ran Z3’s BMC engine:
fp.engine=bmc
In the array tracks (LIA-LIN-ARRAYS, LIA-NONLIN-ARRAYS), we again ran two threads in parallel. The first thread used both global guidance and quantifier generalization. The second thread used only quantifier generalization. In the ADT tracks (ADT-LIN, ADT-NONLIN), we ran one thread, which used only global guidance. Additionally, for the ADT tracks, we turned off one of the optimizations in Spacer:
fp.spacer.use_inc_clause=false
https://github.com/Z3Prover/z3
MIT License
Ultimate TreeAutomizer 0.1.25-6b0a1c7
Matthias Heizmann 
University of Freiburg, Germany 
Daniel Dietsch 
University of Freiburg, Germany 
Jochen Hoenicke 
University of Freiburg, Germany 
Alexander Nutz 
University of Freiburg, Germany 
Andreas Podelski 
University of Freiburg, Germany 
Algorithm.
The Ultimate TreeAutomizer solver implements an approach based on tree automata [3]. In this approach, potential counterexamples to satisfiability are considered as a regular set of trees. In an iterative CEGAR loop, we analyze potential counterexamples. Real counterexamples lead to an unsat result. Spurious counterexamples are generalized to a regular set of spurious counterexamples and subtracted from the set of potential counterexamples that still have to be considered. If we detect that all potential counterexamples are spurious, the result is sat. The generalization above is based on tree interpolation, and regular sets of trees are represented as tree automata.
Architecture and Implementation.
TreeAutomizer is a toolchain of the Ultimate framework. This toolchain first parses the CHC input and then runs the treeautomizer plugin, which implements the above-mentioned algorithm. We obtain tree interpolants from the SMT solver SMTInterpol [8] (https://ultimate.informatik.uni-freiburg.de/smtinterpol/). For checking satisfiability, we use the Z3 SMT solver (https://github.com/Z3Prover/z3). The tree automata are implemented in Ultimate’s automata library (https://ultimate.informatik.uni-freiburg.de/automata-library). The Ultimate framework is written in Java and built upon the Eclipse Rich Client Platform (RCP). The source code is available at GitHub (https://github.com/ultimate-pa/).
Configuration in CHC-COMP 21.
Our StarExec archive for the competition ships with the bin/starexec_run_default shell script, which calls the Ultimate command line interface with the TreeAutomizer.xml toolchain file and the TreeAutomizerHopcroftMinimization.epf settings file. Both files can be found in the toolchain (resp. settings) folder of Ultimate’s repository.
https://ultimate.informatik.unifreiburg.de/
LGPLv3 with a linking exception for Eclipse RCP
Ultimate Unihorn 0.1.25-6b0a1c7
Matthias Heizmann 
University of Freiburg, Germany 
Daniel Dietsch 
University of Freiburg, Germany 
Jochen Hoenicke 
University of Freiburg, Germany 
Alexander Nutz 
University of Freiburg, Germany 
Andreas Podelski 
University of Freiburg, Germany 
Algorithm.
Ultimate Unihorn reduces the satisfiability problem for a set of constrained Horn clauses to a software verification problem. In a first step, Unihorn applies a yet unpublished translation in which the constrained Horn clauses are translated into a nondeterministic recursive program whose correctness is specified by an assert statement. The program is correct (i.e., no execution violates the assert statement) if and only if the set of CHCs is satisfiable. For checking whether the recursive program satisfies its specification, Unihorn uses Ultimate Automizer [6], which implements an automata-based approach to software verification [7].
Architecture and Implementation.
Ultimate Unihorn is a toolchain of the Ultimate framework. This toolchain first parses the CHC input and then runs the chctoboogie plugin, which translates the CHCs into a recursive program. We use the Boogie language to represent that program. Afterwards, the default toolchain for verifying a recursive Boogie program with Ultimate Automizer is applied. The Ultimate framework shares the libraries for handling SMT formulas with the SMTInterpol SMT solver. While verifying a program, Ultimate Automizer needs SMT solvers for checking satisfiability, for computing Craig interpolants, and for computing unsatisfiable cores. The version of Unihorn that participated in the competition used the SMT solvers SMTInterpol (https://ultimate.informatik.uni-freiburg.de/smtinterpol/) and Z3 (https://github.com/Z3Prover/z3). The Ultimate framework is written in Java and built upon the Eclipse Rich Client Platform (RCP). The source code is available at GitHub (https://github.com/ultimate-pa/).
Configuration in CHC-COMP 2021.
Our StarExec archive for the competition ships with the bin/starexec_run_default shell script, which calls the Ultimate command line interface with the AutomizerCHC.xml toolchain file and the AutomizerCHC_No_Goto.epf settings file. Both files can be found in the toolchain (resp. settings) folder of Ultimate’s repository.
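For illustration, a run script in the spirit of the one described above might look as follows; the exact flag names and relative paths are assumptions about Ultimate's command line interface, not copied from the competition archive.

```shell
#!/bin/bash
# Sketch of a StarExec run script for Unihorn (flag names and paths
# are assumptions, not taken from the actual archive).
# StarExec passes the benchmark file as the first argument.
./Ultimate \
    -tc toolchain/AutomizerCHC.xml \
    -s  settings/AutomizerCHC_No_Goto.epf \
    -i  "$1"
```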
https://ultimate.informatik.uni-freiburg.de/
LGPLv3 with a linking exception for Eclipse RCP
Eldarica v2.0.6 (Hors Concours)
Zafer Esen 
Uppsala University, Sweden 
Hossein Hojjat 
University of Tehran, Iran 
Philipp Rümmer 
Uppsala University, Sweden 
Algorithm.
Eldarica [9] is a Horn solver applying classical algorithms from model checking: predicate abstraction and counterexample-guided abstraction refinement (CEGAR). Eldarica can solve Horn clauses over linear integer arithmetic, arrays, algebraic datatypes, and bit-vectors. It can process Horn clauses and programs in a variety of formats, implements sophisticated algorithms to solve tricky systems of clauses without diverging, and offers an elegant API for programmatic use.
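The CEGAR loop behind such solvers can be pictured on a deliberately tiny model. The Python sketch below is a finite-domain simplification of our own: real solvers such as Eldarica work symbolically via SMT and derive refinement predicates from interpolants, whereas here refinement simply picks the next predicate from a fixed candidate pool.

```python
# Toy predicate-abstraction CEGAR loop (all names and the finite
# domain are illustrative simplifications).  Concrete system:
#   init: x = 0;   loop: x := x + 2;   safety: x never equals 5.
DOMAIN = range(0, 8)

def init(x): return x == 0
def step(x): return x + 2
def bad(x):  return x == 5

def alpha(x, preds):
    """Abstract a concrete state to a valuation of the predicates."""
    return tuple(p(x) for p in preds)

def abstract_reach(preds):
    """Existential abstraction: abstract states reachable from init."""
    reach = {alpha(x, preds) for x in DOMAIN if init(x)}
    while True:
        succ = {alpha(step(x), preds) for x in DOMAIN
                if alpha(x, preds) in reach and step(x) in DOMAIN}
        if succ <= reach:
            return reach
        reach |= succ

def concrete_reach():
    reach = {x for x in DOMAIN if init(x)}
    while True:
        succ = {step(x) for x in reach if step(x) in DOMAIN}
        if succ <= reach:
            return reach
        reach |= succ

def cegar(candidate_preds):
    """Abstract-check / counterexample-check / refine loop."""
    preds = []
    for pred in [None] + list(candidate_preds):
        if pred is not None:
            preds.append(pred)  # refinement step
        reach = abstract_reach(preds)
        if not any(alpha(x, preds) in reach and bad(x) for x in DOMAIN):
            return "safe"       # abstraction proves the property
        if any(bad(x) for x in concrete_reach()):
            return "unsafe"     # counterexample is real
        # Otherwise the counterexample is spurious: refine with the
        # next candidate predicate (stand-in for interpolation).
    return "unknown"
```

With the empty predicate set the bad state looks reachable, the abstract counterexample is found to be spurious, and adding the parity predicate `x % 2 == 0` suffices to prove safety.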
Architecture and Implementation.
Eldarica is entirely implemented in Scala and depends only on Java and Scala libraries, which implies that it can be used on any platform with a JVM. For computing abstractions of systems of Horn clauses and inferring new predicates, Eldarica invokes the SMT solver Princess [18] as a library.
News in 2021.
Compared to the last competition, Eldarica now uses a new array solver in the tracks LIA-nonlin-arrays and LIA-lin-arrays.
Configuration in CHC-COMP 2021.
In the competition, Eldarica is run with the option abstractPO, which enables a simple portfolio mode: two instances of the solver are run in parallel, one with the default options and one with the option abstract:off, which switches off the interpolation abstraction technique.
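Such a portfolio amounts to racing two configurations and taking the first answer. The Python sketch below is a generic illustration (Eldarica itself manages its two solver instances internally); the two worker functions are hypothetical stand-ins for the two configurations.

```python
import concurrent.futures as cf

def run_portfolio(workers, problem):
    """Run several solver configurations on the same problem in
    parallel and return the result of whichever finishes first."""
    with cf.ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(w, problem) for w in workers]
        done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result()

# Hypothetical stand-ins for the two configurations
# (default options vs. abstract:off):
def default_config(problem):
    return ("sat", "default")

def abstract_off(problem):
    return ("sat", "abstract:off")
```

Since both instances decide the same problem, whichever finishes first determines the answer; the slower run's result is simply discarded.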
https://github.com/uuverifiers/eldarica
BSD licence
References
 [1]
 [2] Nikolaj Bjørner, Arie Gurfinkel, Kenneth L. McMillan & Andrey Rybalchenko (2015): Horn Clause Solvers for Program Verification. In: Fields of Logic and Computation II – Essays Dedicated to Yuri Gurevich on the Occasion of His 75th Birthday, pp. 24–51, doi:http://dx.doi.org/10.1007/978-3-319-23534-9_2.
 [3] Daniel Dietsch, Matthias Heizmann, Jochen Hoenicke, Alexander Nutz & Andreas Podelski (2019): Ultimate TreeAutomizer (CHC-COMP Tool Description). In: HCVS/PERR@ETAPS, EPTCS 296, pp. 42–47, doi:http://dx.doi.org/10.4204/EPTCS.296.7.
 [4] Sergey Grebenshchikov, Nuno P. Lopes, Corneliu Popeea & Andrey Rybalchenko (2012): Synthesizing Software Verifiers from Proof Rules. In: PLDI, ACM, pp. 405–416, doi:http://dx.doi.org/10.1145/2254064.2254112.
 [5] Arie Gurfinkel, Sharon Shoham & Yakir Vizel (2018): Quantifiers on Demand. In Shuvendu K. Lahiri & Chao Wang, editors: Automated Technology for Verification and Analysis – 16th International Symposium, ATVA 2018, Los Angeles, CA, USA, October 7–10, 2018, Proceedings, Lecture Notes in Computer Science 11138, Springer, pp. 248–266, doi:http://dx.doi.org/10.1007/978-3-030-01090-4_15.
 [6] Matthias Heizmann, Yu-Fang Chen, Daniel Dietsch, Marius Greitschus, Jochen Hoenicke, Yong Li, Alexander Nutz, Betim Musa, Christian Schilling, Tanja Schindler & Andreas Podelski (2018): Ultimate Automizer and the Search for Perfect Interpolants (Competition Contribution). In: TACAS (2), LNCS 10806, Springer, pp. 447–451, doi:http://dx.doi.org/10.1007/978-3-319-89963-3_30.
 [7] Matthias Heizmann, Jochen Hoenicke & Andreas Podelski (2013): Software Model Checking for People Who Love Automata. In: CAV, LNCS 8044, Springer, pp. 36–52, doi:http://dx.doi.org/10.1007/978-3-642-39799-8_2.
 [8] Jochen Hoenicke & Tanja Schindler (2018): Efficient Interpolation for the Theory of Arrays. In: IJCAR, LNCS 10900, Springer, pp. 549–565, doi:http://dx.doi.org/10.1007/978-3-319-94205-6_36.
 [9] Hossein Hojjat & Philipp Rümmer (2018): The ELDARICA Horn Solver. In Nikolaj Bjørner & Arie Gurfinkel, editors: 2018 Formal Methods in Computer Aided Design, FMCAD, IEEE, pp. 1–7, doi:http://dx.doi.org/10.23919/FMCAD.2018.8603013.
 [10] Antti E. J. Hyvärinen, Matteo Marescotti, Leonardo Alt & Natasha Sharygina (2016): OpenSMT2: An SMT Solver for Multicore and Cloud Computing. In: Theory and Applications of Satisfiability Testing – SAT 2016 – 19th International Conference, Bordeaux, France, July 5–8, 2016, Proceedings, Lecture Notes in Computer Science 9710, Springer, pp. 547–553, doi:http://dx.doi.org/10.1007/978-3-319-40970-2_35.
 [11] Hari Govind V K, Yu-Ting Chen, Sharon Shoham & Arie Gurfinkel (2020): Global Guidance for Local Generalization in Model Checking. In: Computer Aided Verification – 32nd International Conference, CAV 2020, Los Angeles, CA, USA, July 21–24, 2020, Proceedings, Part II, Lecture Notes in Computer Science 12225, Springer, pp. 101–125, doi:http://dx.doi.org/10.1007/978-3-030-53291-8_7.
 [12] Hari Govind V. K., Grigory Fedyukovich & Arie Gurfinkel (2020): Word Level Property Directed Reachability. In: IEEE/ACM International Conference On Computer Aided Design, ICCAD 2020, San Diego, CA, USA, November 2–5, 2020, IEEE, pp. 107:1–107:9, doi:http://dx.doi.org/10.1145/3400302.3415708.
 [13] Anvesh Komuravelli, Arie Gurfinkel & Sagar Chaki (2016): SMT-based Model Checking for Recursive Programs. Formal Methods Syst. Des. 48(3), pp. 175–205, doi:http://dx.doi.org/10.1007/s10703-016-0249-4.
 [14] Yurii Kostyukov, Dmitry Mordvinov & Grigory Fedyukovich (2021): Beyond the Elementary Representations of Program Invariants over Algebraic Data Types. In: PLDI ’21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, Virtual Event, Canada, June 20–25, 2021, ACM, pp. 451–465, doi:http://dx.doi.org/10.1145/3453483.3454055.
 [15] Satoshi Kura, Hiroshi Unno & Ichiro Hasuo (2021): Decision Tree Learning in CEGIS-Based Termination Analysis. In: Computer Aided Verification – 33rd International Conference, CAV 2021, Virtual Event, July 20–23, 2021, Proceedings, Part II, Lecture Notes in Computer Science 12760, Springer, pp. 75–98, doi:http://dx.doi.org/10.1007/978-3-030-81688-9_4.
 [16] Kenneth L. McMillan (2006): Lazy Abstraction with Interpolants. In Thomas Ball & Robert B. Jones, editors: Computer Aided Verification, 18th International Conference, CAV 2006, Seattle, WA, USA, August 17–20, 2006, Proceedings, Lecture Notes in Computer Science 4144, Springer, pp. 123–136, doi:http://dx.doi.org/10.1007/11817963_14.
 [17] Andrew Reynolds, Cesare Tinelli, Amit Goel & Sava Krstić (2013): Finite Model Finding in SMT. In: International Conference on Computer Aided Verification, Lecture Notes in Computer Science 8044, Springer, pp. 640–655, doi:http://dx.doi.org/10.1007/978-3-642-39799-8_42.

 [18] Philipp Rümmer (2008): A Constraint Sequent Calculus for First-Order Logic with Linear Integer Arithmetic. In: Proceedings, 15th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, LNCS 5330, Springer, pp. 274–289, doi:http://dx.doi.org/10.1007/978-3-540-89439-1_20.
 [19] Philipp Rümmer (2020): Competition Report: CHC-COMP-20. In Laurent Fribourg & Matthias Heizmann, editors: Proceedings 8th International Workshop on Verification and Program Transformation and 7th Workshop on Horn Clauses for Verification and Synthesis, VPT/HCVS@ETAPS 2020, Dublin, Ireland, 25–26th April 2020, EPTCS 320, pp. 197–219, doi:http://dx.doi.org/10.4204/EPTCS.320.15.
 [20] Yuki Satake, Hiroshi Unno & Hinata Yanagi (2020): Probabilistic Inference for Predicate Constraint Satisfaction. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7–12, 2020, AAAI Press, pp. 1644–1651, doi:http://dx.doi.org/10.1609/aaai.v34i02.5526.
 [21] Aaron Stump, Geoff Sutcliffe & Cesare Tinelli (2014): StarExec: A Cross-Community Infrastructure for Logic Solving. In Stéphane Demri, Deepak Kapur & Christoph Weidenbach, editors: Automated Reasoning – 7th International Joint Conference, IJCAR, LNCS 8562, Springer, pp. 367–373, doi:http://dx.doi.org/10.1007/978-3-319-08587-6_28.
 [22] Hiroshi Unno, Tachio Terauchi & Eric Koskinen (2021): Constraint-based Relational Verification. In: Computer Aided Verification – 33rd International Conference, CAV 2021, Virtual Event, July 20–23, 2021, Proceedings, Part I, Lecture Notes in Computer Science 12759, Springer, pp. 742–766, doi:http://dx.doi.org/10.1007/978-3-030-81685-8_35.