 # Schur Number Five

We present the solution of a century-old problem known as Schur Number Five: What is the largest (natural) number n such that there exists a five-coloring of the positive numbers up to n without a monochromatic solution of the equation a + b = c? We obtained the solution, n = 160, by encoding the problem into propositional logic and applying massively parallel satisfiability solving techniques on the resulting formula. We constructed and validated a proof of the solution to increase trust in the correctness of the multi-CPU-year computations. The proof is two petabytes in size and was certified using a formally verified proof checker, demonstrating that any result by satisfiability solvers, no matter how large, can now be validated using highly trustworthy systems.


## Introduction

At the beginning of the 20th century, Issai Schur studied whether every coloring of the positive (natural) numbers with finitely many colors results in monochromatic solutions of the equation $a + b = c$. This work gave rise to the concept of so-called Schur numbers: Schur number $k$, denoted by $S(k)$, is defined as the largest number $n$ for which there exists a $k$-coloring of the positive numbers up to $n$ with no monochromatic solution of $a + b = c$. (An alternative definition used in the literature picks the smallest $n$ such that all $k$-colorings of 1 to $n$ result in a monochromatic solution. The values of $S(k)$ differ by one, depending on the definition.) For example, $S(2) = 4$: Assume we use the two colors red and blue. If we color 1 with red, we have to color 2 with blue due to $1 + 1 = 2$. This forces us to color 4 with red because of $2 + 2 = 4$. After this, 3 must become blue due to $1 + 3 = 4$. But then, no matter if we color 5 with red or blue, we end up with a monochromatic solution of $1 + 4 = 5$ or $2 + 3 = 5$.
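The forced-coloring argument for two colors can be replayed mechanically for small numbers of colors. The following is a minimal backtracking sketch, not the method used in this paper; the function name `schur_number` and the color-symmetry shortcut (only trying colors already in use plus one fresh color) are illustrative assumptions.

```python
def schur_number(k):
    """Largest n such that 1..n has a k-coloring without monochromatic a + b = c.

    Plain backtracking; only practical for very small k.
    """
    best = 0
    color = {}  # number -> color index (0-based)

    def allowed(n, c):
        # n may receive color c unless some a + b = n has a and b both colored c.
        # The case a == b is included (e.g. 1 + 1 = 2).
        return not any(color[a] == c and color[n - a] == c
                       for a in range(1, n // 2 + 1))

    def extend(n, used):
        # Numbers 1..n-1 are colored; try to color n.
        nonlocal best
        best = max(best, n - 1)
        # Assumed symmetry shortcut: reuse a color already in use, or open one new color.
        for c in range(min(used + 1, k)):
            if allowed(n, c):
                color[n] = c
                extend(n + 1, max(used, c + 1))
                del color[n]

    extend(1, 0)
    return best
```

Calling `schur_number(2)` explores exactly the case analysis described in the text. The search space grows far too quickly for this approach to say anything about five colors.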

Although Schur’s Theorem states that $S(k)$ is finite for any finite value of $k$ [Schur1917], determining the exact values of $S(k)$ is an open problem in elementary number theory [Guy1994]. In fact, only the values $S(1) = 1$, $S(2) = 4$, $S(3) = 13$, and $S(4) = 44$ have been known so far [Golomb and Baumert1965]. We came up with a highly optimized automated-reasoning method for showing that $S(5) = 160$.

To obtain this solution, we first encoded the Schur Number Five problem into propositional logic and then applied satisfiability (SAT) solving techniques to solve the resulting formula. This approach has been successful in recent years, leading to the solution of hard open problems such as the problem of determining the sixth van der Waerden number [Kouril and Paul2008], the Erdős discrepancy problem [Konev and Lisitsa2015], and the Pythagorean triples problem [Heule, Kullmann, and Marek2016]. Trying to solve a SAT encoding for Schur Number Five with off-the-shelf SAT solving tools turned out to be a hopeless endeavor. We therefore came up with a dedicated approach, which is intended to be applicable to related problems as well. We modified existing tools to efficiently solve our encoding. Still, even with our optimized approach, the total computational effort to solve the problem was over CPU years.

If it takes a computer several CPU years to solve a problem, it is only natural to question the correctness of the supposed solution. To deal with this issue, we automatically constructed a proof of the propositional formula that encodes the main statement. The size of this proof is more than two petabytes, making it about ten times larger than “the largest math proof ever” [Lamb2016]. Despite its tremendous size, we were able to verify the correctness of the proof with a formally verified proof checker. Due to recent progress in proof validation [Cruz-Filipe, Marques-Silva, and Schneider-Kamp2017], checking the correctness of such proofs is now nearly as efficient as the actual construction of the proofs by a SAT solver [Cruz-Filipe et al.2017, Lammich2017]. In our case, the time spent on proof checking was a little more than CPU years.

The main contributions of this paper are as follows:

• We constructed a propositional formula that is satisfiable if and only if $S(5) \geq 161$. Our proof of unsatisfiability for this formula is over two petabytes in size.

• We certified the proof using a program formally verified by ACL2 [Kaufmann and Moore1997], thereby providing high confidence in the correctness of our result.

• We enumerated all five-colorings of the numbers 1 to 160 without a monochromatic solution of $a + b = c$.

• We designed a decision heuristic that allows solving Schur number problems efficiently and enables linear-time speedups even when using thousands of CPUs.

• We developed an efficient hardness predictor for partitioning a hard problem into millions of easy subproblems.

## Schur Numbers and Variants

Schur number $k$, denoted by $S(k)$, is defined as the largest (natural) number $n$ such that there exists a $k$-coloring of the numbers $1$ to $n$ without a monochromatic solution of the equation $a + b = c$ with $1 \leq a, b, c \leq n$. The first Schur numbers $S(1) = 1$, $S(2) = 4$, and $S(3) = 13$ can be determined manually while $S(4) = 44$ was computed decades ago [Golomb and Baumert1965]. The best known lower bounds for higher Schur numbers are: $S(5) \geq 160$ [Exoo1994], $S(6) \geq 536$, and $S(7) \geq 1680$ [Fredricksen and Sweet2000]. We prove that $S(5) = 160$.

The early upper bounds  [Schur1917] have later been improved to  [Irving1973]. Upper bounds on can also be obtained via the connection to the Ramsey numbers , which denote the smallest such that any -coloring of the edges of the fully connected graph on vertices yields a monochromatic triangle:  [Schur1917]. The first three numbers are known: , , and  [Greenwood and Gleason1955].

Several variants of Schur numbers have been proposed. The oldest variant, known as weak Schur number and denoted by , requires to be smaller than , thus weakening to  [Irving1973]. Hence, . Only the four smallest weak Schur numbers are known: , , , and  [Blanchard, Harary, and Reis2006].

Another variant is the modular Schur number , denoted by , asking for the largest such that a -coloring of the numbers to exists without a monochromatic solution of the equation with [Abbott and Wang1977]. This variant is stronger than the classical notion, hence . However, for all known Schur numbers it holds that and this equality is conjectured to hold in general [Abbott and Wang1977]. Our result implies the equality for .

An even stronger variant is the palindromic Schur number , denoted by , for which the numbers and with have the same color—except in case . This variant is also known as symmetric sum-free sets [Fredricksen and Sweet2000] and is mainly used to determine lower bounds for the weaker variants. We have in general. However, the numbers are equal for the known values. This is a new result for .

The big question is whether for any

. We can probably answer this question only if the answer is no. Already showing this equality for

is expected to be extremely challenging.

## Technical Background

Below we present the most important background concepts related to the more technical part of this paper.

#### Propositional logic.

We consider propositional formulas in conjunctive normal form (CNF), which are defined as follows. A literal is either a variable $x$ (a positive literal) or the negation $\bar{x}$ of a variable $x$ (a negative literal). The complementary literal $\bar{l}$ of a literal $l$ is defined as $\bar{l} = \bar{x}$ if $l = x$ and $\bar{l} = x$ if $l = \bar{x}$. A clause is a disjunction of literals. A formula is a conjunction of clauses. For a literal, clause, or formula $F$, $\mathit{var}(F)$ denotes the variables in $F$. For convenience, we treat $\mathit{var}(F)$ as a variable if $F$ is a literal, and as a set of variables otherwise.

#### Satisfiability.

An assignment is a function from a set of variables to the truth values 1 (true) and 0 (false). An assignment is total w.r.t. a formula if it assigns a truth value to all variables occurring in the formula; otherwise it is partial. A literal $l$ is satisfied (falsified) by an assignment $\alpha$ if $l$ is positive and $\alpha(\mathit{var}(l)) = 1$ ($\alpha(\mathit{var}(l)) = 0$, resp.) or if it is negative and $\alpha(\mathit{var}(l)) = 0$ ($\alpha(\mathit{var}(l)) = 1$, resp.). We also denote with $\alpha$ the conjunction of literals that are satisfied by that assignment; such a conjunction is called a cube. A clause is satisfied by an assignment $\alpha$ if it contains a literal that is satisfied by $\alpha$. Finally, a formula is satisfied by an assignment $\alpha$ if all its clauses are satisfied by $\alpha$. A formula is satisfiable if there exists an assignment that satisfies it; otherwise it is unsatisfiable. A formula $F$ entails a formula $F'$, denoted by $F \models F'$, if every assignment that satisfies $F$ also satisfies $F'$. $F$ weakly entails $F'$, denoted by $F \models_w F'$, if satisfiability of $F$ implies satisfiability of $F'$.

#### Proofs of Unsatisfiability.

It is easy to check that an alleged satisfying assignment is valid. However, a certificate that a formula has no solution (i.e., is unsatisfiable) can be huge and costly to validate. We produce proofs of unsatisfiability in the DRAT proof system [Järvisalo, Heule, and Biere2012], which is the standard in state-of-the-art SAT solving. Given a formula $F$, a DRAT proof of unsatisfiability is a sequence of clauses $C_1, \dots, C_m$ where $C_m$ is the empty clause $\bot$. For every clause $C_i$, it must hold that $C_i$ is a resolution asymmetric tautology (RAT) with respect to $F \land C_1 \land \dots \land C_{i-1}$. The addition of a RAT to a formula preserves satisfiability, and since the empty clause is trivially unsatisfiable, a DRAT proof witnesses the unsatisfiability of the original formula $F$. DRAT also allows the deletion of clauses from a formula to improve the performance of proof validation. Note that clause deletion preserves satisfiability.
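To make the RAT condition concrete, here is a small sketch of a deletion-free proof check. It is not the verified checker used in this work. Clauses are lists of nonzero integers in the usual DIMACS style, and the sketch assumes the common DRAT convention that the first literal of a clause is the pivot for the RAT check; all function names are illustrative.

```python
def unit_propagate(clauses, assignment):
    """Extend a set of true literals by unit propagation; None signals a conflict."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue  # clause is already satisfied
            open_lits = {lit for lit in clause if -lit not in assignment}
            if not open_lits:
                return None  # all literals are false: conflict
            if len(open_lits) == 1:
                assignment.add(next(iter(open_lits)))
                changed = True
    return assignment


def is_at(clauses, clause):
    """Asymmetric tautology check: falsifying `clause` must propagate to a conflict."""
    if any(-lit in clause for lit in clause):
        return True  # a tautological clause holds trivially
    return unit_propagate(clauses, {-lit for lit in clause}) is None


def is_rat(clauses, clause):
    """RAT check on the first literal (assumed pivot convention)."""
    if is_at(clauses, clause):
        return True
    if not clause:
        return False
    pivot = clause[0]
    for other in clauses:
        if -pivot in other:
            resolvent = list(clause) + [lit for lit in other if lit != -pivot]
            if not is_at(clauses, resolvent):
                return False
    return True


def check_drat(formula, proof):
    """Return True if every proof clause is a RAT and the empty clause is derived."""
    clauses = [list(c) for c in formula]
    for clause in proof:
        if not is_rat(clauses, clause):
            return False
        clauses.append(list(clause))
        if not clause:
            return True
    return False
```

For instance, the four binary clauses over two variables are refuted by the two-step proof `[[2], []]`. Clause deletion is omitted here because it only affects performance, not soundness of a refutation.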

## Encoding

To solve Schur Number Five, we first encode the existence of certificates as propositional formulas and then exploit the strength of a parallel SAT solver to efficiently determine whether these formulas are satisfiable. A certificate is a -coloring of the numbers to with no monochromatic solution of for . A certificate provides a lower bound for the corresponding Schur problem: . The size of a certificate is . An extreme certificate is a certificate of maximum size. Figure 1 shows some extreme certificates for the known Schur numbers as well as a palindromic certificate — which is also an extreme certificate following the presented upper bound result. There is one extreme certificate modulo symmetry (i.e., modulo permuting the colors) with . These certificates are palindromes and thus modular. There exist three extreme certificates modulo symmetry. All of them are modular and palindromes. They differ only regarding the color of number , which can have any color. There are extreme certificates modulo symmetry, of which are modular and palindromes [Fredricksen and Sweet2000].

To establish that $S(5) = 160$, we need to show that there exists a certificate of size 160 but no certificate of size 161. We thus define a family of propositional formulas $F^k_n$, each of which encodes the existence of a certificate of size $n$ with $k$ colors. A satisfying assignment of $F^5_{160}$ can be computed in less than a minute by enforcing that the initial numbers cannot have the last color [Fredricksen and Sweet2000]. The main challenge addressed in this paper is proving unsatisfiability of $F^5_{161}$ to show that $S(5) \leq 160$. This requires many CPU years of computation even with optimized heuristics.

For the formula $F^k_n$, we use Boolean variables $x_{i,c}$ with $i \in \{1, \dots, n\}$ and $c \in \{1, \dots, k\}$. Intuitively, a variable $x_{i,c}$ is true if and only if number $i$ has color $c$ in the certificate. The formula has three kinds of clauses: positive, negative, and optional. The positive clauses encode that every number must have at least one color. They are of the form $(x_{i,1} \lor x_{i,2} \lor \dots \lor x_{i,k})$ for $i \in \{1, \dots, n\}$. The negative clauses encode that for every solution of the equation $a + b = c$, the numbers $a$, $b$, and $a + b$ cannot have the same color $j$. They are of the form $(\bar{x}_{a,j} \lor \bar{x}_{b,j} \lor \bar{x}_{a+b,j})$ with $a + b \leq n$ and $j \in \{1, \dots, k\}$. Finally, the optional clauses encode that every number has at most one color. They are of the form $(\bar{x}_{i,j} \lor \bar{x}_{i,h})$ for $1 \leq j < h \leq k$. A commonly used SAT preprocessing technique, called blocked clause elimination [Järvisalo, Biere, and Heule2012], would remove the optional clauses. However, the optional clauses are required when counting or enumerating certificates.
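The three clause types can be generated in a few lines. The sketch below is an illustration rather than the generator used for the experiments: the variable numbering in `var` is a hypothetical choice, and the exhaustive `satisfiable` helper is only a stand-in for a SAT solver on toy instances.

```python
from itertools import product


def var(i, color, k):
    """Hypothetical numbering: DIMACS variable for 'number i has this color' (1-based)."""
    return (i - 1) * k + color


def schur_cnf(n, k, optional=True):
    """Clauses encoding a k-coloring of 1..n without monochromatic a + b = c."""
    clauses = []
    for i in range(1, n + 1):
        # Positive clause: number i has at least one color.
        clauses.append([var(i, col, k) for col in range(1, k + 1)])
    for a in range(1, n + 1):
        for b in range(a, n - a + 1):
            for col in range(1, k + 1):
                # Negative clause: a, b, and a + b do not all share this color.
                # The set removes the duplicate literal when a == b.
                lits = {-var(a, col, k), -var(b, col, k), -var(a + b, col, k)}
                clauses.append(sorted(lits, key=abs))
    if optional:
        for i in range(1, n + 1):
            for c1 in range(1, k + 1):
                for c2 in range(c1 + 1, k + 1):
                    # Optional clause: number i has at most one color.
                    clauses.append([-var(i, c1, k), -var(i, c2, k)])
    return clauses


def satisfiable(clauses, num_vars):
    """Exhaustive check over all assignments; usable only for tiny formulas."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

With two colors, the formula for $n = 4$ is satisfiable and the one for $n = 5$ is not, matching $S(2) = 4$. Dropping the optional clauses does not change satisfiability, only the number of models.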

###### Example 1.

Formula consists of the following clauses:

The first line shows the positive clauses, the second and third line the negative clauses, and the last line the optional clauses. Notice that and are subsumed by and , respectively.

### Symmetry Breaking

A certificate symmetry for a Schur number problem is a mapping from any certificate onto another certificate of that problem. Schur number problems have the certificate symmetry that permutes the colors. Due to , SAT solvers would explore all color permutations when solving formulas . In the following, we describe how to fully and compactly break this symmetry by enforcing a lexicographical ordering on the colors [Crawford et al.1996].

Breaking the certificate symmetry for the first two colors is easy: We just assign the first color to number 1 and the second color to number 2. Adding the two corresponding unit clauses to the formula will enforce this. Note that the two numbers must be colored differently because of the equation $1 + 1 = 2$.

Breaking for the third color is more involved. At least one of the numbers , , and can have neither the first nor the second color due to . We break the symmetry of the third color as follows: If number has neither the first nor the second color, we color it with the third color. Otherwise, if number has neither the first nor the second color, we color number with the third color. Otherwise, we color number 5 with the third color. We picked number as starting point as it is more constrained due to the equation and the clause . It therefore allows a more compact symmetry-breaking predicate, which consists of the clauses , , , , , and . Finally, to distinguish between the fourth and the fifth color, we assign the fourth color to the first number that does not have the first, second, or third color. We encode this with clauses of the form . We require these clauses only for .

Generating the original formulas $F^k_n$ can be easily achieved with a dozen lines of code. In contrast, the addition of compact symmetry-breaking predicates is more complicated and may therefore result in errors. Let $R^k_n$ be the formula obtained from $F^k_n$ by adding symmetry-breaking predicates. To ensure correctness of the symmetry breaking, we constructed a proof, called the re-encoding proof, that the satisfiability of $F^5_{161}$ implies the satisfiability of $R^5_{161}$.

## Decision Heuristics

We used the cube-and-conquer method [Heule et al.2012] for SAT solving as it is arguably the most effective method for solving very hard combinatorial problems. This method was also used for solving the Erdős discrepancy problem [Konev and Lisitsa2015] and the Pythagorean triples problem [Heule, Kullmann, and Marek2016].

Cube-and-conquer is a hybrid parallel SAT solving paradigm that combines look-ahead techniques [Heule and van Maaren2009] with conflict-driven clause learning (CDCL) [Marques-Silva, Lynce, and Malik2009]: Look-ahead techniques are used for splitting a given problem into many (millions or even billions of) subproblems which are then solved with CDCL solvers. Since the subproblems are independent, they can be easily solved in parallel without requiring communication.

The aim of look-ahead techniques is to find variable assignments that simplify a formula as much as possible. This is achieved with so-called look-aheads: A look-ahead on a literal $l$ with respect to a formula $F$ first assigns $l$ to true and then simplifies $F$ to obtain a formula $F'$. After this, it determines a heuristic value by computing the “difference” between $F$ and $F'$ (details are given below). A variable $v$ is considered useful for splitting a formula if the look-aheads on both $v$ and $\bar{v}$ have a high heuristic value. Typically, look-ahead techniques select the variable $v$ for which the product of the heuristic values of $v$ and $\bar{v}$ is the largest.

The effectiveness of look-ahead heuristics depends on measuring the difference between the formula $F$ and the simplified formula $F'$. A reasonably effective measure, which is also easy to compute, is the difference in the number of variables: $|\mathit{var}(F)| - |\mathit{var}(F')|$. This measure is used in the cube-and-conquer solver Treengeling [Biere2013], which solved most benchmarks of the SAT Competition 2016 [Balyo, Heule, and Järvisalo2017]. An alternative, more costly measure considers the clauses that have been reduced, but not satisfied, during the simplification, i.e., the clauses in $F' \setminus F$. These clauses are typically assigned a weight, with shorter clauses getting a larger weight. During our initial experiments for solving Schur Number Five, we observed that the clause-based heuristic is much more effective than the variable-based one. However, our initial experiments (based on splitting the problem into millions of subproblems and solving randomly selected subproblems) indicated that finding the solution of Schur Number Five would require many decades of CPU time.

The key to reducing the computational effort of solving Schur Number Five is a new measurement method. We first discuss the main weakness of the weighted-sum heuristics before we describe our new method. Recall that the Schur number encoding uses positive clauses of length $k$ and negative clauses of length $3$. Thus, no matter on what literal we look ahead, most clauses in $F' \setminus F$ originate from negative clauses. Moreover, a clause in $F' \setminus F$ that originates from a negative clause has length $2$, while a clause in $F' \setminus F$ that originates from a positive clause can be larger. Commonly used heuristics favor shorter clauses and thus favor clauses that originate from negative clauses. Because of this, the heuristic value of look-aheads is dominated by reduced negative clauses. However, it appears that favoring reduced positive clauses is more effective.

###### Example 2.

Recall , but now without redundant clauses:

Let us look ahead on literal : Assigning variable to false satisfies and reduces to , thereby forcing the variable to true. This in turn reduces the negative clause to . The only clause that is reduced, but not satisfied, is . Hence, looking ahead on yields .

We now present our generalization of an effective heuristic for uniform random 3-SAT instances [Li1999] to arbitrary CNFs. Given a literal $l$ and a formula $F$, let $\mathit{occ}(F, l)$ denote the number of occurrences of $l$ in $F$. The weight of a clause $C$, denoted by $w(F, C)$, is computed as follows:

$$w(F, C) = \frac{\sum_{l \in C} \mathit{occ}(F, \bar{l})}{2^{|C|} \cdot |C|}$$

The $|C|$ in the denominator reduces the sum to the average, and the factor $2^{|C|}$ ensures a larger weight for shorter clauses. We noticed that the sum works much better than the product for arbitrary CNFs, in contrast to random 3-SAT formulas [Dubois and Dequen2001]. The heuristic value of a variable $v$ w.r.t. a formula $F$, denoted by $H(F, v)$, is computed as follows (with $F'$ and $F''$ referring to the formulas obtained by look-aheads on $F$ with the literals $v$ and $\bar{v}$, respectively):

$$H(F, v) = \Big(\sum_{C \in F' \setminus F} w(F, C)\Big) \cdot \Big(\sum_{C \in F'' \setminus F} w(F, C)\Big)$$

In each node of the search tree, the variable with the highest heuristic value is selected as splitting variable.
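The clause weight and the heuristic value can be computed directly from their definitions. The following sketch assumes a formula is a frozenset of frozensets of integer literals without unit clauses, uses plain unit propagation as the simplification step, and scores a variable with a failed look-ahead as 0 (a real look-ahead solver would instead assign the opposite literal). The function names are illustrative.

```python
def occ(formula, lit):
    """Number of clauses of the formula in which the literal occurs."""
    return sum(1 for clause in formula if lit in clause)


def weight(formula, clause):
    """w(F, C): summed occurrences of complements, divided by 2^|C| * |C|."""
    total = sum(occ(formula, -lit) for lit in clause)
    return total / (2 ** len(clause) * len(clause))


def look_ahead(formula, lit):
    """Assign lit to true and unit propagate; returns reduced clauses or None on conflict."""
    clauses = set(formula)
    assigned = set()
    pending = [lit]
    while pending:
        l = pending.pop()
        if -l in assigned:
            return None
        if l in assigned:
            continue
        assigned.add(l)
        reduced = set()
        for c in clauses:
            if l in c:
                continue  # satisfied clause disappears
            if -l in c:
                c = c - {-l}
                if not c:
                    return None  # falsified clause
                if len(c) == 1:
                    pending.append(next(iter(c)))
            reduced.add(c)
        clauses = reduced
    return clauses


def heuristic(formula, v):
    """H(F, v): product of the summed weights of newly reduced clauses on both branches."""
    scores = []
    for lit in (v, -v):
        reduced = look_ahead(formula, lit)
        if reduced is None:
            return 0.0  # simplification in this sketch, see lead-in
        scores.append(sum(weight(formula, c) for c in reduced - formula))
    return scores[0] * scores[1]


def select_variable(formula):
    """Pick the variable with the highest heuristic value."""
    variables = sorted({abs(lit) for clause in formula for lit in clause})
    return max(variables, key=lambda v: heuristic(formula, v))
```

Note that the weights are always taken with respect to the unreduced formula $F$, as in the definition of $w(F, C)$, while the set of clauses being summed comes from the reduced formula.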

## Partitioning

A crucial part of solving Schur Number Five is the partitioning of the propositional formulas $R^5_{160}$ and $R^5_{161}$ into millions of easy subproblems. We use the former formula to compute all extreme certificates and the latter formula for the upper bound result. We constructed a single partition for both formulas. A partition is a set of cubes (or equivalently, a set of variable assignments). The disjunction of cubes in a partition must be a tautology in order to ensure that the cubes cover the entire search space. By applying a cube to a formula, one obtains a subproblem of that formula. Each of the subproblems arising from a partition can be solved in parallel, thereby allowing massively parallel computation. Moreover, these subproblems are partitioned again to solve them more efficiently (on a single core).

The top-level partition is constructed as follows: We use the look-ahead decision heuristic described above to build a binary search tree over the space of possible variable assignments. In this tree, every non-leaf node is assigned a splitting variable. The left outgoing edge of a node assigns its variable to true while the right one assigns it to false. Each node in the tree represents the variable assignment corresponding to all assignments on the path from the root node. In case the formula in a node (i.e., the formula obtained from the original formula by applying the assignment represented by that node) becomes “easy”, we stop splitting. The partition consists of all assignments that are represented by the leaf nodes of the tree. We require a measure that captures the hardness of a formula in each node. A rough measure suffices here, since we will fine-tune this partition later. We observed that the number of binary clauses in a formula is a reasonable measure for the hardness of Schur number subproblems. The more binary clauses, the more constrained the subproblem (and thus easier to solve). For example, stopping with the splitting as soon as a formula in a node has more than binary clauses results in about millions of mostly easy subproblems of and .
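The tree construction can be sketched as a short recursion. This is a simplified illustration under several assumptions: the splitting variable is simply the smallest remaining variable (a placeholder for the look-ahead heuristic), no unit propagation is performed, and the threshold `max_binary` is a hypothetical parameter name. Clauses are frozensets of integer literals.

```python
from itertools import product


def apply_literal(clauses, lit):
    """Assign lit to true: drop satisfied clauses and shorten the others."""
    return [c - {-lit} for c in clauses if lit not in c]


def build_partition(clauses, max_binary, cube=()):
    """Return the cubes at the leaves of the splitting tree."""
    binary = sum(1 for c in clauses if len(c) == 2)
    solved = not clauses or any(len(c) == 0 for c in clauses)
    if solved or binary > max_binary:
        return [cube]  # leaf: trivially decided or considered easy enough
    v = min(abs(lit) for c in clauses for lit in c)  # placeholder split choice
    left = build_partition(apply_literal(clauses, v), max_binary, cube + (v,))
    right = build_partition(apply_literal(clauses, -v), max_binary, cube + (-v,))
    return left + right


def covers_search_space(cubes, num_vars):
    """Check by enumeration that every total assignment extends exactly one cube."""
    for bits in product([False, True], repeat=num_vars):
        hits = sum(all(bits[abs(l) - 1] == (l > 0) for l in cube) for cube in cubes)
        if hits != 1:
            return False
    return True
```

Because the cubes are read off the leaves of a binary tree, they are pairwise disjoint and jointly exhaustive by construction; `covers_search_space` merely confirms this on small examples.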

### The Hidden Strength of Cube-and-Conquer

Cube-and-conquer is not only useful for partitioning a hard problem into many subproblems that can be solved in parallel, but also to boost performance of solving a problem on a single core. Let be the number of cubes in a partition. A low value of indicates that the problem is split into a low number of subproblems, meaning that it is mainly solved with CDCL ( means pure CDCL) while a larger value indicates a more extensive splitting based on look-aheads.

If we experiment with different values for when trying to solve a problem on a single core, we can observe an interesting pattern: For low values of , an increase of leads to an increase of the total runtime—apparently some subproblems are about as hard as the original one. If we increase further, the total runtime starts to decrease and at some point it can even become significantly smaller compared to solving the problem with CDCL alone (again running both on a single core). Yet when becomes really large, the runtime increases again. At this point, splitting starts to dominate the total costs. Figure 2 shows this pattern on a subproblem of , where the optimal value for is around .

We developed a mechanism that approximates the optimal number of cubes to realize fast performance. The mechanism stops splitting if the number of remaining variables in a node drops below the value of parameter $\delta$. The number of remaining variables is a useful measure as it strictly decreases, whereas the number of binary clauses can oscillate. Initial experiments showed that such oscillation can slow down the solver on easy problems. We initialize $\delta$ to $0$, meaning that we keep splitting until $\delta$ is increased. We only increase $\delta$ when look-ahead techniques can refute a node, which naturally terminates splitting. In this case, $\delta$ is set to the number of remaining variables in that node. The increase is motivated as follows: If look-ahead techniques can refute a node, then we expect CDCL to refute that node (as well as similar nodes) more efficiently.

The value of $\delta$ is decreased in each node of the search tree to ensure that look-ahead techniques refute nodes once in a while. We experimented with various methods to implement the decrement and observed that the size of the decrease should be related to the depth of a node in the search tree. The closer a node is to the root, the larger the decrement. More specifically, we used the following update function (with $d$ referring to the depth of the node that performs the update, parameter $e$ referring to the down exponent, and parameter $f$ referring to the down factor):

$$\delta := \delta \left(1 - f^{d^{e}}\right)$$

If the value of $f$ is close to $0$, then $\delta$ climbs to a value at which look-ahead techniques will rarely refute a node. On the other hand, if the value of $f$ is close to $1$, then $\delta$ drops quickly to $0$, so that practically all leaf nodes will be refuted by look-ahead techniques. The value of $e$ determines the influence of the depth. If $e$ is close to $0$, then the depth is ignored during the update, while if $e$ is close to $1$ (or even larger), then the depth is dominant. During our experiments, various combinations of values of $f$ and $e$ resulted in strong performance. Examples of effective values are and , and , and and . For the final experiments we used and .
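The cutoff mechanism amounts to a few lines of bookkeeping. The sketch below assumes the update reads $\delta := \delta(1 - f^{d^e})$ with depth $d$, down factor $f$, and down exponent $e$; the class and method names are hypothetical, and the parameter values in the test are arbitrary rather than the ones used in the experiments.

```python
class CutoffState:
    """Tracks the threshold delta that decides when look-ahead splitting stops."""

    def __init__(self, down_factor, down_exponent):
        self.delta = 0.0  # initially 0: keep splitting until a node is refuted
        self.down_factor = down_factor
        self.down_exponent = down_exponent

    def should_stop(self, remaining_vars):
        # Stop splitting once the node has fewer remaining variables than delta.
        return remaining_vars < self.delta

    def on_refuted(self, remaining_vars):
        # A node refuted by look-ahead raises delta to that node's variable count.
        self.delta = max(self.delta, float(remaining_vars))

    def on_visit(self, depth):
        # Assumed update: delta := delta * (1 - f ** (d ** e)).
        shrink = self.down_factor ** (depth ** self.down_exponent)
        self.delta *= 1.0 - shrink
```

With a down exponent of 0 the term `depth ** 0` is always 1, so every node shrinks $\delta$ by the same factor regardless of depth. With a larger exponent, nodes near the root shrink $\delta$ much more than deep nodes.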

### Hardness Predictor and Partition Balancing

Most modern microprocessors used in clusters, including the Intel Xeon chips we used, have many CPU cores yet relatively little memory—at least for our application, which runs many SAT solvers and theorem provers in parallel. A major challenge was technical in nature: maximizing the CPU usage (with hyper-threading) without running out of memory.

Although most subproblems generated in the top-level partition could be solved within reasonable time (less than two minutes) some subproblems required hours of computation. Moreover, solving these hard subproblems required disproportionally more memory. Solving a few hard problems on the same chip at the same time could kill all threads. We therefore fine-tuned the initial partition.

We used the mechanism to partition subproblems as a hardness predictor of subproblems by using a high down exponent and a small down factor: and . Using a small down factor boosts the partitioning runtime, but produces typically too few cubes. However, here we only care about the runtime. It turned out that the runtime of partitioning with and is larger than a second for hard subproblems and significantly smaller for the easier ones. We extended the partition by splitting subproblems if this hardness predictor took over a second. Splitting was continued until none of the subproblems was predicted to be hard. To limit the size of the partition, we merged two cubes if they had the same parent node and the sum of their hardness predictor times was less than second.

### Solving Subproblems

Our top-level partition of and , denoted by , consists of cubes after partition balancing. Figure 3 (left) shows a histogram of the size of the cubes in . The smallest cube has size while the largest cube has length , showing that the binary tree associated with the cubes is quite unbalanced. Notice that the size of most cubes is in the range from to , which is a large interval.

Figure 3 (middle) shows a histogram of the number of binary clauses in subproblems, i.e., the resulting formulas after applying the cubes. Notice that the interval here is small: Most subproblems have between and binary clauses. As stated earlier, the number of binary clauses is a useful rough measure for the hardness of subproblems.

For each cube $\alpha$ in the top-level partition, we solved the problem $R^5_{160} \land \alpha$ using our cube-and-conquer solver consisting of a modified version of Marchcu [Heule et al.2012] as look-ahead (cube) solver and Glucose 3.0 [Audemard and Simon2009] as CDCL (conquer) solver. (The tools and proof parts presented in this paper are available at https://www.cs.utexas.edu/~marijn/Schur/.) The cube solver modifications consist of integrating the presented decision heuristic and replacing the cutoff procedure by the presented down factor mechanism. In case $R^5_{160} \land \alpha$ was unsatisfiable, we stored the proof of unsatisfiability, which is also a proof of unsatisfiability of $R^5_{161} \land \alpha$. There were only 961 cubes for which $R^5_{160} \land \alpha$ turned out to be satisfiable. For those cubes, we computed the proof of unsatisfiability of $R^5_{161} \land \alpha$. These proofs together form the implication proof.

We solved the subproblems on the Lonestar cluster of the Texas Advanced Computing Center (TACC). Each compute node consists of a Xeon E5-2690 v3 (Haswell) chip with cores running on GHz. Hyper-threading was enabled, resulting in logical CPUs per node. We ran the experiments on nodes in parallel, resulting in running copies of our cube-and-conquer solver in parallel. The total runtime was roughly CPU hours for the partition phase and roughly CPU hours for the conquer phase. The total costs to compute Schur Number Five was just over CPU years, but less than three days in wall-clock time on the Lonestar cluster. Figure 3 (right) shows a histogram of the runtimes (rounding times to the nearest seconds). A large fraction of the subproblems can be solved within to seconds. Most subproblems are solvable within two minutes and few are somewhat harder. The subproblems were partitioned into a total of billion cubes and the number of conflict clauses added in the conquer phase was trillion.

The computation of Schur Number Five and the Pythagorean triples problem differ in the balance between the partition phase and the conquer phase. For Schur Number Five, almost of the computation was devoted to the conquer phase, while for the Pythagorean triples problem this was only

. This difference can be explained as follows: For both problems, a heuristic was chosen to continue splitting until the total runtime would start to increase (based on the solving time of randomly selected subproblems). In the case of Schur Number Five, this point is reached earlier. The Pythagorean triples problem was solved on the older Stampede cluster of TACC, which hinders a clean runtime comparison. We estimate that the conquer phase of solving Schur Number Five required about ten times more computation resources than solving the Pythagorean triples problem.

## No Backbone, but Backdoors

The backbone of a CNF formula is the set of literals that are assigned to true in all satisfying total assignments. Many formulas that encode the existence of extreme certificates of problems in Ramsey Theory [Graham, Rothschild, and Spencer1990], such as the van der Waerden numbers [Kouril and Paul2008] and the Pythagorean triples problem [Heule, Kullmann, and Marek2016], have large backbones after symmetry breaking—even if the number of satisfying total assignments is enormous. However, the backbone of consists only of the literals that are assigned by the symmetry-breaking predicates.

The lack of a substantial backbone suggests that there may exist a symmetry that is not broken. It turns out that palindromic Schur number problems have certificate symmetry that maps each number onto with being the size of the certificate and any number that is relatively prime to  [Fredricksen and Sweet2000]. However, is not a certificate symmetry of classic Schur number problems. The size of the backbone can therefore be explained by the equivalence and the certificate symmetry of palindromic Schur number problems.

Although the backbone of is small, we observed that there are several backdoors to large clusters of solutions. A backdoor of a CNF formula is a partial assignment such that can be solved efficiently (in polynomial time) using a given algorithm [Williams, Gomes, and Selman2003]. We selected subsumption elimination (SE) [Eén and Biere2005] followed by blocked clause elimination (BCE) [Järvisalo, Biere, and Heule2012] as the basis for backdoors of . Both SE and BCE are confluent and run in polynomial time. BCE removes blocked clauses until fixpoint. BCE solves a given formula if and only if the fixpoint is the empty formula, which is trivially satisfiable.

We computed the backdoors as follows: We started with the top-level cubes (assignments) under which is satisfiable. For each such assignment , we computed the backbone of . The backbone can be computed using various SAT calls: Literal belongs to the backbone if and only if is unsatisfiable. In most cases, the backbone assignment turned out to be a backdoor of . In the remaining cases, we extended the backbone assignments using look-aheads until they became backdoors. This procedure resulted in backdoors of , which—by construction—cover all satisfying assignments.
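The backbone test by repeated SAT calls can be illustrated on a toy formula. In this sketch the exhaustive `satisfiable` function stands in for a real SAT solver, clauses are lists of integer literals, and the input formula is assumed to be satisfiable (otherwise every literal would pass the test).

```python
from itertools import product


def satisfiable(clauses, num_vars):
    """Exhaustive stand-in for a SAT solver; usable only for tiny formulas."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


def backbone(clauses, num_vars):
    """Literals that are true in every satisfying assignment of a satisfiable formula."""
    result = []
    for v in range(1, num_vars + 1):
        for lit in (v, -v):
            # lit is in the backbone iff adding its complement makes the formula unsatisfiable.
            if not satisfiable(clauses + [[-lit]], num_vars):
                result.append(lit)
    return result
```

For the formula with clauses `[1]`, `[-1, 2]`, and `[3, 4]`, the backbone consists of the literals 1 and 2, while variables 3 and 4 remain free.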

We used these backdoors to compute all extreme certificates with the sharpSAT solver [Thurley2006] in a few seconds. Out of these extreme certificates, are modular and of the modular ones, are palindromes. The latter number has been conjectured before [Fredricksen and Sweet2000]. (The paper [Fredricksen and Sweet2000] states that the number of palindromic extreme certificates is . However, the described method produces such certificates.)

## Correctness

We produced and certified a proof of unsatisfiability of $F^5_{161}$ to increase confidence in the correctness of our result. The proof consists of three parts: the re-encoding proof, the implication proof, and the tautology proof. The re-encoding proof shows the correctness of our symmetry-breaking technique, i.e., that the satisfiability of $F^5_{161}$ implies the satisfiability of the re-encoded formula $R^5_{161}$. The implication proof includes, for each cube $\alpha$ in the top-level partition $P_5$, a proof of unsatisfiability of $R^5_{161} \land \alpha$. This shows that $R^5_{161}$ implies each clause $\bar{\alpha}$ with $\alpha \in P_5$. The tautology proof shows that the disjunction of cubes is a tautology, i.e., the cubes together cover the entire search space. Let $\overline{P_5}$ denote the negation of $P_5$, i.e., the CNF formula that has each clause $\bar{\alpha}$ with $\alpha \in P_5$. The disjunction of cubes is a tautology if and only if $\overline{P_5}$ is unsatisfiable. We show unsatisfiability of $F^5_{161}$ via

 \mathrlapre-encoding proofF5161 ⊨wR5161 F5161 ⊨w\mathrlapR5161 ⊨ ¯¯¯¯¯¯¯P5 implication proofR5161 ⊨tautology proof ¯¯¯¯¯¯¯P5 ⊨⊥

The proof parts have been constructed in the DRAT format, which facilitates expressing techniques that remove satisfying assignments, such as symmetry breaking [Heule, Hunt Jr., and Wetzler2015]. Recent progress in verified proof checking [Cruz-Filipe, Marques-Silva, and Schneider-Kamp2017] reduced proof validation costs such that they became comparable to the costs of solving. We converted the DRAT proofs into LRAT proofs, a new format that was recently introduced to allow efficient proof checking using a theorem prover [Cruz-Filipe et al.2017]. We used a verified LRAT proof-checker [Heule et al.2017], written in the language of the ACL2 theorem proving system [Kaufmann and Moore1997], and applied it to certify these LRAT proofs.

Only the encoding of the Schur Number Five problem into propositional logic (i.e., the generation of ) was not checked using a theorem prover. We chose to skip verification of this part because the encoding can be implemented using a dozen lines of straightforward C code.
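
To give an impression of how small such an encoding is, here is a sketch in Python rather than C. It is not the generator used for the reported results: the variable numbering and function names are assumptions, and the optional clauses that forbid assigning two colors to the same number are omitted. The brute-force check is included only to try the encoding on the two-color example from the introduction.

```python
from itertools import product


def var(i, c, k):
    """Hypothetical numbering: variable for 'number i has color c' (1-based)."""
    return (i - 1) * k + c


def schur_cnf(n, k):
    """Clauses stating that 1..n can be k-colored without a monochromatic
    solution of a + b = c."""
    clauses = []
    # Positive clauses: every number gets at least one color.
    for i in range(1, n + 1):
        clauses.append([var(i, c, k) for c in range(1, k + 1)])
    # Negative clauses: a, b, and a + b must not all share a color.
    # For a == b the set collapses the clause to a binary one.
    for a in range(1, n + 1):
        for b in range(a, n - a + 1):
            for c in range(1, k + 1):
                lits = {-var(a, c, k), -var(b, c, k), -var(a + b, c, k)}
                clauses.append(sorted(lits))
    return clauses


def is_satisfiable(clauses, num_vars):
    """Brute-force check; only feasible for very small n and k."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            return True
    return False
```

With two colors, `is_satisfiable(schur_cnf(4, 2), 8)` holds while `is_satisfiable(schur_cnf(5, 2), 10)` does not, matching the two-color argument given in the introduction.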

#### Re-encoding Proof.

The purpose of the re-encoding proof is to express our symmetry-breaking techniques—used for breaking the color symmetry —in the DRAT proof system. We did this using an existing method [Heule, Hunt Jr., and Wetzler2015]: For each clause in a symmetry-breaking predicate, the method adds a new definition stating that if the lexicographical ordering is violated, two colors are swapped. However, the unit clauses and cannot be added immediately as enforcing the lexicographical order for the first two colors requires multiple swaps. Instead of the unit clause , we learn , , , and which together imply . The case of is similar.

For example, the first unit clause in the re-encoding proof is , which is learned as follows. Any assignment that assigns to true, violates the lexicographical ordering. In particular it violates as to true forces to false via the optional clause . We add constraints to the formula stating that if is assigned to true, then every variable is swapped with and the other way around. Expressing this swap using DRAT steps requires introducing auxiliary variables. Afterwards the unit clause is implied.

The re-encoding proof is megabytes in size (uncompressed DRAT) and consists of almost a million clause addition steps and a similar number of clause deletion steps. That is reasonably large considering that it only breaks the color symmetry . However, compared to the implication proof (discussed below) the size is negligible.

#### Implication Proof.

We proved that is unsatisfiable by showing that there exists a formula, in our case , such that (i) every clause in the formula is logically implied by , and (ii) the formula can be easily shown to be unsatisfiable. The implication proof includes, for each cube , a proof of unsatisfiability of . The size of the implication proof is petabytes in the compressed DRAT format produced by Glucose and petabytes in the compressed LRAT format produced by the DRAT-trim proof checker. The latter format is used by the formally verified checker. Lightweight proof compression shrinks DRAT proofs of Schur number problems to approximately of their size, while LRAT proofs are reduced to about of their size. The difference in compression effectiveness arises because DRAT proofs of Schur number problems contain many small numbers, while LRAT proofs contain large numbers. As a comparison, the proof of the Pythagorean triples problem is 200 terabytes in the uncompressed DRAT format [Heule, Kullmann, and Marek2016]. Based on the DRAT compression rate, the Schur Number Five proof is about ten times as large as the Pythagorean triples proof in the same format. Producing the compressed LRAT proof required almost CPU years while certifying it required another CPU years.

#### Tautology Proof.

The tautology proof describes that the disjunction of cubes is a tautology, i.e., that the cubes cover the entire search space. We showed this by proving that is unsatisfiable. The cubes produced by our partition method form a binary tree of assignments by construction. The tautology proof consists of resolution steps, each time resolving two clauses whose corresponding cubes have the same parent node in the binary tree. The size of formula is gigabyte and the size of the tautology proof is gigabytes in the uncompressed DRAT format.
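
The sibling-resolution argument can be sketched as follows. This is a toy illustration under the assumption that the cubes are the leaves of a binary splitting tree; the function names are hypothetical. Each cube is negated into a clause, and two clauses that differ only in the sign of one literal are resolved into their common parent until the empty clause appears.

```python
def find_sibling_pair(clauses):
    """Find two clauses that agree except for the sign of a single literal."""
    for c in clauses:
        for lit in c:
            d = (c - {lit}) | {-lit}
            if d in clauses:
                return c, d, lit
    return None


def covers_search_space(cubes):
    """Return (derived_empty_clause, number_of_resolution_steps).

    A False result only means that sibling resolution did not reach the
    empty clause; for cubes that do not stem from a binary tree this
    simple procedure is incomplete."""
    clauses = {frozenset(-lit for lit in cube) for cube in cubes}
    steps = 0
    while frozenset() not in clauses:
        pair = find_sibling_pair(clauses)
        if pair is None:
            return False, steps
        c, d, lit = pair
        clauses -= {c, d}
        clauses.add(c - {lit})  # resolvent corresponds to the parent node
        steps += 1
    return True, steps
```

For the four cubes `(1, 2)`, `(1, -2)`, `(-1, 3)`, and `(-1, -3)`, three resolution steps derive the empty clause. If the last cube is dropped, only one step is possible and the empty clause is not reached.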

### Certifying the Proof

The size of the proof demands a parallel certification approach and the storage of intermediate results. Below we describe our method, which relies on widely used tools.

• : We provided the ACL2 theorem prover with the formulas , , and the re-encoding proof. After validating this proof, it returns the parsed formulas and and a verified statement that . Correctness of the parsing is checked using the Unix tool diff by comparing with and with .

• : We check that every clause is implied by . The theorem prover receives , , and a proof of unsatisfiability of . The theorem prover returns the parsed formula , parsed clause , and a statement that . Again diff is used to check the equivalence of the formulas and . Clause is stored for the next step.

• : We construct by concatenating all clauses implied by in the prior step, simply using the Unix tool cat. The theorem prover is provided with and a proof of its unsatisfiability, and proves that the parsed formula is unsatisfiable. The last check, again using diff, validates that equals the stored formula .

## Conclusions and Future Work

We proved that Schur number five is 160 using massively parallel SAT solving. To achieve this result, we designed powerful look-ahead heuristics and developed a cheap hardness predictor to partition a hard problem into millions of manageable subproblems. These subproblems were solved using our cube-and-conquer solver. The resulting proof is over two petabytes in size in a compressed format. We certified the correctness of the proof using the ACL2 theorem proving system. Given the enormous size of the proof, we argue that any result produced by SAT solvers can now be validated using highly trustworthy systems with reasonable overhead.

A century after Issai Schur proved the existence of Schur numbers, we now know the value of the first five. Determining Schur number six will be extremely challenging and might be beyond any computational method. A more realistic problem is the computation of the fifth weak Schur number . Just a few years ago, it was shown that  [Eliahou et al.2012], while it has been conjectured since the 1950s that  [Walker1952]. This appears relatively close to the value of . However, we expect the corresponding propositional formula to be much harder to solve due to the lack of binary negative clauses in the encoding of weak Schur numbers.

## Acknowledgements

The author is supported by NSF under grant CCF-1526760 and by AFRL Award FA8750-15-2-0096. The author thanks Benjamin Kiesl, Jasmin Blanchette, Matt Kaufmann, Armin Biere, Victor Marek, Scott Aaronson, and the anonymous reviewers for their valuable input to improve the quality of the paper. The author acknowledges the Texas Advanced Computing Center (TACC) at the University of Texas at Austin for providing grid resources that have contributed to the research results reported within this paper.

## References

• [Abbott and Wang1977] Abbott, H. L., and Wang, E. T. H. 1977. Sum-free sets of integers. Proceedings of the American Mathematical Society 67:11–16.
• [Audemard and Simon2009] Audemard, G., and Simon, L. 2009. Predicting learnt clauses quality in modern SAT solvers. In 21st International Joint Conference on Artificial Intelligence, 399–404.
• [Balyo, Heule, and Järvisalo2017] Balyo, T.; Heule, M. J. H.; and Järvisalo, M. 2017. SAT competition 2016: Recent developments. In AAAI 2017, 5061–5063.
• [Biere et al.2009] Biere, A.; Heule, M. J. H.; van Maaren, H.; and Walsh, T., eds. 2009. Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press.
• [Biere2013] Biere, A. 2013. Lingeling, Plingeling and Treengeling entering the SAT competition 2013. Proceedings of SAT Competition 2013, 51.
• [Blanchard, Harary, and Reis2006] Blanchard, P. F.; Harary, F.; and Reis, R. 2006. Partitions into sum-free sets. Integers 6. #A07.
• [Crawford et al.1996] Crawford, J. M.; Ginsberg, M. L.; Luks, E. M.; and Roy, A. 1996. Symmetry-breaking predicates for search problems. In KR 1996, 148–159. Morgan Kaufmann.
• [Cruz-Filipe et al.2017] Cruz-Filipe, L.; Heule, M. J. H.; Hunt Jr., W. A.; Kaufmann, M.; and Schneider-Kamp, P. 2017. Efficient certified RAT verification. In CADE 26, 220–236. Springer.
• [Cruz-Filipe, Marques-Silva, and Schneider-Kamp2017] Cruz-Filipe, L.; Marques-Silva, J. P.; and Schneider-Kamp, P. 2017. Efficient certified resolution proof checking. In TACAS 2017, 118–135. Springer.
• [Dubois and Dequen2001] Dubois, O., and Dequen, G. 2001. A backbone-search heuristic for efficient solving of hard 3-SAT formulae. In 17th International Joint Conference on Artificial Intelligence – IJCAI’01, 248–253. Morgan Kaufmann.
• [Eén and Biere2005] Eén, N., and Biere, A. 2005. Effective preprocessing in SAT through variable and clause elimination. In 8th International Conference on Theory and Applications of Satisfiability Testing – SAT 2005, 61–75. Springer.
• [Eliahou et al.2012] Eliahou, S.; Marín, J.; Revuelta, M.; and Sanz, M. 2012. Weak Schur numbers and the search for G.W. Walker’s lost partitions. Computers & Mathematics with Applications 63(1):175–182.
• [Exoo1994] Exoo, G. 1994. A lower bound for Schur numbers and multicolor Ramsey numbers of $K_3$. The Electronic Journal of Combinatorics 1. #R8.
• [Fredricksen and Sweet2000] Fredricksen, H., and Sweet, M. M. 2000. Symmetric sum-free partitions and lower bounds for Schur numbers. Electronic Journal of Combinatorics 7. #R32.
• [Golomb and Baumert1965] Golomb, S. W., and Baumert, L. D. 1965. Backtrack programming. Journal of the ACM 12(4):516–524.
• [Graham, Rothschild, and Spencer1990] Graham, R. L.; Rothschild, B. L.; and Spencer, J. H. 1990. Ramsey Theory, 2nd Edition. Wiley.
• [Greenwood and Gleason1955] Greenwood, R. E., and Gleason, A. M. 1955. Combinatorial relations and chromatic graphs. Canadian Journal of Mathematics 7:1–7.
• [Guy1994] Guy, R. K. 1994. Unsolved problems in number theory; 2nd ed. Problem Books in Mathematics Unsolved Problems in Intuitive Mathematics. New York: Springer.
• [Heule and van Maaren2009] Heule, M. J. H., and van Maaren, H. 2009. Look-ahead based SAT solvers. In Biere et al. 2009, chapter 5, 155–184.
• [Heule et al.2012] Heule, M. J. H.; Kullmann, O.; Wieringa, S.; and Biere, A. 2012. Cube and conquer: Guiding CDCL SAT solvers by lookaheads. In 7th International Haifa Verification Conference – HVC 2011, 50–65. Springer.
• [Heule et al.2017] Heule, M. J. H.; Hunt Jr., W. A.; Kaufmann, M.; and Wetzler, N. D. 2017. Efficient, verified checking of propositional proofs. In ITP 2017, 269–284. Springer.
• [Heule, Hunt Jr., and Wetzler2015] Heule, M. J. H.; Hunt Jr., W. A.; and Wetzler, N. D. 2015. Expressing symmetry breaking in DRAT proofs. In CADE 25, 591–606. Springer.
• [Heule, Kullmann, and Marek2016] Heule, M. J. H.; Kullmann, O.; and Marek, V. W. 2016. Solving and verifying the Boolean Pythagorean Triples problem via Cube-and-Conquer. In Theory and Applications of Satisfiability Testing – SAT 2016, 228–245. Springer.
• [Irving1973] Irving, R. W. 1973. An extension of Schur’s theorem on sum-free partitions. Acta Arithmetica 25:55–64.
• [Järvisalo, Biere, and Heule2012] Järvisalo, M.; Biere, A.; and Heule, M. J. H. 2012. Simulating circuit-level simplifications on CNF. Journal of Automated Reasoning 49(4):583–619.
• [Järvisalo, Heule, and Biere2012] Järvisalo, M.; Heule, M. J. H.; and Biere, A. 2012. Inprocessing rules. In IJCAR 2012, 355–370. Springer.
• [Kaufmann and Moore1997] Kaufmann, M., and Moore, J S. 1997. An industrial strength theorem prover for a logic based on common lisp. IEEE Transactions on Software Engineering 23(4):203–213.
• [Konev and Lisitsa2015] Konev, B., and Lisitsa, A. 2015. Computer-aided proof of Erdős discrepancy properties. Artificial Intelligence 224(C):103–118.
• [Kouril and Paul2008] Kouril, M., and Paul, J. L. 2008. The van der Waerden number $W(2,6)$ is 1132. Experimental Mathematics 17(1):53–61.
• [Lamb2016] Lamb, E. 2016. Maths proof smashes size record: Supercomputer produces a 200-terabyte proof – but is it really mathematics? Nature 534:17–18.
• [Lammich2017] Lammich, P. 2017. Efficient verified (UN)SAT certificate checking. In CADE 26, 237–254. Springer.
• [Li1999] Li, C. M. 1999. A constraint-based approach to narrow search trees for satisfiability. Information processing letters 71(2):75–80.
• [Marques-Silva, Lynce, and Malik2009] Marques-Silva, J. P.; Lynce, I.; and Malik, S. 2009. Conflict-driven clause learning SAT solvers. In Biere et al. 2009, chapter 4, 131–153.
• [Schur1917] Schur, I. 1917. Über die Kongruenz $x^m + y^m \equiv z^m \pmod{p}$. Jahresbericht der Deutschen Mathematikervereinigung 25:114–117.
• [Thurley2006] Thurley, M. 2006. sharpSAT - counting models with advanced component caching and implicit BCP. In SAT 2006, volume 4121 of LNCS, 424–429. Springer.
• [Walker1952] Walker, G. 1952. A problem in partitioning. American Mathematical Monthly 59:253.
• [Williams, Gomes, and Selman2003] Williams, R.; Gomes, C. P.; and Selman, B. 2003. Backdoors to typical case complexity. In 18th International Joint Conference on Artificial Intelligence, 1173–1178.