# Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks

In many cases, verifying real-world programs requires inferring loop invariants with nonlinear constraints. This is especially true in programs that perform many numerical operations, such as control systems for avionics or industrial plants. Recently, data-driven methods for loop invariant inference have gained popularity, especially for linear loop invariants. However, applying data-driven inference to nonlinear invariants is challenging due to the large number and large magnitude of high-order terms, the potential for overfitting on samples, and the large space of possible nonlinear inequality bounds. In this paper, we introduce a new neural architecture for general SMT learning, the Gated Continuous Logic Network (G-CLN), and apply it to nonlinear loop invariant learning. G-CLNs extend the Continuous Logic Network architecture with gating units and dropout, which allow the model to robustly learn general invariants over large numbers of terms. To address overfitting that arises from finite program sampling, we introduce fractional sampling, a sound relaxation of loop semantics to continuous functions that facilitates unbounded sampling on the real domain. We also design a new CLN activation function, the Piecewise Biased Quadratic Unit (PBQU), for naturally learning tight inequality bounds. We incorporate these methods into a nonlinear loop invariant inference system that can learn general nonlinear loop invariants. We evaluate our system on a benchmark of nonlinear loop invariants and show it solves 26 out of 27 problems, 3 more than prior work, with an average runtime of 53.3 seconds. We further demonstrate the generic learning ability of G-CLNs by solving all 124 problems in the linear Code2Inv benchmark. We also perform a quantitative stability evaluation and show G-CLNs have a convergence rate of 97.5% on quadratic problems, a 39.2% improvement over CLN models.

## Authors

Jianan Yao, et al. (Extended Version, 03/17/2020)

## 1. Proof for Theorem 3.1

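Property 1. $\forall t\, u,\ (t > 0) \wedge (u > 0) \implies (t \otimes u) > 0$.

Theorem 3.1. For a gated CLN model $M$ with input nodes $\{x_1, \ldots, x_n\}$ and output node $p$, if all gating parameters $\{g_i\}$ are either 0 or 1, then using the formula extraction algorithm, the recovered SMT formula $F$ is equivalent to the gated CLN model $M$. That is, $\forall x \in \mathbb{R}^n$,

$$F(x) = \mathrm{True} \iff \lim_{\substack{c_1 \to 0,\ c_2 \to \infty \\ \sigma \to 0,\ \epsilon \to 0}} M(x; c_1, c_2, \sigma, \epsilon) = 1 \tag{1}$$

$$F(x) = \mathrm{False} \iff \lim_{\substack{c_1 \to 0,\ c_2 \to \infty \\ \sigma \to 0,\ \epsilon \to 0}} M(x; c_1, c_2, \sigma, \epsilon) = 0 \tag{2}$$

as long as the t-norm in $M$ satisfies Property 1.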

###### Proof.

We prove this by induction.

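T-norm Case. If $p = T_G(M_1, \ldots, M_n; g_1, \ldots, g_n)$, which means the final operation in $M$ is a gated t-norm, we know that for each submodel $M_1, \ldots, M_n$ the gating parameters are all either 0 or 1. According to the induction hypothesis, for each $M_i$, using Algorithm 1, we can extract an equivalent SMT formula $F_i$ satisfying Eq. (1)(2). Now we prove that the full model $M$ and the extracted formula $F$ satisfy Eq. (1). The proof for Eq. (2) is similar and omitted for brevity. For simplicity we use $M(x)$ to denote $\lim_{c_1 \to 0,\ c_2 \to \infty,\ \sigma \to 0,\ \epsilon \to 0} M(x; c_1, c_2, \sigma, \epsilon)$.

Now the proof goal becomes $F(x) = \mathrm{True} \iff M(x) = 1$, and the induction hypothesis is

$$\forall i \in \{1, \ldots, n\},\quad F_i(x) = \mathrm{True} \iff M_i(x) = 1 \tag{3}$$

From lines 2-5 in Algorithm 1, we know that the extracted formula $F$ is the conjunction of the subset of formulas $F_i$ whose gates $g_i$ are activated. Because in our setting all gating parameters are either 0 or 1, a logically equivalent form of $F$ can be derived.

$$F(x) = \bigwedge_{i=1}^{n} \bigl((g_i = 0) \vee F_i(x)\bigr) \tag{4}$$

Recall we are considering the t-norm case where

$$M(x) = T_G(M_1(x), \ldots, M_n(x); g_1, \ldots, g_n) = \bigl(1 + g_1(M_1(x) - 1)\bigr) \otimes \cdots \otimes \bigl(1 + g_n(M_n(x) - 1)\bigr)$$

Using the properties of a t-norm in §2.2 we can prove that

$$M(x) = 1 \iff \forall i \in \{1, \ldots, n\},\ \bigl(1 + g_i(M_i(x) - 1)\bigr) = 1$$

Because each gating parameter $g_i$ is either 0 or 1, we further have

$$M(x) = 1 \iff \forall i \in \{1, \ldots, n\},\ (g_i = 0) \vee (M_i(x) = 1) \tag{5}$$

Combining Eq. (3)(4)(5), we finally have

$$\begin{aligned} F(x) = \mathrm{True} &\iff \forall i \in \{1, \ldots, n\},\ (g_i = 0) \vee (F_i(x) = \mathrm{True}) \\ &\iff \forall i \in \{1, \ldots, n\},\ (g_i = 0) \vee (M_i(x) = 1) \\ &\iff M(x) = 1 \end{aligned}$$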

T-conorm Case. If $p = T'_G(M_1, \ldots, M_n; g_1, \ldots, g_n)$, which means the final operation in $M$ is a gated t-conorm, then, as in the t-norm case, for each submodel $M_i$ we can extract its equivalent SMT formula $F_i$ using Algorithm 1 according to the induction hypothesis. Again we only prove that the full model $M$ and the extracted formula $F$ satisfy Eq. (1).

From lines 7-10 in Algorithm 1, we know that the extracted formula $F$ is the disjunction of the subset of formulas $F_i$ whose gates $g_i$ are activated. Because in our setting all gating parameters are either 0 or 1, a logically equivalent form of $F$ can be derived.

$$F(x) = \bigvee_{i=1}^{n} \bigl((g_i = 1) \wedge F_i(x)\bigr) \tag{6}$$

Under the t-conorm case, we have

$$M(x) = T'_G(M_1(x), \ldots, M_n(x); g_1, \ldots, g_n) = 1 - \bigl(1 - g_1 M_1(x)\bigr) \otimes \cdots \otimes \bigl(1 - g_n M_n(x)\bigr)$$

Using Property 1 we can prove that

$$M(x) = 1 \iff \exists i \in \{1, \ldots, n\},\ 1 - g_i M_i(x) = 0$$

Because each gating parameter $g_i$ is either 0 or 1, we further have

$$M(x) = 1 \iff \exists i \in \{1, \ldots, n\},\ (g_i = 1) \wedge (M_i(x) = 1) \tag{7}$$

Combining Eq. (3)(6)(7), we finally have

$$\begin{aligned} F(x) = \mathrm{True} &\iff \exists i \in \{1, \ldots, n\},\ (g_i = 1) \wedge (F_i(x) = \mathrm{True}) \\ &\iff \exists i \in \{1, \ldots, n\},\ (g_i = 1) \wedge (M_i(x) = 1) \\ &\iff M(x) = 1 \end{aligned}$$

Negation Case. If $p = 1 - M_1$, which means the final operation is a negation, from the induction hypothesis we know that using Algorithm 1 we can extract an SMT formula $F_1$ from submodel $M_1$ satisfying Eq. (1)(2). From lines 11-12 in Algorithm 1 we know that the extracted formula for $M$ is $F = \neg F_1$. Now we show that such an $F$ satisfies Eq. (1)(2).

$$\begin{aligned} F(x) = \mathrm{True} &\iff F_1(x) = \mathrm{False} \iff M_1(x) = 0 \iff M(x) = 1 \\ F(x) = \mathrm{False} &\iff F_1(x) = \mathrm{True} \iff M_1(x) = 1 \iff M(x) = 0 \end{aligned}$$

Atomic Case. In this case, the model consists of only a linear layer and an activation function, with no logical connectives. This degenerates to the atomic case for the ungated CLN in (Ryan et al., 2020), where the proof can simply be reused. ∎
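The t-norm and t-conorm cases can be sanity-checked by brute force. The following is a minimal Python sketch, not the authors' implementation: it assumes the product t-norm (one t-norm that satisfies Property 1), restricts the submodel outputs to their limiting values 0 and 1, and uses hypothetical helper names. It enumerates every gate and truth assignment and compares the gated model against the extracted formulas of Eq. (4) and Eq. (6).

```python
import math
from itertools import product

def gated_tnorm(values, gates):
    """Gated t-norm with the product t-norm: prod_i (1 + g_i * (m_i - 1))."""
    return math.prod(1 + g * (m - 1) for m, g in zip(values, gates))

def gated_tconorm(values, gates):
    """Gated t-conorm with the product t-norm: 1 - prod_i (1 - g_i * m_i)."""
    return 1 - math.prod(1 - g * m for m, g in zip(values, gates))

def extracted_and(truths, gates):
    """Eq. (4): conjunction over i of ((g_i = 0) or F_i)."""
    return all(g == 0 or f for f, g in zip(truths, gates))

def extracted_or(truths, gates):
    """Eq. (6): disjunction over i of ((g_i = 1) and F_i)."""
    return any(g == 1 and f for f, g in zip(truths, gates))

def check_equivalence(n=3):
    """Return True if model and formula agree on all 0/1 gates and inputs."""
    for gates in product([0, 1], repeat=n):
        for truths in product([False, True], repeat=n):
            # Limiting submodel outputs: M_i(x) = 1 iff F_i(x) = True (Eq. 3).
            values = [1.0 if f else 0.0 for f in truths]
            if (gated_tnorm(values, gates) == 1.0) != extracted_and(truths, gates):
                return False
            if (gated_tconorm(values, gates) == 1.0) != extracted_or(truths, gates):
                return False
    return True
```

Calling `check_equivalence(3)` exercises all 64 combinations of three gates and three subformula truth values. The check covers only the limiting 0/1 values, so it illustrates the induction step rather than replacing the proof.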

## 2. Proof for Theorem 3.2

Recall our continuous mapping for inequalities.

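$$S(t \ge u) \triangleq \begin{cases} \dfrac{c_1^2}{(t-u)^2 + c_1^2} & t < u \\[2ex] \dfrac{c_2^2}{(t-u)^2 + c_2^2} & t \ge u \end{cases} \tag{8}$$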
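The mapping in Eq. (8) can be sketched in a few lines of Python. This is an illustrative sketch, assuming the two branches share the same quadratic form and differ only in the constant (`c1` below the bound, `c2` on or above it). The function name and the default constants are placeholders, not values from the paper.

```python
def pbqu(t, u, c1=0.5, c2=5.0):
    """Continuous truth value of the inequality t >= u (sketch of Eq. 8).

    c1 controls how fast the value drops when the inequality is violated;
    c2 controls how slowly it decays when the inequality holds with slack.
    """
    d = t - u
    c = c1 if d < 0 else c2
    return (c * c) / (d * d + c * c)
```

With a small `c1` and a large `c2`, a violated inequality is penalized sharply while a loosely satisfied one is penalized only mildly, which is what pushes the learned bound toward the data.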

Theorem 3.2. Given a set of $n$ $k$-dimensional samples with maximum L2-norm $l$, if $c_1 \le 2l$ and $c_1 c_2 \ge 8\sqrt{n}\, l^2$, and the weights are constrained as $w_1^2 + \cdots + w_k^2 = 1$, then when the model converges, the learned inequality has distance at most $c_1/\sqrt{3}$ from a 'desired' inequality.

###### Proof.

The model maximizes the overall continuous truth value, so the reward function is

$$R(w_1, \ldots, w_k, b) = \sum_{i=1}^{n} f(w_1 x_{i1} + \cdots + w_k x_{ik} + b)$$

where $f(x) = S(x \ge 0)$. We first prove that if some point breaks the inequality by a distance of more than $c_1/\sqrt{3}$, then $\partial R / \partial b > 0$, indicating the model will not converge there. Without loss of generality, we assume the first point breaks the inequality, i.e.,

$$w_1 x_{11} + \cdots + w_k x_{1k} + b + c_1/\sqrt{3} < 0 \tag{9}$$

We will consider the following two cases.

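(i) All the points break the inequality, i.e.,

$$\forall i \in \{1, \ldots, n\},\quad w_1 x_{i1} + \cdots + w_k x_{ik} + b < 0$$

From Eq. (8), it is easy to see that

$$\begin{cases} f'(x) > 0 & x < 0 \\ f'(x) < 0 & x > 0 \end{cases} \tag{10}$$

Then we have

$$\frac{\partial R}{\partial b} = \sum_{i=1}^{n} f'(w_1 x_{i1} + \cdots + w_k x_{ik} + b) > \sum_{i=1}^{n} 0 = 0$$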

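(ii) At least one point, say $x_j$, satisfies the inequality.

$$w_1 x_{j1} + \cdots + w_k x_{jk} + b \ge 0 \tag{11}$$

From the Cauchy–Schwarz inequality, we have

$$(w_1 x_{i1} + \cdots + w_k x_{ik})^2 \le (w_1^2 + \cdots + w_k^2)(x_{i1}^2 + \cdots + x_{ik}^2) = 1 \cdot (x_{i1}^2 + \cdots + x_{ik}^2) \le l^2$$

So

$$\forall i \in \{1, \ldots, n\},\quad -l \le w_1 x_{i1} + \cdots + w_k x_{ik} \le l \tag{12}$$

Combining Eq. (9)(11)(12), a lower bound and an upper bound of $b$ can be obtained:

$$-l \le b < l \tag{13}$$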

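Using basic calculus we can show that $f'(x)$ is strictly increasing in $(-\infty, -c_1/\sqrt{3})$ and $(c_2/\sqrt{3}, +\infty)$, and strictly decreasing in $(-c_1/\sqrt{3}, c_2/\sqrt{3})$. Combining Eq. (10)(12)(13), we have

$$\begin{aligned} f'(w_1 x_{i1} + \cdots + w_k x_{ik} + b) &\ge f'(|w_1 x_{i1} + \cdots + w_k x_{ik} + b|) \\ &\ge f'(|w_1 x_{i1} + \cdots + w_k x_{ik}| + |b|) \\ &\ge f'(l + l) = f'(2l) \end{aligned} \tag{14}$$

The deduction of Eq. (14) requires $2l \le c_2/\sqrt{3}$, which can be obtained from the two known conditions $c_1 \le 2l$ and $c_1 c_2 \ge 8\sqrt{n}\, l^2$.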
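The monotonicity claim used for Eq. (14) can be checked numerically. The sketch below assumes that $f(x) = S(x \ge 0)$ has the form $c^2/(x^2 + c^2)$ with $c = c_1$ for $x < 0$ and $c = c_2$ otherwise, so that the inflection points of $f$ sit at $-c_1/\sqrt{3}$ and $c_2/\sqrt{3}$. The constants `c1 = 1` and `c2 = 4`, the sampling ranges, and the helper names are arbitrary choices for illustration.

```python
import math

def pbqu_grad(x, c1, c2):
    """Derivative f'(x) of f(x) = c^2 / (x^2 + c^2), c = c1 if x < 0 else c2."""
    c = c1 if x < 0 else c2
    return -2.0 * x * c * c / (x * x + c * c) ** 2

def linspace(lo, hi, n=200):
    """n evenly spaced points from lo to hi, inclusive."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def strictly_monotone(xs, c1, c2, increasing):
    """Check that f' is strictly increasing (or decreasing) along xs."""
    vals = [pbqu_grad(x, c1, c2) for x in xs]
    pairs = list(zip(vals, vals[1:]))
    if increasing:
        return all(a < b for a, b in pairs)
    return all(a > b for a, b in pairs)

c1, c2 = 1.0, 4.0
left, right = -c1 / math.sqrt(3), c2 / math.sqrt(3)

inc_left = strictly_monotone(linspace(-10.0, left), c1, c2, increasing=True)
dec_mid = strictly_monotone(linspace(left, right), c1, c2, increasing=False)
inc_right = strictly_monotone(linspace(right, 20.0), c1, c2, increasing=True)
```

All three flags should come out true for these constants. The middle interval spans $x = 0$, where both branches of $f'$ meet at zero, so the derivative is continuous there.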

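Eq. (14) provides a lower bound of the derivative for any point. For the point that breaks the inequality in Eq. (9), since $f'(x)$ is strictly increasing in $(-\infty, -c_1/\sqrt{3})$, we can obtain a stronger lower bound

$$f'(w_1 x_{11} + \cdots + w_k x_{1k} + b) \ge f'(-l - l) = f'(-2l) \tag{15}$$

Putting it all together, we have

$$\begin{aligned} \frac{\partial R}{\partial b} &= f'(w_1 x_{11} + \cdots + w_k x_{1k} + b) + \sum_{i=2}^{n} f'(w_1 x_{i1} + \cdots + w_k x_{ik} + b) \\ &\ge f'(-2l) + (n-1) f'(2l) > f'(-2l) + n f'(2l) \\ &> \frac{4l/c_1^2}{(1 + 4l^2/c_1^2)^2 (1 + 4l^2/c_2^2)^2} \cdot \left(1 - \frac{64 n l^4}{c_1^2 c_2^2}\right) \end{aligned}$$

Some intermediate steps are omitted. Because we know $c_1 c_2 \ge 8\sqrt{n}\, l^2$, finally we have $\partial R / \partial b > 0$.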
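One way to fill in the omitted steps is sketched below. It assumes $f(x) = c^2/(x^2 + c^2)$ with $c = c_1$ for $x < 0$ and $c = c_2$ for $x \ge 0$, and it uses the conditions $c_1 \le 2l$ and $c_1 c_2 \ge 8\sqrt{n}\, l^2$; the original derivation may be organized differently.

```latex
\begin{align*}
f'(-2l) &= \frac{4l/c_1^2}{(1 + 4l^2/c_1^2)^2},
\qquad
f'(2l) = -\frac{4l/c_2^2}{(1 + 4l^2/c_2^2)^2}, \\[1ex]
f'(-2l) + n f'(2l)
 &= \frac{4l/c_1^2}{(1 + 4l^2/c_1^2)^2 (1 + 4l^2/c_2^2)^2}
    \left[ \Bigl(1 + \frac{4l^2}{c_2^2}\Bigr)^2
           - n \, \frac{(c_1^2 + 4l^2)^2}{c_1^2 c_2^2} \right] \\[1ex]
 &> \frac{4l/c_1^2}{(1 + 4l^2/c_1^2)^2 (1 + 4l^2/c_2^2)^2}
    \left( 1 - \frac{64 n l^4}{c_1^2 c_2^2} \right).
\end{align*}
```

The last step uses $(1 + 4l^2/c_2^2)^2 > 1$ and, since $c_1 \le 2l$, $(c_1^2 + 4l^2)^2 \le (8l^2)^2 = 64 l^4$. The final factor is nonnegative exactly when $c_1 c_2 \ge 8\sqrt{n}\, l^2$, which gives the strict positivity of the derivative.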

Now we have proved that no point can break the learned inequality by more than $c_1/\sqrt{3}$. It remains to prove that at least one point is on or beyond the boundary. We prove this by contradiction. Suppose all points satisfy

$$w_1 x_{i1} + \cdots + w_k x_{ik} + b > 0$$

Then by Eq. (10) we have

$$\frac{\partial R}{\partial b} = \sum_{i=1}^{n} f'(w_1 x_{i1} + \cdots + w_k x_{ik} + b) < 0$$

So the model does not converge, which concludes the proof. ∎
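The statement of the theorem can be illustrated on a one-dimensional toy problem. The sketch below is not the training procedure from the paper: it fixes $k = 1$ and $w = 1$, replaces gradient-based training with a plain grid search for the bias $b$ that maximizes $R$, and assumes the theorem's conditions read $c_1 \le 2l$ and $c_1 c_2 \ge 8\sqrt{n}\, l^2$. The samples and constants are made up for the example.

```python
import math

def pbqu(x, c1, c2):
    """f(x) = S(x >= 0): c^2 / (x^2 + c^2), with c = c1 if x < 0 else c2."""
    c = c1 if x < 0 else c2
    return (c * c) / (x * x + c * c)

def best_bias(samples, c1, c2, lo, hi, steps=6000):
    """Grid-search the bias b that maximizes R(b) = sum_i f(x_i + b)."""
    best_b, best_r = lo, -math.inf
    for s in range(steps + 1):
        b = lo + (hi - lo) * s / steps
        r = sum(pbqu(x + b, c1, c2) for x in samples)
        if r > best_r:
            best_b, best_r = b, r
    return best_b

samples = [1.0, 2.0, 3.0]      # n = 3 samples, maximum norm l = 3
c1, c2 = 1.0, 200.0            # c1 <= 2l and c1 * c2 >= 8 * sqrt(3) * 9

b = best_bias(samples, c1, c2, lo=-3.0, hi=3.0)
# Signed distance of the closest sample to the learned bound x + b >= 0.
margin = min(x + b for x in samples)
within_window = -c1 / math.sqrt(3) <= margin <= 1e-9
```

For these samples the tightest valid bound is $x - 1 \ge 0$, and the search should place `b` close to $-1$, so that the closest sample sits on the boundary or violates it by less than $c_1/\sqrt{3}$.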

## References

• G. Ryan, J. Wong, J. Yao, R. Gu, and S. Jana (2020). CLN2INV: Learning loop invariants with continuous logic networks. In International Conference on Learning Representations.