The proliferation of Neural Networks (NNs) as safety-critical controllers has made obtaining provably correct NN controllers vitally important. However, most current techniques for doing so involve repeatedly training and verifying a NN until adequate safety properties have been achieved. Such methods are not only inherently computationally expensive (because training and verification of NNs are), but their convergence properties can also be extremely poor. For example, when verifying multiple safety properties, such methods can cycle back and forth between safety properties, with each subsequent retraining achieving one safety property by undoing another one.
An alternative approach obtains safety-critical NN controllers by repairing an existing NN controller. Specifically, it is assumed that an already-trained NN controller is available that performs in a mostly correct fashion, albeit with some specific, known instances of incorrect behavior. But rather than using retraining techniques, repair entails systematically altering the parameters of the original controller in a limited way, so as to retain the original safe behavior while simultaneously correcting the unsafe behavior. The objective of repair is to exploit as much as possible the safety that was learned during the training of the original NN parameters, rather than allowing re-training to unlearn safe behavior.
Despite these advantages, the NN repair problem is challenging because it has two main objectives, which are at odds with each other. In particular, repairing an unsafe behavior requires altering the NN’s response in a local region of the state space, but changing even a few neurons generally affects the global response of the NN – which could undo the initial safety guarantee supplied with the network. This tension is especially relevant for general deep NNs, and for repairs realized on neurons in their latter layers. This is especially the case for repairing controllers, where the relationship between specific neurons and their importance to the overall safety properties is difficult to discern. As a result, there has been limited success in studying NN controller repair, especially for nonlinear systems.
In this paper, we exhibit an explicit algorithm that can repair a NN controller for a discrete-time, input-affine nonlinear system. The cornerstone of our approach is to consider NN controllers of a specific architecture: in particular, the recently proposed Two-Level Lattice (TLL) NN architecture . The TLL architecture has unique neuronal semantics, and those semantics greatly facilitate finding a balance between the local and global trade-offs inherent in NN repair. In particular, by assuming a TLL architecture, we can separate the problem of controller repair into two significantly decoupled problems, one consisting of essentially only local considerations and one consisting of essentially only global ones.
Related Work: Repairing (or patching) NNs can be traced to the late 2000s. An early result connected transfer learning and concept drift with patching; another result established fundamental requirements to apply classifier patching on NNs by using inner layers to learn a patch for concept drift in an image classifier network. Another approach, based on a Satisfiability Modulo Theory (SMT) formulation of the repair problem, changed the parameters of a classifier network to comply with a safety specification, i.e. where the designer knows exactly the subset of the input space to be classified. This prior work is nonetheless heuristic-based and so is not guaranteed to produce the desired results; this was noticed in subsequent work, which cast the problem of patching (minimal repair) as a verification problem for NNs (including deep ones). However, that work focused on a restricted version of the problem in which the changes in weights are limited to a single layer. Finally, a verification-based approach was proposed for repairing DNNs that is not restricted to modifying the output; instead, it identifies and modifies the most relevant neurons that cause the safety violation, using gradient guidance.
We will denote the real numbers by . For an matrix (or vector), we will use the notation to denote the element in the row and column of . Analogously, the notation will denote the row of , and will denote the column of ; when is a vector instead of a matrix, both notations will return a scalar equal to the corresponding element in the vector. Let be an matrix of zeros. We will use bold parentheses to delineate the arguments to a function that returns a function. We use the functions and to return the first and last elements of an ordered list (or a vector in ). The function concatenates two ordered lists, or two vectors in and along their (common) nontrivial dimension to get a third vector in . Finally, denotes an open Euclidean ball centered at with radius . The norm will refer to the Euclidean norm.
II-B Dynamical Model
In this paper, we will consider the general case of a discrete-time input-affine nonlinear system specified by:
where is the state and is the input. In addition, and are continuous and smooth functions of .
Definition 1 (Closed-loop Trajectory).
Let . Then a closed-loop trajectory of the system (1) under , starting from state , will be denoted by the sequence . That is and .
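As an illustration of Definition 1, the following minimal sketch rolls out a closed-loop trajectory. It assumes that (1) has the standard input-affine form x_{t+1} = f(x_t) + g(x_t) u_t with the control given by state feedback; the names `step` and `closed_loop_trajectory`, as well as the scalar example system, are hypothetical and chosen only for illustration.

```python
def step(f, g, controller, x):
    """One step of the assumed dynamics x_next = f(x) + g(x) u, with u = controller(x)."""
    u = controller(x)
    fx = f(x)   # n-vector
    gx = g(x)   # n-by-m matrix stored as a list of rows
    return [fx[i] + sum(gx[i][j] * u[j] for j in range(len(u)))
            for i in range(len(fx))]


def closed_loop_trajectory(f, g, controller, x0, horizon):
    """Return the list [x_0, x_1, ..., x_horizon] of closed-loop states."""
    trajectory = [list(x0)]
    for _ in range(horizon):
        trajectory.append(step(f, g, controller, trajectory[-1]))
    return trajectory


# Hypothetical scalar system: x_next = 0.5 * x + u, with feedback u = -0.25 * x.
f = lambda x: [0.5 * x[0]]
g = lambda x: [[1.0]]
controller = lambda x: [-0.25 * x[0]]

trajectory = closed_loop_trajectory(f, g, controller, [8.0], 3)
print(trajectory)
```

The controller is passed as a plain callable, so the same rollout applies unchanged whether the feedback law is a NN or a hand-written affine map.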
Definition 2 (Workspace).
We will assume that trajectories of (1) are confined to a connected, compact workspace, with non-empty interior, of size .
II-C Neural Networks
We will exclusively consider Rectified Linear Unit Neural Networks (ReLU NNs). A -layer ReLU NN is specified by composing layer functions, each of which may be either linear or nonlinear. A nonlinear layer with inputs and outputs is specified by a real-valued matrix of weights, , and a real-valued matrix of biases, as follows: with the function taken element-wise, and . A linear layer is the same as a nonlinear layer, only it omits the nonlinearity ; such a layer will be indicated with a superscript lin, e.g. . Thus, a -layer ReLU NN function as above is specified by layer functions that are composable: i.e. they satisfy . We will annotate a ReLU function by a list of its parameters (note that is not the concatenation of the into a single large matrix, so it preserves information about the sizes of the constituent ).
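The layer-wise description above can be sketched in a few lines. This is only an illustration under stated assumptions: each layer is stored as a hypothetical triple `(W, b, is_linear)`, the parameter list is a plain Python list of such triples, and the helper names (`affine`, `relu_nn`, `architecture`) are not taken from the paper.

```python
def affine(W, b, x):
    """Return W x + b for a weight matrix W (list of rows) and a bias vector b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v):
    """Element-wise max(0, .)."""
    return [max(0.0, vi) for vi in v]


def relu_nn(params, x):
    """Evaluate a ReLU NN stored as a list of (W, b, is_linear) layers."""
    for W, b, is_linear in params:
        x = affine(W, b, x)
        if not is_linear:          # a linear layer simply omits the ReLU
            x = relu(x)
    return x


def architecture(params):
    """List the (inputs, outputs) dimensions of every layer."""
    return [(len(W[0]), len(W)) for W, _, _ in params]


# Hypothetical 2-layer network computing |x| = relu(x) + relu(-x).
abs_net = [
    ([[1.0], [-1.0]], [0.0, 0.0], False),  # nonlinear layer: 1 input, 2 outputs
    ([[1.0, 1.0]], [0.0], True),           # linear layer: 2 inputs, 1 output
]

print(relu_nn(abs_net, [-3.0]), architecture(abs_net))
```

Keeping the layers as separate entries, rather than one concatenated matrix, is what lets `architecture` recover the layer sizes directly from the parameter list.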
The number of layers and the dimensions of the matrices specify the architecture of the ReLU NN. Therefore, we will denote the architecture of the ReLU NN by
II-D Special NN Operations
Definition 3 (Sequential (Functional) Composition).
Let and be two NNs where and . Then the functional composition of and , i.e. , is a well defined NN, and can be represented by the parameter list .
Definition 4 (Parallel Composition).
Let and be two -layer NNs with parameter lists: . Then the parallel composition of and is a NN given by the parameter list
That is accepts an input of the same size as (both) and , but has as many outputs as and combined.
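Both compositions can be expressed as operations on parameter lists. The sketch below is an assumption-laden illustration rather than the paper's construction: layers are hypothetical `(W, b, is_linear)` triples, sequential composition concatenates the two lists (here `sequential(first, second)` applies `first` and then `second`), and parallel composition stacks the first-layer weights and uses block-diagonal weights in later layers so that the two networks share an input but do not interact.

```python
def forward(params, x):
    """Evaluate a ReLU NN stored as a list of (W, b, is_linear) layers."""
    for W, b, is_linear in params:
        x = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
        if not is_linear:
            x = [max(0.0, xi) for xi in x]
    return x


def sequential(first, second):
    """Parameter list of the network that applies `first`, then `second`."""
    return list(first) + list(second)


def block_diag(A, B):
    """Block-diagonal matrix [[A, 0], [0, B]] built from nested lists."""
    top = [row + [0.0] * len(B[0]) for row in A]
    bottom = [[0.0] * len(A[0]) + row for row in B]
    return top + bottom


def parallel(net_a, net_b):
    """Parameter list of a network whose output stacks net_a(x) over net_b(x)."""
    if len(net_a) != len(net_b):
        raise ValueError("parallel composition needs equally deep networks")
    combined = []
    for k, ((Wa, ba, lin_a), (Wb, bb, lin_b)) in enumerate(zip(net_a, net_b)):
        if lin_a != lin_b:
            raise ValueError("layer types must match")
        # First layer: both networks read the same input, so stack the rows.
        W = Wa + Wb if k == 0 else block_diag(Wa, Wb)
        combined.append((W, ba + bb, lin_a))
    return combined


# Hypothetical scalar networks: net_a(x) = 2 relu(x), net_b(x) = relu(-x) + 1.
net_a = [([[1.0]], [0.0], False), ([[2.0]], [0.0], True)]
net_b = [([[-1.0]], [0.0], False), ([[1.0]], [1.0], True)]

print(forward(parallel(net_a, net_b), [3.0]))
print(forward(sequential(net_a, net_b), [3.0]))
```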
Definition 5 (-element min/max NNs).
An -element min network is denoted by the parameter list , such that is the minimum from among the components of (i.e. minimum according to the usual order relation on ). An -element max network is denoted by , and functions analogously. These networks are described in .
II-E Two-Level-Lattice (TLL) Neural Networks
In this paper, we will be especially concerned with ReLU NNs that have the Two-Level Lattice (TLL) architecture, as introduced with the AReN algorithm in . Thus we define a TLL NN as follows.
Definition 6 (TLL NN [1, Theorem 2]).
A NN that maps is said to be a TLL NN of size if the size of its parameter list can be characterized entirely by integers and as follows.
each has the form ; and
for some sequence , where is the identity matrix.
The matrices will be referred to as the linear function matrices of . The matrices will be referred to as the selector matrices of . Each set is said to be the selector set of .
A multi-output TLL NN with range space is defined using equally sized scalar TLL NNs. That is, we denote such a network by , with each output component denoted by , .
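Functionally, a scalar TLL NN evaluates a collection of affine functions, takes a minimum over each selector set, and then takes a maximum over those minima. The following minimal sketch implements that input/output behavior directly, without building the selector matrices or the min/max networks as ReLU layers. The names `tll_scalar` and `tll_multi`, the zero-based indices, and the example parameters are assumptions made for illustration.

```python
def tll_scalar(W, b, selector_sets, x):
    """Evaluate max over selector sets of (min over the set of W[i] . x + b[i])."""
    values = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return max(min(values[i] for i in s) for s in selector_sets)


def tll_multi(components, x):
    """Evaluate a multi-output TLL given one (W, b, selector_sets) triple per output."""
    return [tll_scalar(W, b, sets, x) for W, b, sets in components]


# Hypothetical scalar TLL with three affine functions: x, 1, and -x.
W = [[1.0], [0.0], [-1.0]]
b = [0.0, 1.0, 0.0]
selector_sets = [[0, 1], [2]]   # computes max(min(x, 1), -x)

for point in ([0.5], [3.0], [-2.0]):
    print(point, tll_scalar(W, b, selector_sets, point))
```

Each row of `W` and entry of `b` is one candidate local linear function, which is the property the repair procedure relies on.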
III Problem Formulation
The main problem we consider in this paper is one of TLL NN repair. In brief, we take as a starting point a TLL NN controller that is “mostly” correct in the sense that it is provably safe under a specific set of circumstances (states); here we assume that safety entails avoiding a particular, fixed subset of the state space. However, we further suppose that this TLL NN controller induces some additional, unsafe behavior of (1) that is explicitly observed, such as from a more expansive application of a model checker; of course this unsafe behavior necessarily occurs in states not covered by the original safety guarantee. The repair problem, then, is to “repair” the given TLL controller so that this additional unsafe behavior is made safe, while simultaneously preserving the original safety guarantees associated with the network.
The basis for the problem in this paper is thus a TLL NN controller that has been designed (or trained) to control (1) in a safe way. In particular, we use the following definition to fix our notion of “unsafe” behavior for (1).
Definition 7 (Unsafe Operation of (1)).
Let be an real-valued matrix, and let be an real vector, which together define a set of unsafe states .
Then, we mean that a TLL NN controller is safe with respect to (1) and in the following sense.
Definition 8 (Safe TLL NN Controller).
Let be a set of states such that . Then a TLL NN controller is safe for (1) on horizon (with respect to and ) if:
That is is safe (w.r.t. ) if all of its length- trajectories starting in avoid the unsafe states .
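A simple way to make Definitions 7 and 8 concrete is to test a given trajectory against the unsafe set. The sketch below assumes that the unsafe set is the polyhedron of states x satisfying K x >= b row-wise (the direction of the inequality is an assumption), and the function names are hypothetical. Checking finitely many trajectories in this way can only falsify safety; establishing it for every initial state in the safe set requires a verifier.

```python
def in_unsafe_set(K, b, x):
    """True if K x >= b holds row-wise (assumed description of the unsafe set)."""
    return all(sum(k * xi for k, xi in zip(row, x)) >= bi
               for row, bi in zip(K, b))


def first_unsafe_step(trajectory, K, b):
    """Index of the first unsafe state in the trajectory, or None if there is none."""
    for t, x in enumerate(trajectory):
        if in_unsafe_set(K, b, x):
            return t
    return None


# Hypothetical unsafe set: x1 >= 2 and x2 >= 2.
K = [[1.0, 0.0], [0.0, 1.0]]
b = [2.0, 2.0]

safe_trajectory = [[0.0, 0.0], [1.0, 1.0], [1.5, 3.0]]
unsafe_trajectory = [[0.0, 0.0], [1.0, 3.0], [2.5, 2.0]]

print(first_unsafe_step(safe_trajectory, K, b))
print(first_unsafe_step(unsafe_trajectory, K, b))
```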
The design of safe controllers in the sense of Definition 8 has been considered in a number of contexts; see e.g. . Often this design procedure involves training the NN using data collected from an expert, and verifying the result using one of many available NN verifiers .
However, as noted above, we further suppose that a given TLL NN which is safe in the sense of Definition 8 nevertheless has some unsafe behavior for states that lie outside . In particular, we suppose that a model checker (for example) provides to us a counterexample (or witness) to unsafe operation of (1).
Definition 9 (Counterexample to Safe Operation of (1)).
We can now state the main problem of this paper.
Problem 1.
Let dynamics (1) be given, and assume its trajectories are confined to a compact subset of states, (see Definition 2). Also, let be a specified set of unsafe states for (1), as in Definition 7. Furthermore, let be a TLL NN controller for (1) that is safe on horizon with respect to a set of states (see Definition 8), and let be a counterexample to safety in the sense of Definition 9.
Then the TLL repair problem is to obtain a new TLL controller with the following properties:
(i) is also safe on horizon with respect to ;
(ii) the trajectory is safe – i.e. the counterexample is “repaired”;
(iii) and share a common architecture (as implied by their identical architectural parameters);
(iv) the selector matrices of and are identical – i.e. for ; and
In particular, iii), iv) and v) justify the designation of this problem as one of “repair”. That is, the repair problem is to fix the counterexample while keeping the network as close as possible to the original network under consideration. Note: the formulation of Problem 1 only allows repair by means of altering the linear layers of ; cf. (iii) and (iv).
The TLL NN repair problem described in Problem 1 is challenging because it has two main objectives, which are at odds with each other. In particular, repairing a counterexample requires altering the NN’s response in a local region of the state space, but changing even a few neurons generally affects the global response of the NN – which could undo the initial safety guarantee supplied with the network. This tension is especially relevant for general deep NNs, and for repairs realized on neurons in their latter layers. It is for this reason that we posed Problem 1 in terms of TLL NNs: our approach will be to use the unique semantics of TLL NNs to balance the trade-offs between local NN alteration to repair the defective controller and global NN alteration to ensure that the repaired controller activates at the counterexample. Moreover, locally repairing the defective controller at entails a further trade-off between two competing objectives of its own: actually repairing the counterexample – Problem 1(ii) – without causing a violation of the original safety guarantee for – i.e. Problem 1(i). Likewise, global alteration of the TLL to ensure correct activation of our repairs will entail its own trade-off: the alterations necessary to achieve the correct activation will also have to be made without sacrificing the safety guarantee for – i.e. Problem 1(i).
We devote the remainder of this section to two crucial subsections, one for each side of this local/global dichotomy. Our goal in these two subsections is to describe constraints on a TLL controller that are sufficient to ensure that it accomplishes the repair described in Problem 1. Thus, the results in this section should be seen as optimization constraints around which we can build our algorithm to solve Problem 1. The algorithmic details and formalism are presented in Section V.
IV-A Local TLL Repair
We first consider in isolation the problem of repairing the TLL controller in the vicinity of the counterexample , but under the assumption that the altered controller will remain active there. The problem of actually guaranteeing that this is the case will be considered in the subsequent section. Thus, we proceed with the repair by establishing constraints on the alterations of those parameters in the TLL controller associated with the affine controller instantiated at and around the state . To be consistent with the literature, we will refer to any individual affine function instantiated by a NN as one of its local linear functions.
Definition 10 (Local Linear Function).
Let be CPWA. Then a local linear function of is a linear function if there exists an open set such that for all .
The unique semantics of TLL NNs makes them especially well suited to this local repair task because in a TLL NN, its local linear functions appear directly as neuronal parameters. In particular, all of the local linear functions of a TLL NN are described directly by parameters in its linear layer; i.e. for scalar TLL NNs or for the output of a multi-output TLL (see Definition 6). This follows as a corollary of the following relatively straightforward proposition, borrowed from :
Proposition 1 ([8, Proposition 3]).
Let be a scalar TLL NN with linear function matrices . Then every local linear function of is exactly equal to for some .
Similarly, let be a multi-output TLL, and let be any local linear function of . Then for each , the component of satisfies for some .
Corollary 1.
Let be a TLL over domain , and let . Then there exist integers for and a closed, connected set with non-empty interior, such that
on the set .
Corollary 1 is actually a strong statement: it indicates that in a TLL, each local linear function is described directly by its own linear-function-layer parameters and those parameters describe only that local linear function.
(i) identify which of the local linear functions is realized by the TLL controller at – i.e. identifying the indices of the active local linear function at viz. indices for each output as in Corollary 1;
(ii) establish constraints on the parameters of that local linear function so as to ensure repair of the counterexample; i.e. altering the elements of the rows and for each output such that the resulting linear controller repairs the counterexample as in Problem 1(ii); and
(iii) establish constraints to ensure the repaired parameters do not induce a violation of the safety constraint for the guaranteed set of safe states, , as in Problem 1(i).
We consider these three steps in sequence as follows.
IV-A1 Identifying the Active Controller at
From Corollary 1, all of the possible linear controllers that a TLL controller realizes are exposed directly in the parameters of its linear layer matrices, . Crucially for the repair problem, once the active controller at has been identified, the TLL parameters responsible for that controller are immediately evident. This is the starting point for our repair process.
Since a TLL consists of two levels of lattice operations, it is straightforward to identify which of these affine functions is in fact active at ; for a given output, , this can be done by evaluating and comparing the components thereof according to the selector sets associated with the TLL controller. That is, the index of the active controller for output , denoted by , is determined by the following two expressions:
These expressions mirror the computations that define a TLL network, as described in Definition 6; the only difference is that min and max are replaced by arg min and arg max, respectively, so as to retrieve the index of interest instead of the network’s output.
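The index retrieval described above can be sketched as follows, assuming the functional view of a scalar TLL as a maximum over selector sets of minima of affine functions. The name `active_indices` and the zero-based indexing are hypothetical; ties are broken in favor of the first index, which corresponds to evaluating exactly on a boundary between two linear regions.

```python
def active_indices(W, b, selector_sets, x):
    """Return (index of the active affine function, index of its selector set)."""
    values = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    # arg min inside every selector set: the function that survives each min group.
    winners = [min(s, key=lambda i: values[i]) for s in selector_sets]
    # arg max across the groups: the survivor with the largest value is the output.
    best_set = max(range(len(selector_sets)), key=lambda j: values[winners[j]])
    return winners[best_set], best_set


# Hypothetical scalar TLL computing max(min(x, 1), -x).
W = [[1.0], [0.0], [-1.0]]
b = [0.0, 1.0, 0.0]
selector_sets = [[0, 1], [2]]

print(active_indices(W, b, selector_sets, [3.0]))
print(active_indices(W, b, selector_sets, [-2.0]))
```

At x = 3 the constant function (index 1) is active through the first selector set, whereas at x = -2 the function -x (index 2) is active through the second.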
IV-A2 Repairing the Affine Controller at
Given the result of Corollary 1, the parameters of the network that result in a problematic controller at are readily apparent. Moreover, since these parameters are obviously in the linear layer of the original TLL, they are alterable under the requirement in Problem 1 that only linear-layer parameters are permitted to be used for repair. Thus, in the current context, local repair entails simply correcting the elements of the matrices and . It is thus clear that a “repaired” controller should satisfy
Then (8) represents a linear constraint in the local controller to be repaired, and this constraint imposes the repair property in Problem 1(ii). That is, provided that the repaired controller described by remains active at the counterexample; as noted, we consider this problem in the global stasis condition subsequently.
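To see why the repair condition is linear in the controller parameters, consider a sketch under two assumptions: the dynamics have the input-affine form x_next = f(x) + g(x) u, and the unsafe set is a single half-space of states with k . x >= c. With u = W x_ce + b at the counterexample state x_ce, requiring k . x_next < c is then a linear inequality in the entries of W and b. All names below are hypothetical.

```python
def repair_constraint(f_x, g_x, x_ce, k, c):
    """Coefficients of the linear inequality that encodes k . x_next < c.

    With x_next = f_x + g_x (W x_ce + b), the condition reads
        sum_jl coef_W[j][l] * W[j][l] + sum_j coef_b[j] * b[j] < rhs.
    """
    m = len(g_x[0])
    kg = [sum(k[i] * g_x[i][j] for i in range(len(k))) for j in range(m)]  # k^T g
    coef_W = [[kg[j] * xl for xl in x_ce] for j in range(m)]
    coef_b = list(kg)
    rhs = c - sum(ki * fi for ki, fi in zip(k, f_x))
    return coef_W, coef_b, rhs


def constraint_holds(coef_W, coef_b, rhs, W, b):
    """Check a candidate affine controller (W, b) against the inequality."""
    lhs = sum(cw * w for crow, wrow in zip(coef_W, W) for cw, w in zip(crow, wrow))
    lhs += sum(cb * bj for cb, bj in zip(coef_b, b))
    return lhs < rhs


# Hypothetical scalar example: f(x) = 0.5 x, g(x) = 1, x_ce = 8, unsafe set x >= 3.
coef_W, coef_b, rhs = repair_constraint([4.0], [[1.0]], [8.0], [1.0], 3.0)

print(constraint_holds(coef_W, coef_b, rhs, [[0.0]], [0.0]))     # u = 0: next state is 4
print(constraint_holds(coef_W, coef_b, rhs, [[-0.25]], [0.0]))   # u = -2: next state is 2
```

If the unsafe set is an intersection of several half-spaces, leaving it only requires violating one of them; the sketch fixes that choice in advance so that the constraint stays convex.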
IV-A3 Preserving the Initial Safety Condition with the Repaired Controller
One unique aspect of the TLL NN architecture is that affine functions defined in its linear layer can be reused across regions of its input space. In particular, the controller associated with the parameters we repaired in the previous step – i.e. the indices of the linear layer matrices – may likewise be activated in or around . The fact that we altered these controller parameters thus means that trajectories emanating from may be affected in turn by our repair efforts: that is, the repairs we made to the controller to address Problem 1(ii) may simultaneously alter the TLL in a way that undoes the requirement in Problem 1(i) – i.e. the initial safety guarantee on and . Thus, local repair of the problematic controller must account for this safety property, too.
We accomplish this by bounding the reach set of (1) for initial conditions in , and for this we employ the usual strategy of bounding the relevant Lipschitz constants. Naturally, since the TLL controller is a CPWA controller operated in closed loop, these bounds will also incorporate the size of the TLL controller parameters and for and .
In general, however, we have the following proposition.
Proposition 2.
Consider system dynamics (1), and suppose that the state is confined to a known compact workspace, (see Definition 2). Also, let be the integer time horizon from Definition 8. Finally, assume that a closed-loop CPWA is applied to (1), and that has local linear functions .
Moreover, define the function as
and in turn define
Finally, define the function as in (11),
and in turn define
Then for all , , we have:
Proposition 2 bounds the size of the reach set for (1) in terms of an arbitrary CPWA controller, , when the system is started from . This proposition is naturally applied in order to find bounds for safety with respect to the unsafe region as follows.
Proposition 3.
Let , , and be as in Proposition 2, and let and be two constants s.t. for all
If and , then trajectories of (1) under closed loop controller are safe in the sense that
In particular, Proposition 3 states that if we find constants and that satisfy (14), then we have a way to bound the parameters of any CPWA controller (via and ) so that that controller is safe in closed loop. This translates to conditions that our repaired controller must satisfy in order to preserve the safety property required in Problem 1(i).
Now, let be the TLL controller as given in Problem 1, and let be its linear layer matrices for outputs as usual. For this controller, define the following two quantities:
so that and . Finally, let indices specify the active local linear functions of that are to be repaired, as described in Subsection IV-A1 and IV-A2. Let and be any repaired values of and , respectively.
If the following four conditions are satisfied
then the following hold for all :
The conclusion (22) of Corollary 2 should be interpreted as follows: the bound on the reach set of the repaired controller, , is no worse than the bound on the reach set of the original TLL controller given in Problem 1. Hence, by the assumptions borrowed from Proposition 3, conclusion (23) of Corollary 2 indicates that the repaired controller remains safe in the sense of Problem 1(i) – i.e. closed-loop trajectories emanating from remain safe on horizon .
IV-B Global TLL Alteration for Repaired Controller Activation
In the context of local repair, we identified the local linear function instantiated by the TLL controller, and repaired the parameters associated with that particular function – i.e. the repairs were effected on a particular, indexed row of and . We then proceeded under the assumption that the affine function at that index would remain active in the output of the TLL network at the counterexample, even after altering its parameters. Unfortunately, this is not the case in a TLL network per se, since the value of each local linear function at a point interacts with the selector matrices (see Definition 6) to determine whether it is active or not. In other words, changing the parameters of a particular indexed local linear function in a TLL will change its output value at any given point (in general), and hence also the region on which said indexed local linear function is active. Analogous to the local alteration considered before, we thus need to devise global constraints sufficient to enforce the activation of the repaired controller at .
This observation is manifest in the computation structure that defines a TLL NN: a particular affine function is active in the output of the TLL if and only if it is active in the output of one of the networks (see Definition 6), and the output of that same network exceeds the output of all others, thereby being active at the output of the final network (again, see Definition 6). Thus, ensuring that a particular, indexed local linear function is active at the output of a TLL entails ensuring that that function
appears at the output of one of the networks; and
appears at the output of the network, by exceeding the outputs of all the other networks.
Notably, this sequence also suggests a mechanism for meeting the task at hand: ensuring that the repaired controller remains active at the counterexample.
Formally, we have the following proposition.
Proposition 4.
Let be a TLL NN over with output-component linear function matrices as usual, and let .
Then the index denotes the local linear function that is active at for output , as described in Corollary 1, if and only if there exists index such that
for all and any ,
i.e. the active local linear function “survives” the network associated with selector set ; and
for all there exists an index s.t. for all
i.e. the active local linear function “survives” the network of output by exceeding the output of all of the other networks.
The “only if” portion of Proposition 4 thus directly suggests constraints to impose such that the desired local linear function is active on its respective output. In particular, among the non-active local linear functions at , at least one must be altered from each of the selector sets . The fact that these alterations must be made to local linear functions which are not active at the counterexample warrants the description of this procedure as “global alteration”.
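The two activation conditions can be checked numerically once the affine functions have been evaluated at the point of interest. The sketch below assumes the conditions take the following form, which is consistent with the max-of-min semantics of a TLL: the candidate function attains the minimum within its own selector set, and every other selector set contains at least one function whose value does not exceed the candidate's. The function name, zero-based indices, and non-strict inequalities are assumptions.

```python
def satisfies_activation_conditions(values, selector_sets, active, active_set):
    """Check whether function `active` is the TLL output via selector set `active_set`.

    `values[i]` is the value of the i-th affine function at the point of interest.
    """
    if active not in selector_sets[active_set]:
        return False
    v = values[active]
    # Condition (i): the candidate survives the min over its own selector set.
    survives_min = all(v <= values[i] for i in selector_sets[active_set])
    # Condition (ii): each other min group has some function at or below the candidate,
    # so that group's minimum cannot exceed the candidate at the final max.
    survives_max = all(
        any(values[i] <= v for i in s)
        for j, s in enumerate(selector_sets)
        if j != active_set
    )
    return survives_min and survives_max


# Hypothetical values of three affine functions at some state.
values = [3.0, 1.0, -3.0]
selector_sets = [[0, 1], [2]]

print(satisfies_activation_conditions(values, selector_sets, 1, 0))
print(satisfies_activation_conditions(values, selector_sets, 0, 0))
print(satisfies_activation_conditions(values, selector_sets, 2, 1))
```

Only the non-active functions appear in condition (ii), which is why enforcing it after a repair amounts to altering functions other than the repaired one.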
V Main Algorithm
first, local alteration to ensure repair of the defective controller at ; and
subsequently, global alteration to ensure that the repaired local controller is activated at and around .
The derivations of both sets of constraints imply that they are merely sufficient conditions for their respective purposes, so there is no guarantee that any subset of them is jointly feasible. Moreover, as a “repair” problem, any repairs conducted must involve minimal alteration – Problem 1(v).
Thus, the core of our algorithm is to employ a convex solver to find the minimally altered TLL parameters that also satisfy the local and global constraints we have outlined for successful repair with respect to the other aspects of Problem 1. The fact that the local repair constraints are prerequisite to the global activation constraints means that we will employ a convex solver on two optimization problems in sequence: first, to determine the feasibility of local repair and effectuate that repair in a minimal way; and then subsequently to determine the feasibility of activating said repaired controller as required and effectuating that activation in a minimal way.
V-A Optimization Problem for Local Alteration (Repair)
Local alteration for repair starts by identifying the active controller at the counterexample, as denoted by the index for each output of the controller, . The local controller for each output is thus the starting point for repair in our algorithm, as described above. From this knowledge, an explicit constraint sufficient to repair the local controller at is specified directly by the dynamics: see (8).
Our formulation of a safety constraint for the locally repaired controller requires additional input, though. In particular, we need to identify constants and such that the non-local controllers satisfy (19) and (21). Then Corollary 2 implies that (18) and (20) are constraints that ensure the repaired controller satisfies Problem 1(i). For this we take the naive approach of setting , and then solving for the smallest that ensures safety for that particular . In particular, we set
V-B Optimization Problem for Global Alteration (Activation)
If the optimization problem Local is feasible, then the local controller at can successfully be repaired, and the global activation of said controller can be considered. Since we are starting with a local linear function we want to be active at and around , we can retain the definition of from the initialization of Local. Moreover, since Problem 1 preserves the selector matrices of the original TLL controller, we will define the selector indices, , in terms of the activation pattern of the original, defective local linear controller (although this is not required by the repair choices we have made: other choices are possible).
Thus, in order to formulate an optimization problem for global alteration, we need to define constraints compatible with Proposition 4 based on the activation/selector indices described above. Part (i) of the conditions in Proposition 4 is unambiguous at this point: it says that the desired active local linear function, , must have the minimum output from among those functions selected by selector set . Part (ii) of the conditions in Proposition 4 is ambiguous however: we only need to specify one local linear function from each of the other min groups to be “forced” lower than the desired active local linear function. In the face of this ambiguity, we select these functions using indices that are defined as follows:
That is, we form our global alteration constraint out of the non-active controllers which have the lowest outputs among their respective min groups. We reason that these local linear functions will in some sense require the least alteration in order to satisfy Part (ii) of Proposition 4, which requires their outputs to be less than the local linear function that we have just repaired.
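The selection rule just described can be sketched as an arg min over each of the other min groups. The name `forcing_indices`, the dictionary return type, and the example values are hypothetical; the values are assumed to be the outputs of the affine functions at the counterexample.

```python
def forcing_indices(values, selector_sets, active_set):
    """For every selector set other than `active_set`, pick its lowest-valued function.

    Returns a dict mapping selector-set index -> index of the function to be
    constrained below the repaired local linear function.
    """
    return {
        j: min(s, key=lambda i: values[i])
        for j, s in enumerate(selector_sets)
        if j != active_set
    }


# Hypothetical outputs of four affine functions at the counterexample.
values = [3.0, 1.0, -3.0, 0.5]
selector_sets = [[0, 1], [2, 3], [0, 3]]

chosen = forcing_indices(values, selector_sets, active_set=0)
print(chosen)
```

Choosing the lowest-valued function in each group is a heuristic: that function is already the closest to satisfying the required inequality, so it is a natural candidate for the smallest alteration.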
Thus, we can formulate the global alteration optimization problem as follows:
where of course and are the repaired local controller parameters obtained from the optimal solution of Local. Note that the first two sets of equality constraints merely ensure that Global does not alter these parameters.
V-C Main Algorithm
A pseudo-code description of our main algorithm is shown in Algorithm 1, as repairTLL. It collects all of the initializations from Section IV, Subsection V-A and Subsection V-B. Only the functions FindActCntrl and FindActSlctr encapsulate procedures defined in this paper; their implementation is nevertheless adequately described in Subsection IV-A1 and Proposition 4, respectively. The correctness of repairTLL follows from the results in those sections.