Neural networks are a popular method for approximating nonlinear functions, with increasing applications in the field of human-robot interaction. For example, the kinematics of many elder-care robots [xiong2007development, ko2017neural], rehabilitation robots [xu2009adaptive, hussain2013adaptive], industrial robot manipulators [gribovskaya2011motion], and automated driving systems [tran2020nnv, shengbo2019key] are controlled by neural networks. Thus, verifying the safety of the neural networks in these systems before deployment near humans is crucial for avoiding injuries and accidents. However, ensuring that the output of a neural network satisfies user-specified constraints and requirements remains an active area of research. In this short paper, we take preliminary steps towards safety via constrained training by representing constraints as a collision check between the reachable set of a neural network and unsafe sets in its output space.
I-A Related Work
Many different solutions have been proposed for the verification problem, with set-based reachability analysis being the most common for an uncertain set of inputs [liu2019algorithms]. Depending on one’s choice of representation, the predicted output is either exact (e.g. star set [tran2019star, tran2020nnv], ImageStar [tran2020verification]) or an over-approximation (e.g. zonotope [althoff2010reachability]) of the actual output set. Reachability is most commonly computed layer-by-layer, though methods have been proposed that speed up verification by, e.g., using an anytime algorithm to return unsafe cells while enumerating polyhedral cells in the input space [vincent2020reachable], or recursively partitioning the input set via shadow prices [rubies2019fast].
Verification techniques have several drawbacks. First, they do not provide feedback about constraints during training, so one must alternate training and verification until desired properties have been achieved. Furthermore, verification by over-approximation can often be inconclusive, while exact verification can be expensive to compute.
Several alternative approaches have therefore been proposed. For example, [huang2015bidirectional] employs a constrained optimization layer that uses the output of the network as a potential function for optimization while enforcing constraints. Similarly, [stewart2017label, xu2018semantic] add a penalty for constraint violation to the objective loss function. These methods augment their networks with constrained optimization, but are unable to guarantee constraint satisfaction upon convergence of training. Alternatively, [cruz2021safe] uses a systematic process of small changes to conform a "mostly-correct" network to constraints. However, the method only works for networks with a Two-Level Lattice (TLL) architecture, requires an already-trained network, and again does not guarantee a provably safe solution. Finally, [markolf2021polytopic] attempts to learn the optimal cost-to-go for the Hamilton–Jacobi–Bellman (HJB) equation subject to constraints on the output of the neural network controller. However, it does not actually involve any network training and is unable to handle uncertain input sets.
Recently, constrained zonotopes have been introduced as a set-based representation that is closed under linear transformations and can exactly represent any convex polytope [scott2016constrained, raghuraman2020set]. Importantly, these sets are well-suited for reachability analysis due to analytical, efficient methods for computing Minkowski sums, intersections, and collision checks; in particular, collision checking only requires solving a linear program. We leverage these properties to enable our contributions.
We propose a method to compute the output of a neural network with rectified linear unit (ReLU) activations given an input set represented as constrained zonotopes. We then enforce performance by training under a differentiable zonotope intersection constraint, which guarantees safety upon convergence. Our method is demonstrated on a small numerical example, and illustrated in Fig. 1.
We now introduce our notation for neural networks and define constrained zonotopes.
In this work, we consider a fully-connected, ReLU-activated feedforward neural network $N: \mathbb{R}^{n_0} \to \mathbb{R}^{n_d}$, with output $y = N(x)$ given an input $x \in X_0 \subseteq \mathbb{R}^{n_0}$. We call $X_0$ the input set. We denote by $d$ the depth of the network and by $n_i$ the width of the $i$th layer. For each layer $i = 1, \dots, d-1$, the hidden state $x_i$ of the neural network is given by
$$x_i = \sigma(L_i(x_{i-1})), \quad (1)$$
where $x_0 = x$, and $x_i \in \mathbb{R}^{n_i}$, and
$$L_i(x) = W_i x + b_i, \quad (2)$$
$$\sigma(x) = \max(0, x), \quad (3)$$
where $L_i$, with weights $W_i \in \mathbb{R}^{n_i \times n_{i-1}}$ and bias $b_i \in \mathbb{R}^{n_i}$, is a linear layer operation, and $\sigma$ is the ReLU nonlinearity with the max taken elementwise. We do not apply the ReLU activation for the final output layer:
$$N(x) = L_d(x_{d-1}). \quad (4)$$
The reachable set of the neural network is
$$X_d = N(X_0) = \{N(x) \mid x \in X_0\}. \quad (5)$$
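For concreteness, the forward pass just described can be sketched in numpy (the function and variable names are ours, not from the paper; the tiny example network computes $|x|$ for a scalar input):

```python
import numpy as np

def relu(x):
    # The ReLU nonlinearity, with the max taken elementwise.
    return np.maximum(x, 0.0)

def forward(weights, biases, x):
    """Forward pass of a fully-connected ReLU network.
    ReLU is applied after every layer except the final output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(W @ h + b)                   # hidden state of each layer
    return weights[-1] @ h + biases[-1]       # linear output layer, no ReLU

# A 2-layer network computing |x|: hidden state is (relu(x), relu(-x)),
# and the output layer sums the two.
weights = [np.array([[1.0], [-1.0]]), np.array([[1.0, 1.0]])]
biases = [np.zeros(2), np.zeros(1)]
```

The reachable set $X_d$ is then the image of the input set under this map, which we compute set-wise rather than by sampling.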
We represent the reachable set as a union of constrained zonotopes. A constrained zonotope $Z \subset \mathbb{R}^n$ is a set parameterized by a center $c \in \mathbb{R}^n$, generator matrix $G \in \mathbb{R}^{n \times n_g}$, linear constraints $A \in \mathbb{R}^{n_c \times n_g}$, $b \in \mathbb{R}^{n_c}$, and coefficients $\beta \in \mathbb{R}^{n_g}$ as follows:
$$Z = \left\{ c + G\beta \ \middle|\ \|\beta\|_\infty \le 1,\ A\beta = b \right\}. \quad (6)$$
Importantly, the intersection of constrained zonotopes is also a constrained zonotope [scott2016constrained, Proposition 1]. Let $Z_1 = (c_1, G_1, A_1, b_1)$ and $Z_2 = (c_2, G_2, A_2, b_2)$. Then $Z_1 \cap Z_2$ is given by
$$Z_1 \cap Z_2 = \left( c_1,\ [G_1\ \ 0],\ \begin{bmatrix} A_1 & 0 \\ 0 & A_2 \\ G_1 & -G_2 \end{bmatrix},\ \begin{bmatrix} b_1 \\ b_2 \\ c_2 - c_1 \end{bmatrix} \right). \quad (7)$$
We leverage this property to evaluate constraints on the forward reachable set of our neural network.
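As an illustration, the intersection can be implemented directly; a minimal numpy sketch, with each constrained zonotope stored as a `(c, G, A, b)` tuple (our own convention, not the paper's code):

```python
import numpy as np

def conzono_intersect(Z1, Z2):
    """Exact intersection of two constrained zonotopes, each a (c, G, A, b) tuple.
    The result keeps Z1's center, pads Z1's generators with zero columns, stacks
    both constraint systems, and adds coupling rows [G1, -G2] beta = c2 - c1 so
    that both parameterizations describe the same point."""
    (c1, G1, A1, b1), (c2, G2, A2, b2) = Z1, Z2
    n = c1.shape[0]
    ng1, ng2 = G1.shape[1], G2.shape[1]
    c = c1
    G = np.hstack([G1, np.zeros((n, ng2))])
    A = np.vstack([
        np.hstack([A1, np.zeros((A1.shape[0], ng2))]),
        np.hstack([np.zeros((A2.shape[0], ng1)), A2]),
        np.hstack([G1, -G2]),
    ])
    b = np.concatenate([b1, b2, c2 - c1])
    return c, G, A, b

# Two axis-aligned unit boxes: [-1,1]^2, and the same box shifted to center (1,1).
box = lambda center: (np.array(center, float), np.eye(2),
                      np.zeros((0, 2)), np.zeros(0))
c, G, A, b = conzono_intersect(box([0, 0]), box([1, 1]))
```

Note that the intersection never changes the ambient dimension; it only adds generators (coefficients) and equality constraints, which is what makes repeated intersections cheap to represent.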
In this section, we first explain how to pass a constrained zonotope exactly through a ReLU nonlinearity; that is, we compute the reachable set of a ReLU activation given a constrained zonotope as the input. We then discuss how to train a neural network using the reachable set to enforce constraints. Finally, we explain how to compute the gradient of the constraint for backpropagation.
Before proceeding, we briefly mention that we can pass an input constrained zonotope $Z = (c, G, A, b)$ through a linear layer $L_i$ as
$$L_i(Z) = \left( W_i c + b_i,\ W_i G,\ A,\ b \right). \quad (8)$$
This follows from the definition in (6).
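In the same hypothetical `(c, G, A, b)` tuple convention, propagating a constrained zonotope through a linear layer is a one-liner, since the map acts on the center and generators while leaving the coefficient constraints untouched:

```python
import numpy as np

def conzono_linear(W, v, Z):
    """Image of a constrained zonotope under x -> W x + v.
    The constraints (A, b) act on the coefficients beta, not on the
    ambient space, so they pass through unchanged."""
    c, G, A, b = Z
    return W @ c + v, W @ G, A, b

# Rotate the unit box [-1,1]^2 by 90 degrees and shift it.
Z = (np.zeros(2), np.eye(2), np.zeros((0, 2)), np.zeros(0))
W = np.array([[0.0, -1.0], [1.0, 0.0]])
v = np.array([1.0, 2.0])
c, G, A, b = conzono_linear(W, v, Z)
```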
III-A Constrained Zonotope ReLU Activation
The ReLU activation of a constrained zonotope $Z = (c, G, A, b) \subset \mathbb{R}^n$ is a union of constrained zonotopes:
$$\sigma(Z) = \bigcup_{k=1}^{2^n} Z_k, \quad (9)$$
where each output constrained zonotope $Z_k = (c_k, G_k, A_k, b_k)$ is given by:
$$c_k = D_k c, \quad G_k = [D_k G\ \ 0], \quad A_k = \begin{bmatrix} A & 0 \\ H_k G & \tfrac{1}{2}\,\mathrm{diag}(d_k) \end{bmatrix}, \quad b_k = \begin{bmatrix} b \\ -H_k c - \tfrac{1}{2} d_k \end{bmatrix}, \quad (10)$$
with $d_k = -H_k c + |G|\mathbf{1}$, $D_k = \mathrm{diag}(t_k)$, and $H_k = \mathrm{diag}(\mathbf{1} - 2 t_k)$, where $|G|$ is a matrix containing the elementwise absolute value of $G$ and $t_k$ is the $k$th combination of the $2^n$ possible $n$-tuples defined over the set $\{0, 1\}$.
The formulation in (10) follows from treating the $\max$ operation applied to all negative elements of the input zonotope as a sequence of two operations. First, for each dimension $j$, we intersect the input constrained zonotope with the halfspace defined by the unit vector $e_j$ in the codomain of $\sigma$; this is why a linear operator is applied to each $c$ and $G$, as given by the analytical intersection of a constrained zonotope with a halfspace [raghuraman2020set, Eq. 10]. Second, we zero out the dimension corresponding to that halfspace/unit vector (i.e., we project all negative points to zero). Since the max is taken elementwise, there are $2^n$ possible intersection/zeroing combinations when considering each dimension as either activated or not. ∎
Per Proposition 1, passing a constrained zonotope through a ReLU nonlinearity produces a union of up to $2^n$ constrained zonotopes. A similar phenomenon is found in ReLU activations of other set representations [tran2020verification], with exponential growth in the computational time and memory required as a function of layer width and number of layers. To mitigate this growth, empty constrained zonotopes can be pruned after each activation, hence our next discussion.
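The two-step intersect-then-project construction in Proposition 1 can be sketched as follows. This is our own illustrative numpy implementation (the pattern matrices and the slack-generator encoding are our notational choices, not necessarily the paper's); activation patterns that place the zonotope's interval hull strictly outside a required halfspace are skipped as trivially empty:

```python
import itertools
import numpy as np

def conzono_relu(Z):
    """Split ReLU(Z) into (at most) 2^n constrained zonotopes.
    For each activation pattern t in {0,1}^n we (1) intersect Z with the
    halfspaces {x_j >= 0 if t_j = 1, else x_j <= 0}, adding one slack
    generator per dimension, and (2) zero out the inactive dimensions."""
    c, G, A, b = Z
    n = G.shape[0]
    row_abs = np.abs(G).sum(axis=1)      # row sums of |G|
    pieces = []
    for t in itertools.product([0, 1], repeat=n):
        t = np.asarray(t, float)
        H = np.diag(1.0 - 2.0 * t)       # halfspace normals: -e_j if active, +e_j if not
        d = -H @ c + row_abs             # slack scaling for each halfspace
        if np.any(d < 0):                # interval hull violates a halfspace: empty piece
            continue
        D = np.diag(t)                   # projection zeroing the inactive dimensions
        ck = D @ c
        Gk = np.hstack([D @ G, np.zeros((n, n))])
        Ak = np.vstack([
            np.hstack([A, np.zeros((A.shape[0], n))]),
            np.hstack([H @ G, np.diag(d) / 2.0]),
        ])
        bk = np.concatenate([b, -H @ c - d / 2.0])
        pieces.append((ck, Gk, Ak, bk))
    return pieces

# 1-D example: Z is the interval [-1, 2].
Z = (np.array([0.5]), np.array([[1.5]]), np.zeros((0, 1)), np.zeros(0))
pieces = conzono_relu(Z)
```

Here `pieces` has two elements: one is the clipped negative part (all points mapped to the origin) and the other covers $[0, 2]$. Remaining empty pieces would be detected and pruned with the emptiness check described next.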
III-B Constrained Zonotope Emptiness Check
To check if a constrained zonotope $Z = (c, G, A, b)$ is empty, we solve a linear program (LP) [scott2016constrained, Proposition 2]:
$$\hat{v} = \min_{\beta, v}\ v \quad \text{s.t.}\quad A\beta = b,\ \|\beta\|_\infty \le v. \quad \text{(III-B)}$$
Then, $Z$ is empty if and only if $\hat{v} > 1$. Importantly, by construction, as long as there exists a feasible $\beta$ (for which $A\beta = b$), (III-B) is always feasible. Since the intersection of constrained zonotopes is also a constrained zonotope as in (7), we can use this emptiness check to enforce collision-avoidance (i.e., non-intersection) constraints. This is the basis of our constrained training method.
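One way to set up this LP with `scipy.optimize.linprog` (our own encoding, not the paper's code): stack the decision vector as $z = (\beta, v)$ and expand $\|\beta\|_\infty \le v$ into $2 n_g$ one-sided inequalities.

```python
import numpy as np
from scipy.optimize import linprog

def conzono_emptiness_lp(A, b):
    """Solve  min v  s.t.  A beta = b,  |beta_i| <= v  for all i.
    Returns v_hat; the constrained zonotope is empty iff v_hat > 1.
    Decision vector z = [beta; v]."""
    ng = A.shape[1]
    cost = np.zeros(ng + 1)
    cost[-1] = 1.0
    # beta_i - v <= 0  and  -beta_i - v <= 0:
    A_ub = np.vstack([
        np.hstack([np.eye(ng), -np.ones((ng, 1))]),
        np.hstack([-np.eye(ng), -np.ones((ng, 1))]),
    ])
    b_ub = np.zeros(2 * ng)
    A_eq = np.hstack([A, np.zeros((A.shape[0], 1))])
    # beta is free; v >= 0 (linprog's default bounds are nonnegative, so be explicit).
    bounds = [(None, None)] * ng + [(0, None)]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b, bounds=bounds)
    return res.fun

# Intersection of [-1,1]^2 with a disjoint unit box centered at (3,3): the
# coupling constraints from Eq. (7) read beta_1 - beta_3 = 3, beta_2 - beta_4 = 3.
A_disjoint = np.array([[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]])
v_hat = conzono_emptiness_lp(A_disjoint, np.array([3.0, 3.0]))
```

For the disjoint boxes, $\hat{v} = 1.5 > 1$ certifies emptiness; recentering the second box at $(1, 1)$ gives $\hat{v} = 0.5 \le 1$, i.e., a nonempty intersection.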
III-C Constrained Neural Network Training
The main goal of this paper is constrained neural network training. For robotics in particular, our goal in future work is to train a robust controller. In this work, we consider an unsafe output set $U$, which could represent, e.g., actuator limits or obstacles in a robot’s workspace (in which case the output of the neural network is passed through the robot’s dynamics).
III-C1 Generic Formulation
Consider an input set $X_0$ represented by a constrained zonotope, an unsafe set $U$, a training dataset $\{(x_j, y_j)\}_{j=1}^{m}$ of $m$ training examples $x_j$ and labels $y_j$, and an objective loss function $\ell$. Let $\theta$ denote the collection of all of the neural network weights $W_i$ and all the biases $b_i$. We formulate the training problem as:
$$\min_{\theta}\ \ell\!\left(\theta;\ \{(x_j, y_j)\}_{j=1}^{m}\right) \quad \text{s.t.}\quad X_d \cap U = \emptyset, \quad (12)$$
where $X_d$ is the reachable set as in (5). We write the loss as a function of all of the input/output data (as opposed to batching the data) for ease of presentation.
III-C2 Set and Constraint Representations
We represent the input set and unsafe set as constrained zonotopes, $X_0 = (c_0, G_0, A_0, b_0)$ and $U = (c_U, G_U, A_U, b_U)$. Similarly, it follows from Proposition 1 that the output set can be exactly represented as a union of constrained zonotopes:
$$X_d = \bigcup_{j=1}^{n_Z} Z_j, \quad (13)$$
where $n_Z$ depends on the layer widths and network depth.
Recall that each $Z_j \cap U$ is a constrained zonotope as in (7). So, to compute the constraint loss, we evaluate the constraint in (12) by solving (III-B) for each constrained zonotope $Z_j$ with $j = 1, \dots, n_Z$. Then, denoting by $\hat{v}_j$ the output of (III-B) for each $Z_j \cap U$, we represent the constraint as a function $g$ for which
$$g_j(\theta) = 1 - \hat{v}_j, \quad (14)$$
which is negative when the constraint is satisfied, as is standard in constrained optimization [nocedal2006numerical]. Using (14), we ensure the neural network obeys the constraints by checking $g_j(\theta) < 0$ for each $j = 1, \dots, n_Z$.
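Combining the intersection formula (7) with the emptiness LP gives the constraint value in a few lines; a sketch with our own helper names (the boxes below are illustrative, not the paper's experiment):

```python
import numpy as np
from scipy.optimize import linprog

def emptiness_lp(A, b):
    """min v s.t. A beta = b, |beta_i| <= v; the set is empty iff the optimum > 1."""
    ng = A.shape[1]
    cost = np.zeros(ng + 1); cost[-1] = 1.0
    A_ub = np.vstack([np.hstack([np.eye(ng), -np.ones((ng, 1))]),
                      np.hstack([-np.eye(ng), -np.ones((ng, 1))])])
    A_eq = np.hstack([A, np.zeros((A.shape[0], 1))])
    bounds = [(None, None)] * ng + [(0, None)]   # beta free, v >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=np.zeros(2 * ng),
                  A_eq=A_eq, b_eq=b, bounds=bounds)
    return res.fun

def constraint_value(Zj, U):
    """g = 1 - v_hat for the intersection of Zj and U; negative means safe."""
    (c1, G1, A1, b1), (c2, G2, A2, b2) = Zj, U
    ng1, ng2 = G1.shape[1], G2.shape[1]
    # Constraint system of Zj ∩ U, per the intersection formula:
    A = np.vstack([np.hstack([A1, np.zeros((A1.shape[0], ng2))]),
                   np.hstack([np.zeros((A2.shape[0], ng1)), A2]),
                   np.hstack([G1, -G2])])
    b = np.concatenate([b1, b2, c2 - c1])
    return 1.0 - emptiness_lp(A, b)

box = lambda center: (np.array(center, float), np.eye(2),
                      np.zeros((0, 2)), np.zeros(0))
g_safe = constraint_value(box([0, 0]), box([3, 3]))    # disjoint boxes
g_unsafe = constraint_value(box([0, 0]), box([1, 1]))  # overlapping boxes
```

During training, these values play the role of inequality constraints: a positive $g_j$ flags a reachable-set piece that collides with the unsafe set and must be pushed out.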
III-D Differentiating the Collision Check Loss
To train using backpropagation, we must differentiate the constraint loss $g$. This means we must compute the gradient of (III-B) with respect to the problem parameters $A$ and $b$, which are defined by the centers, generators, and constraints of the output constrained zonotopes. To do so, we leverage techniques from [amos2017optnet], which can be applied because (III-B) is always feasible.
Consider the Lagrangian of (III-B). Writing $z = (\beta, v)$, the LP takes the form $\min_z \tilde{c}^\top z$ subject to $\tilde{G} z \le 0$ (encoding $\|\beta\|_\infty \le v$) and $\tilde{A} z = b$ (encoding $A\beta = b$), with Lagrangian
$$\mathcal{L}(z, \lambda, \nu) = \tilde{c}^\top z + \lambda^\top \tilde{G} z + \nu^\top (\tilde{A} z - b), \quad (15)$$
where $\lambda$ is the dual variable for the inequality constraint and $\nu$ is the dual variable for the equality constraint. For any optimizer $(z^*, \lambda^*, \nu^*)$, the optimality conditions are
$$\tilde{c} + \tilde{G}^\top \lambda^* + \tilde{A}^\top \nu^* = 0, \quad \mathrm{diag}(\lambda^*)\, \tilde{G} z^* = 0, \quad \tilde{A} z^* - b = 0, \quad (16)$$
where we have used the fact that the objective and constraints are linear in $z$. Taking the differential (denoted by $\mathrm{d}$) of (16), we get
$$\begin{bmatrix} 0 & \tilde{G}^\top & \tilde{A}^\top \\ \mathrm{diag}(\lambda^*)\, \tilde{G} & \mathrm{diag}(\tilde{G} z^*) & 0 \\ \tilde{A} & 0 & 0 \end{bmatrix} \begin{bmatrix} \mathrm{d}z \\ \mathrm{d}\lambda \\ \mathrm{d}\nu \end{bmatrix} = - \begin{bmatrix} \mathrm{d}\tilde{A}^\top \nu^* \\ 0 \\ \mathrm{d}\tilde{A}\, z^* - \mathrm{d}b \end{bmatrix}. \quad (17)$$
We can then solve (17) for the Jacobian of $\hat{v}$ with respect to any entry of the zonotope centers, generators, or constraints by setting the right-hand side appropriately (see [amos2017optnet] for details). That is, we can now differentiate (14) with respect to the elements of $c$, $G$, $A$, or $b$. In practice, we differentiate (III-B) automatically using the cvxpylayers library [agrawal2019differentiable].
IV Numerical Example
We test our method by training a 2-layer feedforward ReLU network with input dimension 2, hidden layer size 10, and output dimension 2. We chose this network with only one ReLU nonlinearity layer because recent results have shown that a shallow ReLU network performs similarly to a deep ReLU network with the same number of neurons [hanin2019deep]. However, note that our method (in particular Proposition 1) does generalize to deeper networks. We pose this preliminary example as a first effort towards this novel style of training.
Problem Setup. We seek to approximate a nonlinear function $f: \mathbb{R}^2 \to \mathbb{R}^2$ over a constrained-zonotope input set $X_0$. We create an unsafe set $U$ in the output space, also represented as a constrained zonotope; for concision, we write $x$ in place of the training data $x_j$ when defining $f$ and $U$.
Implementation. We implemented our method in PyTorch [paszke2019pytorch] with optim.SGD as our optimizer, on a desktop computer with 6 cores, 32 GB RAM, and an RTX 2060 GPU. Our code is available online: https://github.com/Stanford-NavLab/constrained-nn-training
We trained the network for a fixed number of iterations with and without the constraints enforced. To enforce hard constraints as in (12), in each iteration we compute the objective loss function across the entire dataset and backpropagate the objective gradient; we then compute the constraint loss as in (14) for all active constraints and backpropagate again. In future work we will apply more sophisticated constrained optimization techniques (e.g., an active set method) [nocedal2006numerical, Ch. 15].
With our current naïve implementation, constrained training took approximately 5 hours, whereas the unconstrained training took 0.5 s. Our method is slower due to the need to compute an exponentially-growing number of constrained zonotopes as in Proposition 1. However, we notice that the GPU utilization is only 1-5% (the reachability propagation is not fully parallelized), indicating significant room for increased parallelization and speed.
Results and Discussion. Results for unconstrained and constrained training are shown in Fig. 3 and Fig. 4. Our proposed method avoids the unsafe set. Note the output constrained zonotopes (computed for both networks) contain the colored output points, verifying our exact set representation in Proposition 1.
Table I shows results for unconstrained and constrained training; importantly, our method obeys the constraints. As expected for nonlinear constrained optimization, the network converged to a local minimum while obeying the constraints. The key challenge is that constrained training is several orders of magnitude slower than unconstrained training. We plan to address this in future work through increased parallelization, pruning of our reachable sets [tran2020nnv], and anytime verification techniques [vincent2020reachable].
| | unconstrained | constrained |
| final objective loss | 0.0039 | 0.0127 |
| final constraint loss | 0.0575 | 0.0000 |
V Conclusion and Future Work
This work proposes a constrained training method for feedforward ReLU neural networks. We demonstrated the method successfully on a small example of nonlinear function approximation. Given the ability to enforce output constraints, the technique can potentially be applied to offline training for safety-critical neural networks.
Our current implementation has several drawbacks to be addressed in future work. First, the method suffers from an exponential blowup in the number of constrained zonotopes through each ReLU layer. We hope to improve the forward pass by using techniques such as [vincent2020reachable] instead of layer-by-layer evaluation to compute the output set, and by conservatively approximating the reachable set similar to [rubies2019fast]. We also plan to apply the method to larger networks, such as for autonomous driving in [tran2020nnv] or the ACAS Xu network [kochenderfer2011robust, kochenderfer2012next, kochenderfer2015optimized] for aircraft collision avoidance. In general, our goal is to train robust controllers where the output of a neural network must obey actuator limits and obstacle avoidance (for which the network output is passed through dynamics).