On Certifying Robust Models by Polyhedral Envelope
Certifying neural networks makes it possible to offer guarantees on a model's robustness. In this work, we use linear approximation to obtain upper and lower bounds on the model's output when the input is perturbed within a predefined adversarial budget. This allows us to bound the adversary-free region in the neighborhood of the input by a polyhedral envelope and to calculate robustness guarantees from this geometric approximation. Compared with existing methods, our approach gives a finer-grained quantitative evaluation of a model's robustness. As a result, our certification method not only obtains better certified bounds than state-of-the-art techniques under the same adversarial budget, but also yields a faster search scheme for the optimal adversarial budget. Furthermore, we introduce a simple regularization scheme based on our method that enables us to effectively train robust models.
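To illustrate the general idea of certification by bound propagation, the sketch below uses a simple interval-style linear relaxation to lower-bound a classifier's margin under an l_inf adversarial budget. This is a toy illustration of the principle, not the paper's polyhedral-envelope method, whose linear approximations are tighter; all weights and function names here are hypothetical.

```python
import numpy as np

def affine_bounds(W, b, lo, hi):
    """Propagate elementwise bounds [lo, hi] through x -> W @ x + b."""
    center = (lo + hi) / 2.0
    radius = (hi - lo) / 2.0
    c = W @ center + b
    r = np.abs(W) @ radius  # worst-case spread of the affine map
    return c - r, c + r

def relu_bounds(lo, hi):
    """ReLU is monotone, so it maps bounds elementwise."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def certify_margin(W1, b1, W2, b2, x, eps, true_label):
    """Lower-bound the margin (true logit minus largest other logit)
    over all perturbations with ||delta||_inf <= eps.
    A positive value certifies the input against this budget."""
    lo, hi = x - eps, x + eps
    lo, hi = affine_bounds(W1, b1, lo, hi)
    lo, hi = relu_bounds(lo, hi)
    lo, hi = affine_bounds(W2, b2, lo, hi)
    others = np.delete(hi, true_label)
    return lo[true_label] - others.max()

# Tiny two-layer example (hypothetical weights).
W1 = np.array([[1.0, -1.0], [0.5, 2.0]])
b1 = np.zeros(2)
W2 = np.eye(2)
b2 = np.zeros(2)
x = np.array([1.0, 0.5])
margin = certify_margin(W1, b1, W2, b2, x, eps=0.1, true_label=1)
print(margin)  # positive => certified robust at this budget
```

A finer-grained certificate like the one described in the abstract replaces the loose interval relaxation with linear upper and lower bounding functions of the input, which shrink the over-approximation and enlarge the certifiable budget.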