Verification of Recurrent Neural Networks Through Rule Extraction

by   Qinglong Wang, et al.

The verification problem for neural networks is verifying whether a neural network will suffer from adversarial samples, or approximating the maximal allowed scale of adversarial perturbation that can be endured. While most prior work contributes to verifying feed-forward networks, little has been explored for verifying recurrent networks. This is due to the existence of a more rigorous constraint on the perturbation space for sequential data, and the lack of a proper metric for measuring the perturbation. In this work, we address these challenges by proposing a metric which measures the distance between strings, and use deterministic finite automata (DFA) to represent a rigorous oracle which examines if the generated adversarial samples violate certain constraints on a perturbation. More specifically, we empirically show that certain recurrent networks allow relatively stable DFA extraction. As such, DFAs extracted from these recurrent networks can serve as a surrogate oracle for when the ground truth DFA is unknown. We apply our verification mechanism to several widely used recurrent networks on a set of the Tomita grammars. The results demonstrate that only a few models remain robust against adversarial samples. In addition, we show that for grammars with different levels of complexity, there is also a difference in the difficulty of robust learning of these grammars.


Verifying Recurrent Neural Networks using Invariant Inference

Deep neural networks are revolutionizing the way complex systems are dev...

Property-Directed Verification of Recurrent Neural Networks

This paper presents a property-directed approach to verifying recurrent ...

Crafting Adversarial Input Sequences for Recurrent Neural Networks

Machine learning models are frequently used to solve complex security pr...

On The Vulnerability of Recurrent Neural Networks to Membership Inference Attacks

We study the privacy implications of deploying recurrent neural networks...

Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata

We propose an approach that connects recurrent networks with different o...

Universal Perturbation Attack on Differentiable No-Reference Image- and Video-Quality Metrics

Universal adversarial perturbation attacks are widely used to analyze im...

Please sign up or login with your details

Forgot password? Click here to reset