Adversarial Attacks on Neural Network Policies

02/08/2017
by Sandy Huang et al.

Machine learning classifiers are known to be vulnerable to inputs maliciously constructed by adversaries to force misclassification. Such adversarial examples have been extensively studied in the context of computer vision applications. In this work, we show adversarial attacks are also effective when targeting neural network policies in reinforcement learning. Specifically, we show existing adversarial example crafting techniques can be used to significantly degrade test-time performance of trained policies. Our threat model considers adversaries capable of introducing small perturbations to the raw input of the policy. We characterize the degree of vulnerability across tasks and training algorithms, for a subclass of adversarial-example attacks in white-box and black-box settings. Regardless of the learned task or training algorithm, we observe a significant drop in performance, even with small adversarial perturbations that do not interfere with human perception. Videos are available at http://rll.berkeley.edu/adversarial.
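To make the white-box setting concrete, below is a minimal sketch of how such an attack can be crafted with the fast gradient sign method (FGSM), one of the existing adversarial-example techniques the abstract refers to. It assumes a PyTorch policy network that maps raw observations to action logits; the `policy_net` interface, the epsilon value, and the [0, 1] observation range are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb_observation(policy_net, obs, epsilon=0.01):
    """White-box FGSM-style perturbation of a policy's raw input.

    Takes one signed-gradient step that increases the policy's loss
    with respect to the action it would otherwise prefer, bounded by
    epsilon in the L-infinity norm.
    """
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy_net(obs)                      # (batch, num_actions)
    # Treat the policy's own preferred action as the label to attack.
    target_action = logits.argmax(dim=-1)
    loss = F.cross_entropy(logits, target_action)
    loss.backward()
    # Move the observation in the direction that maximizes the loss.
    adv_obs = obs + epsilon * obs.grad.sign()
    # Keep the result a valid observation (assumes inputs in [0, 1]).
    return adv_obs.clamp(0.0, 1.0).detach()
```

At test time, the attacker simply substitutes `fgsm_perturb_observation(policy, obs)` for `obs` before each action is selected; because the perturbation is bounded per pixel by epsilon, it can remain imperceptible to a human observer while still degrading the policy's return.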

