Defending Adversarial Attacks without Adversarial Attacks in Deep Reinforcement Learning

08/14/2020
by   Xinghua Qu, et al.

Many recent studies in deep reinforcement learning (DRL) propose to boost adversarial robustness through policy distillation combined with adversarial training, in which additional adversarial examples are injected into the training of the student policy; this makes the robustness improvement less flexible and more computationally expensive. In contrast, we propose an efficient policy distillation paradigm, called robust policy distillation, that achieves an adversarially robust student policy without relying on any adversarial examples during student training. To this end, we devise a new policy distillation loss consisting of two terms: 1) a prescription gap maximization loss that simultaneously maximizes the likelihood of the action selected by the teacher policy and the entropy over the remaining actions; and 2) a Jacobian regularization loss that minimizes the magnitude of the Jacobian with respect to the input state. Our theoretical analysis proves that the distillation loss is guaranteed to increase the prescription gap and, in turn, the adversarial robustness. Experiments on five Atari games verify the superiority of our policy distillation in boosting adversarial robustness over state-of-the-art methods.
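The two-term loss described in the abstract can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: the function names, the finite-difference estimate of the Jacobian, and the weighting `lam` are all assumptions made here for clarity.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def prescription_gap_loss(logits, teacher_action):
    """Term 1: maximize likelihood of the teacher's action AND the
    entropy over the remaining actions (both appear negated, since
    the overall loss is minimized)."""
    p = softmax(logits)
    likelihood_term = -np.log(p[teacher_action])
    rest = np.delete(p, teacher_action)
    rest = rest / rest.sum()                     # renormalize remaining actions
    entropy_term = (rest * np.log(rest)).sum()   # = -entropy(rest)
    return likelihood_term + entropy_term

def jacobian_penalty(policy, state, eps=1e-4):
    """Term 2: squared Frobenius norm of d(logits)/d(state), here
    approximated by forward finite differences for illustration."""
    base = policy(state)
    cols = [(policy(state + eps * np.eye(state.size)[i]) - base) / eps
            for i in range(state.size)]
    jac = np.stack(cols, axis=1)                 # (num_actions, state_dim)
    return np.sum(jac ** 2)

def distillation_loss(policy, state, teacher_action, lam=0.1):
    """Combined robust-distillation loss; lam is a hypothetical weight."""
    logits = policy(state)
    return (prescription_gap_loss(logits, teacher_action)
            + lam * jacobian_penalty(policy, state))
```

In practice the Jacobian term would be computed with automatic differentiation rather than finite differences; the sketch only shows how the two terms combine into a single objective that needs no adversarial examples.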


research
12/29/2019

Real-time Policy Distillation in Deep Reinforcement Learning

Policy distillation in deep reinforcement learning provides an effective...
research
06/28/2023

Mitigating the Accuracy-Robustness Trade-off via Multi-Teacher Adversarial Distillation

Adversarial training is a practical approach for improving the robustnes...
research
11/03/2019

Online Robustness Training for Deep Reinforcement Learning

In deep reinforcement learning (RL), adversarial attacks can trick an ag...
research
02/06/2019

Distilling Policy Distillation

The transfer of knowledge from one policy to another is an important too...
research
12/14/2020

Achieving Adversarial Robustness Requires An Active Teacher

A new understanding of adversarial examples and adversarial robustness i...
research
05/18/2022

Policy Distillation with Selective Input Gradient Regularization for Efficient Interpretability

Although deep Reinforcement Learning (RL) has proven successful in a wid...
research
10/25/2022

Accelerating Certified Robustness Training via Knowledge Transfer

Training deep neural network classifiers that are certifiably robust aga...
