ARDIR: Improving Robustness using Knowledge Distillation of Internal Representation

11/01/2022
by Tomokatsu Takahashi, et al.

Adversarial training is the most promising method for learning models that are robust to adversarial examples. A recent study has shown that knowledge distillation between models of the same architecture is effective in improving the performance of adversarial training. Exploiting knowledge distillation is a new approach to improving adversarial training and has attracted much attention; however, its performance is still insufficient. We therefore propose Adversarial Robust Distillation with Internal Representation (ARDIR), which utilizes knowledge distillation even more effectively. In addition to the teacher model's output, ARDIR uses the teacher model's internal representation as a label for adversarial training. This allows the student model to be trained with richer, more informative labels, and as a result ARDIR learns more robust student models. Our experiments show that ARDIR outperforms previous methods.
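The sketch below illustrates the idea described in the abstract: the student is trained on adversarial examples using both the teacher's output (soft labels) and an internal representation of the teacher as additional targets. It is a minimal, hypothetical PyTorch-style example, not the authors' published implementation; the `pgd_attack` and `ardir_loss` functions, the choice of a PGD attack, the use of the penultimate-layer features, the MSE feature loss, the temperature `temp`, and the weights `kd_weight` and `feat_weight` are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft adversarial training examples with a standard PGD attack (an assumed choice)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits, _ = model(x_adv)
        grad = torch.autograd.grad(F.cross_entropy(logits, y), x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1).detach()
    return x_adv


def ardir_loss(student, teacher, x, y, temp=4.0, kd_weight=0.9, feat_weight=1.0):
    """Distillation loss on adversarial inputs: soft labels plus internal representation.

    Both `student(x)` and `teacher(x)` are assumed to return (logits, features),
    where `features` is an internal representation such as the penultimate layer.
    """
    x_adv = pgd_attack(student, x, y)

    with torch.no_grad():
        t_logits, t_feat = teacher(x)      # teacher targets computed on clean input (assumption)
    s_logits, s_feat = student(x_adv)      # student predictions on adversarial input

    # Soft-label term, as in adversarially robust distillation.
    kd = F.kl_div(
        F.log_softmax(s_logits / temp, dim=1),
        F.softmax(t_logits / temp, dim=1),
        reduction="batchmean",
    ) * (temp ** 2)

    # Ordinary cross-entropy against the hard labels.
    ce = F.cross_entropy(s_logits, y)

    # Internal-representation term: the teacher's features act as extra labels.
    feat = F.mse_loss(s_feat, t_feat)

    return kd_weight * kd + (1 - kd_weight) * ce + feat_weight * feat
```

In this sketch the feature-matching term is what distinguishes the approach from output-only distillation: the student is pulled toward the teacher both in its predictions and in its intermediate representation of the adversarial input.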


Related research

05/23/2019: Adversarially Robust Distillation
Knowledge distillation is effective for producing small high-performance...

03/10/2022: Improving Neural ODEs via Knowledge Distillation
Neural Ordinary Differential Equations (Neural ODEs) construct the conti...

05/21/2022: Mapping Emulation for Knowledge Distillation
This paper formalizes the source-blind knowledge distillation problem th...

06/21/2018: Gradient Adversarial Training of Neural Networks
We propose gradient adversarial training, an auxiliary deep learning fra...

08/19/2022: DAFT: Distilling Adversarially Fine-tuned Models for Better OOD Generalization
We consider the problem of OOD generalization, where the goal is to trai...

10/16/2019: A Generalized and Robust Method Towards Practical Gaze Estimation on Smart Phone
Gaze estimation for ordinary smart phone, e.g. estimating where the user...

03/14/2022: On the benefits of knowledge distillation for adversarial robustness
Knowledge distillation is normally used to compress a big network, or te...
