Explainable Learning: Implicit Generative Modelling during Training for Adversarial Robustness

07/05/2018

We introduce Explainable Learning (ExL), an approach for training neural networks that are intrinsically robust to adversarial attacks. We find that implicit generative modelling of random noise during posterior maximization improves a model's understanding of the data manifold, furthering adversarial robustness. We prove our approach's efficacy and provide a simple visualization tool, based on Principal Component Analysis, for understanding adversarial data. Our analysis reveals that adversarial robustness, in general, manifests in models with higher variance along the high-ranked principal components. We show that models learnt with ExL perform remarkably well against a wide range of black-box attacks.
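The PCA-based analysis mentioned above can be illustrated with a minimal sketch that measures how much variance falls along the leading principal components of a feature matrix. This is not the authors' tool: the function name `explained_variance_ratio`, the synthetic `activations` array, and the choice of three leading components are assumptions made purely for illustration.

```python
import numpy as np

def explained_variance_ratio(features):
    """Return the fraction of total variance along each principal component,
    ordered from the highest-ranked component to the lowest."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Singular values of the centered data give the PCA spectrum (descending).
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2 / (features.shape[0] - 1)
    return variances / variances.sum()

# Hypothetical stand-in for a layer's activations: 500 samples, 20 dimensions,
# with per-dimension scales that decay so a few directions dominate.
rng = np.random.default_rng(0)
scales = np.linspace(5.0, 0.5, 20)
activations = rng.normal(size=(500, 20)) * scales

ratios = explained_variance_ratio(activations)
top3 = ratios[:3].sum()  # share of variance in the three highest-ranked components
print(f"variance in top 3 components: {top3:.2f}")
```

Under these assumptions, comparing `top3` between two models' activations on the same inputs would be one rough way to check whether one model concentrates more variance along its high-ranked principal components.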
