Hold me tight! Influence of discriminative features on deep network boundaries

02/15/2020 ∙ by Guillermo Ortiz-Jiménez, et al. ∙ 4

Important insights towards the explainability of neural networks and their properties reside in the formation of their decision boundaries. In this work, we borrow tools from the field of adversarial robustness and propose a new framework that permits to relate the features of the dataset with the distance of data samples to the decision boundary along specific directions. We demonstrate that the inductive bias of deep learning has the tendency to generate classification functions that are invariant along non-discriminative directions of the dataset. More surprisingly, we further show that training on small perturbations of the data samples are sufficient to completely change the decision boundary. This is actually the characteristic exploited by the so-called adversarial training to produce robust classifiers. Our general framework can be used to reveal the effect of specific dataset features on the macroscopic properties of deep models and to develop a better understanding of the successes and limitations of deep learning.



There are no comments yet.


page 9

page 10

page 14

page 22

page 23

page 24

page 27

page 30

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.