MixUp as Locally Linear Out-Of-Manifold Regularization
MixUp, a data augmentation approach that mixes random samples, has been shown to significantly improve the predictive accuracy of state-of-the-art deep neural networks. The power of MixUp, however, has mostly been established empirically, and its working and effectiveness have not been explained in any depth. In this paper, we develop a theoretical understanding of MixUp as a form of out-of-manifold regularization, which constrains the model on the input space beyond the data manifold. This analytical study also enables us to identify a limitation of MixUp caused by manifold intrusion, where synthetic samples collide with real examples on the data manifold. Such intrusion gives rise to over-regularization and thereby under-fitting. To address this issue, we further propose a novel regularizer, in which mixing policies are adaptively learned from the data and a manifold intrusion loss is introduced to avoid collision with the data manifold. We empirically show, using several benchmark datasets, that our regularizer avoids over-regularization and improves accuracy over state-of-the-art deep classification models and MixUp.
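For readers unfamiliar with the mixing step, the following is a minimal sketch of standard MixUp on a single pair of samples, not of the adaptive regularizer proposed in this paper. It assumes one-hot label vectors and a mixing coefficient drawn from a Beta(alpha, alpha) distribution, as in the usual MixUp formulation; the function name `mixup_pair` and the toy inputs are hypothetical and chosen only for illustration.

```python
import random

def mixup_pair(x_i, y_i, x_j, y_j, alpha=1.0, rng=random):
    """Mix two (input, one-hot label) pairs with a Beta-distributed weight.

    Hypothetical helper for illustration; alpha is an assumed hyperparameter.
    """
    # Mixing coefficient lam in [0, 1], drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs, element by element.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    # The same convex combination applied to the one-hot labels.
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam

# Toy example: two 3-dimensional inputs from two different classes.
rng = random.Random(0)
x_mix, y_mix, lam = mixup_pair(
    [1.0, 0.0, 2.0], [1.0, 0.0],
    [3.0, 4.0, 0.0], [0.0, 1.0],
    alpha=1.0, rng=rng,
)
```

The mixed input lies on the line segment between the two originals, which generally falls off the data manifold. Manifold intrusion, in the terms used above, is the case where such a synthetic point collides with a real example.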