G-Mix: A Generalized Mixup Learning Framework Towards Flat Minima

08/07/2023
by Xingyu Li, et al.

Deep neural networks (DNNs) have demonstrated promising results on a variety of complex tasks. However, DNNs are heavily over-parameterized, which hurts generalization when training data are limited. The Mixup technique has gained popularity as a way to improve the generalization of DNNs, yet it still yields suboptimal results. Inspired by the successful Sharpness-Aware Minimization (SAM) approach, which connects the sharpness of the training loss landscape to model generalization, we propose a new learning framework called Generalized-Mixup (G-Mix) that combines the strengths of Mixup and SAM for training DNN models. We provide a theoretical analysis showing how the G-Mix framework improves generalization. To further optimize DNN performance within the G-Mix framework, we introduce two novel algorithms, Binary G-Mix (BG-Mix) and Decomposed G-Mix (DG-Mix), which partition the training data into two subsets according to the sharpness-sensitivity of each example, thereby addressing the "manifold intrusion" problem of Mixup. Both theoretical analysis and experimental results show that BG-Mix and DG-Mix further improve model generalization across multiple datasets and models, achieving state-of-the-art performance.
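To make the combination concrete, below is a minimal PyTorch sketch of one training step in the spirit of G-Mix: a Mixup-interpolated batch optimized with a SAM-style two-step update. The function name g_mix_step and the hyperparameters alpha (Mixup Beta parameter) and rho (SAM perturbation radius) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of one G-Mix-style training step, not the authors' code:
# Mixup interpolation (Zhang et al., 2018) followed by a SAM two-step update
# (Foret et al., 2021). Hyperparameters alpha and rho are illustrative.
import torch
import torch.nn.functional as F

def g_mix_step(model, optimizer, x, y, alpha=0.2, rho=0.05):
    optimizer.zero_grad()

    # --- Mixup: convex combination of examples and labels ---
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1 - lam) * x[idx]

    def mixed_loss():
        logits = model(x_mix)
        return lam * F.cross_entropy(logits, y) + \
               (1 - lam) * F.cross_entropy(logits, y[idx])

    # --- SAM ascent: move weights to a nearby high-loss point ---
    mixed_loss().backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(2) for g in grads]), 2)
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)              # perturb toward higher loss
            eps.append(e)
    optimizer.zero_grad()

    # --- SAM descent: gradient at the perturbed point, applied after
    # --- restoring the original weights ---
    loss = mixed_loss()
    loss.backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)          # undo the perturbation
    optimizer.step()
    return loss.item()
```

The abstract states that BG-Mix and DG-Mix split the training set into two subsets by each example's sharpness-sensitivity, but does not spell out the criterion. One plausible proxy, sketched below, scores each example by how much its loss rises at the SAM-perturbed weights and splits at the median; this is a guess for illustration, not the paper's rule.

```python
# Hypothetical sharpness-sensitivity score: per-example loss increase at the
# SAM-perturbed weights. The abstract does not specify the exact criterion;
# the median split here is an illustrative guess.
@torch.no_grad()
def split_by_sensitivity(model, x, y, eps):
    # eps: per-parameter perturbations computed as in the SAM ascent step above
    base = F.cross_entropy(model(x), y, reduction="none")
    for p, e in zip(model.parameters(), eps):
        if e is not None:
            p.add_(e)
    perturbed = F.cross_entropy(model(x), y, reduction="none")
    for p, e in zip(model.parameters(), eps):
        if e is not None:
            p.sub_(e)
    scores = perturbed - base
    return scores > scores.median()  # boolean mask: True = sharpness-sensitive
```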


