Improved Distribution Matching for Dataset Condensation

07/19/2023
by Ganlong Zhao, et al.

Dataset condensation aims to condense a large dataset into a smaller one that can still train a well-performing model, thus reducing the storage cost and training effort in deep learning applications. However, conventional dataset condensation methods are optimization-oriented: they condense the dataset by performing gradient or parameter matching during model optimization, which is computationally intensive even on small datasets and models. In this paper, we propose a novel dataset condensation method based on distribution matching, which is more efficient and promising. Specifically, we identify two important shortcomings of naive distribution matching (i.e., imbalanced feature numbers and unvalidated embeddings for distance computation) and address them with three novel techniques (i.e., partitioning and expansion augmentation, efficient and enriched model sampling, and class-aware distribution regularization). Our simple yet effective method outperforms most previous optimization-oriented methods while using far fewer computational resources, thereby scaling dataset condensation to larger datasets and models. Extensive experiments demonstrate the effectiveness of our method. Code is available at https://github.com/uitrbn/IDM
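
As a rough illustration of the naive distribution matching baseline that the abstract refers to, the sketch below computes a class-wise loss as the squared distance between the mean embeddings of real and synthetic samples under a randomly initialized embedding. It is a minimal sketch under stated assumptions, not the method proposed in the paper and not code from the linked repository: the function names (`random_embedding`, `mean_matching_loss`, `class_wise_loss`), the one-layer ReLU embedding, and the use of NumPy are all hypothetical choices made for illustration.

```python
import numpy as np


def random_embedding(in_dim, out_dim, rng):
    """Sample a hypothetical one-layer ReLU embedding with random weights."""
    weights = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)
    return lambda x: np.maximum(x @ weights, 0.0)


def mean_matching_loss(real, synthetic, embed):
    """Squared distance between the mean embeddings of two sample sets."""
    gap = embed(real).mean(axis=0) - embed(synthetic).mean(axis=0)
    return float(gap @ gap)


def class_wise_loss(real_x, real_y, syn_x, syn_y, embed):
    """Sum the mean-matching loss over every class in the synthetic set."""
    return sum(
        mean_matching_loss(real_x[real_y == c], syn_x[syn_y == c], embed)
        for c in np.unique(syn_y)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    embed = random_embedding(in_dim=8, out_dim=16, rng=rng)
    # Toy data (assumed): 100 real samples and 4 synthetic samples, 2 classes.
    real_x, real_y = rng.standard_normal((100, 8)), np.arange(100) % 2
    syn_x, syn_y = rng.standard_normal((4, 8)), np.array([0, 0, 1, 1])
    print(class_wise_loss(real_x, real_y, syn_x, syn_y, embed))
```

In a condensation loop, the synthetic samples would be the trainable parameters and this loss would be minimized with respect to them, typically with a freshly sampled embedding at each step. The sketch only evaluates the loss; it does not perform that optimization.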

research
03/21/2022

ViM: Out-Of-Distribution with Virtual-logit Matching

Most of the existing Out-Of-Distribution (OOD) detection algorithms depe...
research
11/18/2022

How to train your draGAN: A task oriented solution to imbalanced classification

The long-standing challenge of building effective classification models ...
research
01/06/2021

Scalable Feature Matching Across Large Data Collections

This paper is concerned with matching feature vectors in a one-to-one fa...
research
07/30/2022

Delving into Effective Gradient Matching for Dataset Condensation

As deep learning models and datasets rapidly scale up, network training ...
research
09/23/2022

Weakly Supervised Two-Stage Training Scheme for Deep Video Fight Detection Model

Fight detection in videos is an emerging deep learning application with ...
research
10/08/2021

Dataset Condensation with Distribution Matching

Computational cost to train state-of-the-art deep models in many learnin...
research
06/15/2022

Condensing Graphs via One-Step Gradient Matching

As training deep learning models on large dataset takes a lot of time an...