DiM: Distilling Dataset into Generative Model

03/08/2023
by Kai Wang, et al.

Dataset distillation reduces network training cost by synthesizing small, informative datasets from large-scale ones. Despite the success of recent dataset distillation algorithms, three drawbacks still limit their wider application: (i) the synthetic images perform poorly on large architectures; (ii) they need to be re-optimized whenever the distillation ratio changes; and (iii) their limited diversity restricts performance when the distillation ratio is large. In this paper, we propose a novel distillation scheme that Distills the information of large training sets into generative Models, named DiM. Specifically, DiM learns to use a generative model to store the information of the target dataset. During the distillation phase, we minimize the differences between the logits predicted by a pool of models on real and generated images. At the deployment stage, the generative model synthesizes diverse training samples from random noise on the fly. Thanks to these simple yet effective designs, a trained DiM can be directly applied to different distillation ratios and large architectures without extra cost. We validate the proposed DiM on four datasets and achieve state-of-the-art results on all of them. To the best of our knowledge, we are the first to achieve higher accuracy on complex architectures than on simple ones, e.g., 75.1% with ResNet-18 versus 72.6% with ConvNet-3 at ten images per class on CIFAR-10. Moreover, DiM outperforms previous methods by 10% to 22% at one and ten images per class on the SVHN dataset.
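As a rough illustration of the logit-matching objective described above, the sketch below draws a random network from a model pool and penalizes the discrepancy between its logits on real and generated images. The names (generator, model_pool, noise_dim), the class-conditional generator interface, and the use of an MSE distance are assumptions for illustration only; the full DiM objective is not specified in this abstract.

import random
import torch
import torch.nn.functional as F

def logit_matching_loss(generator, model_pool, real_images, labels, noise_dim=100):
    """Match logits of a randomly drawn model on real vs. generated images.

    This is a minimal sketch of the logit-matching idea; the actual DiM loss
    may contain additional terms (e.g., an adversarial loss) not stated here.
    """
    device = real_images.device
    # Sample one network from the model pool at each step.
    model = random.choice(model_pool).to(device).eval()

    # Generate synthetic images from random noise, conditioned on the labels
    # (the conditioning interface is an assumption about the generator).
    z = torch.randn(real_images.size(0), noise_dim, device=device)
    fake_images = generator(z, labels)

    # Logits on real images are treated as fixed targets.
    with torch.no_grad():
        real_logits = model(real_images)
    fake_logits = model(fake_images)

    # Penalize the discrepancy between logits on real and generated images.
    return F.mse_loss(fake_logits, real_logits)

In training, this loss would be backpropagated only into the generator, so that it learns to produce samples that elicit the same predictions as the real data across the model pool.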

Related research

Generalizing Dataset Distillation via Deep Generative Prior (05/02/2023)
Dataset Distillation aims to distill an entire dataset's knowledge into ...

Large-Scale Generative Data-Free Distillation (12/10/2020)
Knowledge distillation is one of the most popular and effective techniqu...

Dataset Distillation with Infinitely Wide Convolutional Networks (07/27/2021)
The effectiveness of machine learning algorithms arises from being able ...

Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models (06/02/2023)
Knowledge distillation in neural networks refers to compressing a large ...

DREAM: Efficient Dataset Distillation by Representative Matching (02/28/2023)
Dataset distillation aims to generate small datasets with little informa...

Dataset Distillation by Matching Training Trajectories (03/22/2022)
Dataset distillation is the task of synthesizing a small dataset such th...

Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation (11/20/2022)
Model-based deep learning has achieved astounding successes due in part ...
