New Properties of the Data Distillation Method When Working With Tabular Data

10/19/2020
by Dmitry Medvedev, et al.

Data distillation is the problem of reducing the volume of training data while keeping only the necessary information. In this paper, we take a deeper look at a new data distillation algorithm originally designed for image data. Our experiments with tabular data show that a model trained on the distilled samples can outperform a model trained on the original dataset. One problem with the algorithm is that the distilled data generalize poorly to models with different hyperparameters. We show that using multiple architectures during distillation can help overcome this problem.
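
The "new data distillation algorithm" the abstract refers to is, presumably, the bilevel formulation in the style of Wang et al. (2018): a small set of synthetic samples is learned by backpropagating through the training of a model on those samples. Below is a minimal, self-contained PyTorch sketch of this idea for tabular data. It is not the authors' code: the synthetic dataset, the init_mlp/forward helpers, and every shape and hyperparameter are illustrative assumptions; sampling a random architecture at each outer step is a simple stand-in for the multiple-architecture idea described above.

import torch
import torch.nn.functional as F

NUM_FEATURES, NUM_CLASSES = 10, 2

# Stand-in for a real tabular dataset (hypothetical values).
real_x = torch.randn(512, NUM_FEATURES)
real_y = torch.randint(0, NUM_CLASSES, (512,))

# Learnable distilled samples (a few rows per class) and a learnable
# inner-loop learning rate, as in the original image formulation.
SYN_PER_CLASS = 10
syn_x = torch.randn(SYN_PER_CLASS * NUM_CLASSES, NUM_FEATURES,
                    requires_grad=True)
syn_y = torch.arange(NUM_CLASSES).repeat_interleave(SYN_PER_CLASS)
inner_lr = torch.tensor(0.02, requires_grad=True)

def init_mlp(hidden):
    # Randomly initialise a one-hidden-layer MLP as raw tensors so the
    # inner update below stays differentiable.
    w1 = torch.randn(NUM_FEATURES, hidden) * 0.1
    b1 = torch.zeros(hidden)
    w2 = torch.randn(hidden, NUM_CLASSES) * 0.1
    b2 = torch.zeros(NUM_CLASSES)
    return [p.requires_grad_() for p in (w1, b1, w2, b2)]

def forward(params, x):
    w1, b1, w2, b2 = params
    return torch.relu(x @ w1 + b1) @ w2 + b2

outer_opt = torch.optim.Adam([syn_x, inner_lr], lr=1e-2)

for step in range(500):
    # Sample a fresh architecture each step: varying the hidden width
    # keeps the distilled data from overfitting to one set of
    # hyperparameters.
    hidden = int(torch.randint(16, 129, (1,)).item())
    params = init_mlp(hidden)

    # Inner step: one differentiable SGD update on the synthetic data.
    inner_loss = F.cross_entropy(forward(params, syn_x), syn_y)
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    updated = [p - inner_lr * g for p, g in zip(params, grads)]

    # Outer step: the updated model should fit the real data; gradients
    # flow back through the inner update into syn_x and inner_lr.
    outer_loss = F.cross_entropy(forward(updated, real_x), real_y)
    outer_opt.zero_grad()
    outer_loss.backward()
    outer_opt.step()

In practice the inner loop would run for several steps over multiple random initialisations; a single step is used here only to keep the sketch short.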
