Evaluating and Crafting Datasets Effective for Deep Learning With Data Maps

08/22/2022
by Jay Bishnu, et al.

Rapid development in deep learning model construction has increased the need for appropriate training data. The popularity of large datasets, sometimes known as "big data", has diverted attention from assessing their quality. Training on large datasets often requires excessive system resources and an infeasible amount of time. Furthermore, the supervised machine learning process has yet to be fully automated: the larger the dataset, the more time is needed to label samples manually. We propose a method of curating smaller datasets that yield comparable out-of-distribution model accuracy. After an initial training session, samples are classified by how difficult it is for a model to learn from them, and the smaller dataset is assembled from an appropriate distribution of those samples.
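The abstract does not spell out the procedure, so the following is only a minimal sketch of how a data-map style curation step might look. It assumes the per-sample statistics commonly used for data maps: confidence (mean probability assigned to the gold label across training epochs) and variability (standard deviation of that probability). The function names, the thresholds, and the mixing fractions are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def data_map_stats(true_label_probs):
    """Compute per-sample confidence and variability.

    true_label_probs: array of shape (n_epochs, n_samples) holding the
    probability the model assigned to each sample's gold label per epoch.
    """
    probs = np.asarray(true_label_probs, dtype=float)
    confidence = probs.mean(axis=0)   # mean over epochs
    variability = probs.std(axis=0)   # spread over epochs
    return confidence, variability


def categorize(confidence, variability, conf_hi=0.75, conf_lo=0.25, var_hi=0.2):
    """Label each sample by difficulty (thresholds are assumptions)."""
    stable = variability < var_hi
    easy = stable & (confidence >= conf_hi)
    hard = stable & (confidence <= conf_lo)
    return np.where(easy, "easy", np.where(hard, "hard", "ambiguous"))


def select_subset(categories, fractions, size, seed=0):
    """Draw a smaller dataset with a chosen mix of difficulty categories.

    fractions: dict mapping category name to its share of the subset
    (hypothetical values; the paper's actual distribution is not given here).
    Returns sorted sample indices.
    """
    rng = np.random.default_rng(seed)
    chosen = []
    for name, frac in fractions.items():
        pool = np.flatnonzero(categories == name)
        k = min(len(pool), int(round(size * frac)))
        if k == 0:
            continue
        chosen.extend(rng.choice(pool, size=k, replace=False).tolist())
    return sorted(chosen)
```

In use, the gold-label probabilities would be recorded during the initial training session, passed through `data_map_stats` and `categorize`, and the resulting indices from `select_subset` would define the smaller dataset for retraining.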

research
07/19/2017

Deep Active Learning for Named Entity Recognition

Deep neural networks have advanced the state of the art in named entity ...
research
05/10/2018

Scaling associative classification for very large datasets

Supervised learning algorithms are nowadays successfully scaling up to d...
research
07/26/2017

Context-Independent Polyphonic Piano Onset Transcription with an Infinite Training Dataset

Many of the recent approaches to polyphonic piano note onset transcripti...
research
11/30/2018

Large Datasets, Bias and Model Oriented Optimal Design of Experiments

We review recent literature that proposes to adapt ideas from classical ...
research
06/03/2021

NODE-GAM: Neural Generalized Additive Model for Interpretable Deep Learning

Deployment of machine learning models in real high-risk settings (e.g. h...
research
10/29/2018

Unsupervised Data Selection for Supervised Learning

Recent research put a big effort in the development of deep learning arc...
research
03/17/2022

Learning Distributionally Robust Models at Scale via Composite Optimization

To train machine learning models that are robust to distribution shifts ...
