Learning What Data to Learn

02/28/2017
by   Yang Fan, et al.

Machine learning is essentially the science of playing with data. An adaptive data selection strategy, which dynamically chooses different data at various training stages, can lead to a more effective model in a more efficient way. In this paper, we propose a deep reinforcement learning framework, which we call Neural Data Filter (NDF), to explore automatic and adaptive data selection during training. In particular, NDF uses a deep neural network to adaptively select and filter important data instances from a sequential stream of training data, such that the future accumulative reward (e.g., the convergence speed) is maximized. In contrast to previous studies on data selection, which are mainly based on heuristic strategies, NDF is quite generic and thus widely applicable to many machine learning tasks. Taking neural network training with stochastic gradient descent (SGD) as an example, comprehensive experiments with various neural network models (e.g., multi-layer perceptron networks, convolutional neural networks, and recurrent neural networks) on several applications (e.g., image classification and text understanding) demonstrate that NDF-powered SGD can achieve accuracy comparable to the standard SGD process while using less data and fewer iterations.
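
The abstract does not give implementation details, so the following is only a minimal sketch of how an NDF-style filter could plug into an SGD loop. The policy architecture, the per-instance features (current loss and its rank in the minibatch), the REINFORCE update, the use of validation-accuracy improvement as a proxy reward for convergence speed, and the names `FilterPolicy`, `train_with_ndf`, `stream`, and `val_fn` are all illustrative assumptions, not the authors' exact method.

```python
# Hedged sketch of an NDF-style data-filtering training loop (PyTorch).
# All design choices below are assumptions made for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterPolicy(nn.Module):
    """Policy network: maps per-instance state features to a keep probability."""
    def __init__(self, feat_dim: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def forward(self, feats):                      # feats: (batch, feat_dim)
        return torch.sigmoid(self.net(feats)).squeeze(-1)   # keep probabilities

def train_with_ndf(model, policy, stream, val_fn, epochs=1, lr=0.1, policy_lr=1e-3):
    """Train `model` with SGD on the instances the policy keeps; update the policy
    with REINFORCE, using the change in held-out accuracy as a proxy reward for
    convergence speed (an assumption consistent with the abstract)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    popt = torch.optim.Adam(policy.parameters(), lr=policy_lr)
    baseline = val_fn(model)                       # e.g., validation accuracy
    for _ in range(epochs):
        for x, y in stream:                        # sequential stream of minibatches
            with torch.no_grad():
                losses = F.cross_entropy(model(x), y, reduction="none")
                # Illustrative features: per-instance loss and its normalized rank.
                ranks = losses.argsort().argsort().float() / max(len(losses) - 1, 1)
                feats = torch.stack([losses, ranks], dim=1)
            probs = policy(feats)
            keep = torch.bernoulli(probs).bool()   # sample which instances to keep
            if keep.any():
                opt.zero_grad()
                F.cross_entropy(model(x[keep]), y[keep]).backward()
                opt.step()
            # Reward: improvement of the validation score after the filtered update.
            new_score = val_fn(model)
            reward = new_score - baseline
            baseline = new_score
            log_prob = torch.distributions.Bernoulli(probs=probs).log_prob(keep.float()).sum()
            popt.zero_grad()
            (-reward * log_prob).backward()        # REINFORCE policy-gradient step
            popt.step()
```

In this sketch the filter sees only cheap, model-dependent statistics of each instance and is rewarded when skipping or keeping data speeds up progress on a held-out set, which mirrors the abstract's description of maximizing a future accumulative reward such as convergence speed.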


