Data Sketching for Faster Training of Machine Learning Models

06/05/2019
by Baharan Mirzasoleiman, et al.

Many machine learning problems reduce to minimizing an expected risk, defined as the sum of a large number of component functions that are often convex. Iterative gradient methods are popular for these problems, but they are generally slow to converge, particularly on large data sets. In this work, we develop an analysis for selecting a subset (or sketch) of training data points, together with their corresponding learning rates, that provides faster convergence to a close neighborhood of the optimal solution. We show that subsets minimizing an upper bound on the estimation error of the full gradient also maximize a submodular facility location function. As a result, greedily maximizing the facility location function yields subsets that give faster convergence to a close neighborhood of the optimal solution. Confirming our analysis, we demonstrate the real-world effectiveness of our algorithm, SIG, through an extensive set of experiments on several applications, including logistic regression and training neural networks. We also include a method that provides a deliberate deterministic ordering of the data subset, which is quite effective in practice. We observe that our method, while achieving practically the same loss, speeds up gradient methods by up to 10x for convex and 3x for non-convex (deep) functions.
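
The greedy selection step described above can be illustrated with a small sketch. This is not the authors' SIG implementation; it is a minimal example under stated assumptions: the function name `greedy_facility_location` is hypothetical, per-point feature vectors stand in for gradients, similarity is taken as the maximum pairwise distance minus the pairwise distance, and each selected point is weighted by the number of points it represents (one plausible reading of "corresponding learning rates").

```python
import numpy as np

def greedy_facility_location(features, k):
    """Greedily pick k rows that maximize F(S) = sum_i max_{j in S} sim[i, j].

    Assumption: `features` is an (n, d) array used as a proxy for
    per-example gradients; this is an illustration, not the paper's code.
    """
    n = features.shape[0]
    # Pairwise Euclidean distances, turned into nonnegative similarities.
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diffs, axis=2)
    sim = dist.max() - dist

    selected = []
    best = np.zeros(n)  # best[i] = max similarity of point i to the current subset
    for _ in range(k):
        # Marginal gain of adding candidate j: sum_i max(sim[i, j] - best[i], 0).
        gains = np.maximum(sim - best[:, None], 0.0).sum(axis=0)
        if selected:
            gains[selected] = -np.inf  # never pick the same point twice
        j = int(np.argmax(gains))
        selected.append(j)
        best = np.maximum(best, sim[:, j])

    # Assumed weighting: each selected point counts the points closest to it.
    assignment = np.argmax(sim[:, selected], axis=1)
    weights = np.bincount(assignment, minlength=k)
    return selected, weights

# Usage: two well-separated clusters of three points each.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
subset, weights = greedy_facility_location(points, k=2)
```

On this toy input the greedy rule picks one representative from each cluster, and the weights record how many points each representative stands for. A weighted gradient step over the subset would then scale each selected example's gradient by its weight. The pairwise-distance computation here costs O(n^2) memory, so it only suits small inputs.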


