DeepAI AI Chat
Log In Sign Up

Data Sampling Affects the Complexity of Online SGD over Dependent Data

by   Shaocong Ma, et al.

Conventional machine learning applications typically assume that data samples are independently and identically distributed (i.i.d.). However, practical scenarios often involve a data-generating process that produces highly dependent data samples, which are known to heavily bias the stochastic optimization process and slow down the convergence of learning. In this paper, we conduct a fundamental study on how different stochastic data sampling schemes affect the sample complexity of online stochastic gradient descent (SGD) over highly dependent data. Specifically, with a ϕ-mixing model of data dependence, we show that online SGD with proper periodic data-subsampling achieves an improved sample complexity over the standard online SGD in the full spectrum of the data dependence level. Interestingly, even subsampling a subset of data samples can accelerate the convergence of online SGD over highly dependent data. Moreover, we show that online SGD with mini-batch sampling can further substantially improve the sample complexity over online SGD with periodic data-subsampling over highly dependent data. Numerical experiments validate our theoretical results.


page 1

page 2

page 3

page 4


Accelerating Minibatch Stochastic Gradient Descent using Typicality Sampling

Machine learning, especially deep neural networks, has been rapidly deve...

Online Learning to Sample

Stochastic Gradient Descent (SGD) is one of the most widely used techniq...

Learning Co-Sparse Analysis Operators with Separable Structures

In the co-sparse analysis model a set of filters is applied to a signal ...

Data Dependent Convergence for Distributed Stochastic Optimization

In this dissertation we propose alternative analysis of distributed stoc...

Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle

Although SGD with random reshuffle has been widely-used in machine learn...

Active Mini-Batch Sampling using Repulsive Point Processes

The convergence speed of stochastic gradient descent (SGD) can be improv...

Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems

Stochastic nested optimization, including stochastic compositional, min-...