Stochastic Gradient Descent Meets Distribution Regression

10/24/2020
by Nicole Mücke, et al.

Stochastic gradient descent (SGD) provides a simple and efficient way to solve a broad range of machine learning problems. Here, we focus on distribution regression (DR), which involves two stages of sampling: first, we regress from probability measures to real-valued responses; second, since the distributions themselves are not observed directly, we only have access to bags of samples drawn from them, and these bags are used to solve the overall regression problem. Recently, DR has been tackled by applying kernel ridge regression, and the learning properties of this approach are well understood. However, nothing is known about the learning properties of SGD for two-stage sampling problems. We fill this gap and provide theoretical guarantees for the performance of SGD for DR. Our bounds are optimal in a minimax sense under standard assumptions.
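The two-stage setup described above can be illustrated with a minimal sketch: distributions are sampled first, then bags of samples from each distribution, and a single SGD pass is run over a kernelized least-squares objective using empirical kernel mean embeddings of the bags. This is only an illustration under assumed choices (Gaussian kernel, synthetic Gaussian bags, constant step size), not the paper's exact algorithm or parameter schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

def mmd_inner(x, y, gamma=1.0):
    """Inner product of the empirical Gaussian-kernel mean embeddings of two bags."""
    d = x[:, None] - y[None, :]
    return np.exp(-gamma * d ** 2).mean()

# Stage 1: sample distributions (here: Gaussians with random means); the
# response depends on the unobserved mean.  Stage 2: sample a bag from each.
n_bags, bag_size = 60, 30
means = rng.uniform(-1.0, 1.0, n_bags)
bags = [rng.normal(m, 0.5, bag_size) for m in means]
y = np.sin(np.pi * means) + 0.05 * rng.normal(size=n_bags)

# Gram matrix between the empirical mean embeddings of the bags.
K = np.array([[mmd_inner(bi, bj) for bj in bags] for bi in bags])

# One SGD pass over the kernelized least-squares objective: the estimator is
# f = sum_j alpha_j k(mu_hat_j, .), updated after seeing each bag once.
alpha = np.zeros(n_bags)
eta = 0.5  # constant step size, chosen here for illustration
for t in range(n_bags):
    residual = y[t] - K[t] @ alpha
    alpha[t] += eta * residual

train_mse = float(np.mean((K @ alpha - y) ** 2))
```

The point of the sketch is that the learner never sees the distributions themselves, only the Gram matrix of bag embeddings, which is where the second-stage sampling error enters the analysis.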


Related research

- 04/21/2021: Robust Kernel-based Distribution Regression. Regularization schemes for regression have been widely studied in learni...
- 08/29/2022: DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over Graphs. In this paper, we propose to solve a regularized distributionally robust...
- 02/01/2017: On SGD's Failure in Practice: Characterizing and Overcoming Stalling. Stochastic Gradient Descent (SGD) is widely used in machine learning pro...
- 08/15/2023: Max-affine regression via first-order methods. We consider regression of a max-affine model that produces a piecewise l...
- 07/11/2013: Fast gradient descent for drifting least squares regression, with application to bandits. Online learning algorithms require to often recompute least squares regr...
- 06/18/2020: Stochastic Gradient Descent in Hilbert Scales: Smoothness, Preconditioning and Earlier Stopping. Stochastic Gradient Descent (SGD) has become the method of choice for so...
- 05/25/2018: Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes. We consider stochastic gradient descent (SGD) for least-squares regressi...
