Peter L. Bartlett

research

∙ 06/16/2023

Trained Transformers Learn Linear Models In-Context

Attention-based neural networks such as transformers have demonstrated a...

0 Ruiqi Zhang, et al. ∙

research

∙ 04/21/2023

Prediction, Learning, Uniform Convergence, and Scale-sensitive Dimensions

We present a new general-purpose algorithm for learning classes of [0,1]...

0 Peter L. Bartlett, et al. ∙

research

∙ 03/02/2023

Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization

Linear classifiers and leaky ReLU networks trained by gradient flow on t...

0 Spencer Frei, et al. ∙

research

∙ 03/02/2023

The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks

In this work, we study the implications of the implicit bias of gradient...

0 Spencer Frei, et al. ∙

research

∙ 10/04/2022

The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima

We consider Sharpness-Aware Minimization (SAM), a gradient-based optimiz...

0 Peter L. Bartlett, et al. ∙

research

∙ 09/26/2022

Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency

The problem of estimating a linear functional based on observational dat...

0 Wenlong Mou, et al. ∙

research

∙ 02/15/2022

Random Feature Amplification: Feature Learning and Generalization in Neural Networks

In this work, we provide a characterization of the feature-learning proc...

0 Spencer Frei, et al. ∙

research

∙ 02/11/2022

Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data

Benign overfitting, the phenomenon where interpolating models generalize...

0 Spencer Frei, et al. ∙

research

∙ 01/21/2022

Optimal variance-reduced stochastic approximation in Banach spaces

We study the problem of estimating the fixed point of a contractive oper...

8 Wenlong Mou, et al. ∙

research

∙ 12/23/2021

Optimal and instance-dependent guarantees for Markovian linear stochastic approximation

We study stochastic approximation procedures for approximately solving a...

0 Wenlong Mou, et al. ∙

research

∙ 08/25/2021

The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks

The recent success of neural network models has shone light on a rather ...

0 Niladri S. Chatterji, et al. ∙

research

∙ 06/23/2021

Adversarial Examples in Multi-Layer Random ReLU Networks

We consider the phenomenon of adversarial examples in ReLU networks with...

12 Peter L. Bartlett, et al. ∙

research

∙ 05/29/2021

On the Theory of Reinforcement Learning with Once-per-Episode Feedback

We study a theory of reinforcement learning (RL) in which the learner re...

11 Niladri S. Chatterji, et al. ∙

research

∙ 05/05/2021

Preference learning along multiple criteria: A game-theoretic perspective

The literature on ranking from ordinal data is vast, and there are sever...

0 Kush Bhatia, et al. ∙

research

∙ 04/17/2021

Agnostic learning with unknown utilities

Traditional learning approaches for classification implicitly assume tha...

0 Kush Bhatia, et al. ∙

research

∙ 03/17/2021

Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm

In this paper, we investigate the sample complexity of policy evaluation...

41 Lin Chen, et al. ∙

research

∙ 03/16/2021

Deep learning: a statistical viewpoint

The remarkable practical success of deep learning has revealed some majo...

13 Peter L. Bartlett, et al. ∙

research

∙ 02/09/2021

When does gradient descent with logistic loss interpolate using deep networks with smoothed ReLU activations?

We establish conditions under which gradient descent applied to fixed-wi...

0 Niladri S. Chatterji, et al. ∙

research

∙ 12/04/2020

When does gradient descent with logistic loss find interpolating two-layer networks?

We study the training of finite-width two-layer smoothed ReLU networks f...

0 Niladri S. Chatterji, et al. ∙

research

∙ 11/24/2020

Optimal Mean Estimation without a Variance

We study the problem of heavy-tailed mean estimation in settings where t...

10 Yeshwanth Cherapanamjeri, et al. ∙

research

∙ 10/16/2020

Failures of model-dependent generalization bounds for least-norm interpolation

We consider bounds on the generalization performance of the least-norm l...

0 Peter L. Bartlett, et al. ∙

research

∙ 07/16/2020

Optimal Robust Linear Regression in Nearly Linear Time

We study the problem of high-dimensional robust linear regression where ...

0 Yeshwanth Cherapanamjeri, et al. ∙

research

∙ 04/09/2020

On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration

We undertake a precise study of the asymptotic and non-asymptotic proper...

3 Wenlong Mou, et al. ∙

research

∙ 02/23/2020

On Thompson Sampling with Langevin Algorithms

Thompson sampling is a methodology for multi-armed bandit problems that ...

9 Eric Mazumdar, et al. ∙

research

∙ 02/13/2020

Self-Distillation Amplifies Regularization in Hilbert Space

Knowledge distillation introduced in the deep learning context is a meth...

15 Hossein Mobahi, et al. ∙

research

∙ 02/01/2020

Oracle lower bounds for stochastic gradient sampling algorithms

We consider the problem of sampling from a strongly log-concave density ...

0 Niladri S. Chatterji, et al. ∙

research

∙ 12/11/2019

Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing

We study the problem of sampling from the power posterior distribution i...

27 Wenlong Mou, et al. ∙

research

∙ 11/17/2019

Hebbian Synaptic Modifications in Spiking Neurons that Learn

In this paper, we derive a new model of synaptic plasticity, based on re...

0 Peter L. Bartlett, et al. ∙

research

∙ 10/09/2019

Learning Near-optimal Convex Combinations of Basis Models with Generalization Guarantees

The problem of learning an optimal convex combination of basis models ha...

0 Tan Nguyen, et al. ∙

research

∙ 10/01/2019

An Efficient Sampling Algorithm for Non-smooth Composite Potentials

We consider the problem of sampling from a density of the form p(x) ∝ exp(-f...

24 Wenlong Mou, et al. ∙

research

∙ 08/28/2019

High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm

We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-or...

12 Wenlong Mou, et al. ∙

research

∙ 07/27/2019

Bayesian Robustness: A Nonasymptotic Viewpoint

We study the problem of robustly estimating the posterior distribution f...

18 Kush Bhatia, et al. ∙

research

∙ 07/25/2019

Improved Bounds for Discretization of Langevin Diffusions: Near-Optimal Rates without Convexity

We present an improved analysis of the Euler-Maruyama discretization of ...

1 Wenlong Mou, et al. ∙

research

∙ 07/07/2019

Quantitative W_1 Convergence of Langevin-Like Stochastic Processes with Non-Convex Potential State-Dependent Noise

We prove quantitative convergence rates at which discrete Langevin-like ...

3 Xiang Cheng, et al. ∙

research

∙ 06/26/2019

Benign Overfitting in Linear Regression

The phenomenon of benign overfitting is one of the key mysteries uncover...

8 Peter L. Bartlett, et al. ∙

research

∙ 05/30/2019

Langevin Monte Carlo without Smoothness

Langevin Monte Carlo (LMC) is an iterative algorithm used to generate sa...

0 Niladri S. Chatterji, et al. ∙

research

∙ 05/24/2019

OSOM: A Simultaneously Optimal Algorithm for Multi-Armed and Linear Contextual Bandits

We consider the stochastic linear (multi-armed) contextual bandit proble...

0 Niladri S. Chatterji, et al. ∙

research

∙ 02/06/2019

Testing Markov Chains without Hitting

We study the problem of identity testing of Markov chains. In this setti...

0 Yeshwanth Cherapanamjeri, et al. ∙

research

∙ 02/06/2019

Fast Mean Estimation with Sub-Gaussian Rates

We propose an estimator for the mean of a random vector in R^d that can ...

0 Yeshwanth Cherapanamjeri, et al. ∙

research

∙ 02/03/2019

Quantitative Central Limit Theorems for Discrete Stochastic Processes

In this paper, we establish a generalization of the classical Central Li...

0 Xiang Cheng, et al. ∙

research

∙ 01/06/2019

Large-Scale Markov Decision Problems via the Linear Programming Dual

We consider the problem of controlling a fully specified Markov decision...

0 Yasin Abbasi-Yadkori, et al. ∙

research

∙ 12/20/2018

Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems

We study derivative-free methods for policy optimization over the class ...

0 Dhruv Malik, et al. ∙

research

∙ 11/20/2018

Gen-Oja: A Simple and Efficient Algorithm for Streaming Generalized Eigenvector Computation

In this paper, we study the problems of principal Generalized Eigenvecto...

0 Kush Bhatia, et al. ∙

research

∙ 10/01/2018

A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption

We study the problem of optimizing a function under a budgeted number of...

0 Peter L. Bartlett, et al. ∙

research

∙ 05/22/2018

Best of many worlds: Robust model selection for online supervised learning

We introduce algorithms for online, full-information prediction that are...

0 Vidya Muthukumar, et al. ∙

research

∙ 05/04/2018

Sharp Convergence Rates for Langevin Dynamics in the Nonconvex Setting

We study the problem of sampling from a distribution where the negative ...

0 Xiang Cheng, et al. ∙

research

∙ 04/13/2018

Representing smooth functions as compositions of near-identity functions with implications for deep network optimization

We show that any smooth bi-Lipschitz h can be represented exactly as a c...

0 Peter L. Bartlett, et al. ∙

research

∙ 02/27/2018

Online learning with kernel losses

We present a generalization of the adversarial linear bandits framework,...

0 Aldo Pacchiano, et al. ∙

research

∙ 02/16/2018

Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks

We analyze algorithms for approximating a function f(x) = Φx mapping ℝ^d...

0 Peter L. Bartlett, et al. ∙

research

∙ 02/15/2018

On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo

We provide convergence guarantees in Wasserstein distance for a variety ...

0 Niladri S. Chatterji, et al. ∙
