Recent research shows that when Gradient Descent (GD) is applied to neur...
Large multimodal datasets have been instrumental in recent breakthroughs...
We introduce a new tool for stochastic convex optimization (SCO): a Rewe...
Learned classifiers should often possess certain invariance properties m...
The accelerated proximal point algorithm (APPA), also known as "Catalyst...
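The core outer loop that APPA/Catalyst builds on is the classical proximal point method: repeatedly minimize the objective plus a quadratic penalty centered at the current iterate. A minimal sketch under illustrative assumptions (a differentiable convex objective accessed through a gradient callable `f_grad`; all names and parameters here are hypothetical, and the Nesterov-style extrapolation that makes the method "accelerated" is omitted):

```python
import numpy as np

def prox_point(f_grad, x0, lam=1.0, outer_iters=20, inner_iters=100, inner_lr=0.05):
    """Proximal point outer loop: x_{k+1} ~ argmin_y f(y) + (lam/2)||y - x_k||^2.
    Each regularized subproblem is solved approximately by gradient descent;
    APPA/Catalyst wraps such a loop with acceleration, which is omitted here."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(outer_iters):
        center = x.copy()
        y = x.copy()
        for _ in range(inner_iters):
            # gradient of the proximal subproblem f(y) + (lam/2)||y - center||^2
            y -= inner_lr * (f_grad(y) + lam * (y - center))
        x = y
    return x

# usage: minimize the convex quadratic f(x) = 0.5 * ||x||^2, whose gradient is x
x_star = prox_point(lambda x: x, np.ones(3))
```

The quadratic term makes each subproblem strongly convex, so the inner solver converges quickly even when the original objective is only weakly convex in that region.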
We develop a variant of the Monteiro-Svaiter (MS) acceleration framework...
We develop an algorithm for parameter-free stochastic convex optimizatio...
We develop and analyze algorithms for distributionally robust optimizati...
The conventional recipe for maximizing model accuracy is to (1) train mu...
Neural scaling laws define a predictable relationship between a model's ...
For machine learning systems to be reliable, we must understand their pe...
We study the generalization performance of full-batch optimization algor...
We develop a new primitive for stochastic optimization: a low-bias, low-...
We characterize the complexity of minimizing max_{i∈[N]} f_i(x) for convex...
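For illustration, the problem of minimizing max_{i∈[N]} f_i(x) over convex f_i can be attacked with the textbook subgradient method: a gradient of any active (argmax) component is a valid subgradient of the pointwise maximum. This is a generic baseline, not the algorithm whose complexity the paper characterizes; all names are illustrative:

```python
import numpy as np

def minimize_max(fs, grads, x0, steps=2000):
    """Subgradient descent on F(x) = max_i f_i(x) for convex f_i,
    with a 1/sqrt(t) step size and best-iterate tracking."""
    x = np.asarray(x0, dtype=float).copy()
    best, best_val = x.copy(), np.inf
    for t in range(1, steps + 1):
        vals = [f(x) for f in fs]
        i = int(np.argmax(vals))      # an active component of the max
        if vals[i] < best_val:
            best, best_val = x.copy(), vals[i]
        x -= (1.0 / np.sqrt(t)) * grads[i](x)
    return best, best_val

# usage: F(x) = max(|x - 1|, |x + 1|) = |x| + 1 is minimized at x = 0, value 1
fs = [lambda x: abs(x[0] - 1.0), lambda x: abs(x[0] + 1.0)]
gs = [lambda x: np.array([np.sign(x[0] - 1.0)]),
      lambda x: np.array([np.sign(x[0] + 1.0)])]
xb, vb = minimize_max(fs, gs, np.array([3.0]))
```

Tracking the best iterate matters here: the subgradient method does not descend monotonically, and its guarantees are for the best (or averaged) iterate.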
We propose and analyze algorithms for distributionally robust optimizati...
We develop primal-dual coordinate methods for solving bilinear saddle-po...
We design an algorithm which finds an ϵ-approximate stationary point (wi...
Consider an oracle which takes a point x and returns the minimizer of a ...
We lower bound the complexity of finding ϵ-stationary points (with gradi...
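To make the notion concrete: a point x is ϵ-stationary when ||∇f(x)|| ≤ ϵ, and for L-smooth (possibly nonconvex) f, plain gradient descent with step size 1/L reaches one in O((f(x_0) − f*)/ϵ²) iterations. A minimal sketch of that baseline, not of the hard instances behind the lower bound (function and parameter names are illustrative):

```python
import numpy as np

def find_stationary(grad, x0, eps=1e-3, lr=0.1, max_iters=100000):
    """Gradient descent run until an eps-stationary point: ||grad f(x)|| <= eps.
    For L-smooth f with lr = 1/L, this takes O((f(x0) - f*) / eps^2) steps."""
    x = np.asarray(x0, dtype=float).copy()
    for t in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            return x, t
        x -= lr * g
    return x, max_iters

# usage: the nonconvex f(x) = x^4/4 - x^2/2 has stationary points at -1, 0, 1;
# its gradient is x^3 - x, and from x0 = 2 descent settles near x = 1
x, iters = find_stationary(lambda x: x**3 - x, np.array([2.0]))
```

Note that the stopping criterion only certifies stationarity, not optimality: on nonconvex objectives the returned point may be a local minimum or a saddle.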
We present a randomized primal-dual algorithm that solves the problem min_x...
We demonstrate, theoretically and empirically, that adversarial robustne...
We show that a simple randomized sketch of the matrix multiplicative wei...
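For reference, the exact (unsketched) matrix multiplicative weights update maintains a unit-trace positive semidefinite iterate proportional to the matrix exponential of the negated accumulated losses. A minimal dense implementation under the assumption of symmetric loss matrices (names illustrative; the point of a randomized sketch is precisely to avoid forming this exponential exactly):

```python
import numpy as np

def mmw_iterates(losses, eta=0.1):
    """Matrix multiplicative weights: X_t proportional to exp(-eta * sum_{s<=t} L_s),
    normalized to unit trace (a density matrix). Computed here exactly via the
    symmetric eigendecomposition; a sketch would approximate this step."""
    d = losses[0].shape[0]
    S = np.zeros((d, d))
    out = []
    for L in losses:
        S += L
        w, V = np.linalg.eigh(-eta * S)   # eigendecomposition of a symmetric matrix
        E = (V * np.exp(w)) @ V.T         # matrix exponential exp(-eta * S)
        out.append(E / np.trace(E))
    return out

# usage: repeated loss diag(1, 0) shifts weight toward the low-loss direction
Xs = mmw_iterates([np.diag([1.0, 0.0])] * 10)
```

This is the matrix analogue of the scalar multiplicative weights / Hedge update, and the iterate is exactly the softmax of the eigenvalues of the accumulated loss.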
We use smoothed analysis techniques to provide guarantees on the trainin...