
Minimax Optimization with Smooth Algorithmic Adversaries
This paper considers minimax optimization min_x max_y f(x, y) in the cha...

Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems
We consider the setting of vector valued nonlinear dynamical systems X_...

Sample Efficient Linear Meta-Learning by Alternating Minimization
Meta-learning synthesizes and leverages the knowledge from a given set o...

Streaming Linear System Identification with Reverse Experience Replay
We consider the problem of estimating a stochastic linear time-invariant...

Do Input Gradients Highlight Discriminative Features?
Interpretability methods that seek to explain instance-specific model pr...

Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method
We consider the classical setting of optimizing a nonsmooth Lipschitz co...

No quantum speedup over gradient descent for nonsmooth convex optimization
We study the first-order convex optimization problem, where we have blac...

Learning Minimax Estimators via Online Learning
We consider the problem of designing minimax estimators for estimating t...

Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms
We study the problem of least squares linear regression where the datap...

The Pitfalls of Simplicity Bias in Neural Networks
Several works have proposed Simplicity Bias (SB)—the tendency of standar...

Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games
We consider the problem of online learning and its application to solvin...

MOReL: Model-Based Offline Reinforcement Learning
In offline reinforcement learning (RL), the goal is to learn a successfu...

Efficient Domain Generalization via Common-Specific Low-Rank Decomposition
Domain generalization refers to the task of training a model which gener...

Non-Gaussianity of Stochastic Gradient Noise
What enables Stochastic Gradient Descent (SGD) to achieve better general...

Efficient Algorithms for Smooth Minimax Optimization
This paper studies first order methods for solving smooth minimax optimi...

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure
There is a stark disparity between the step size schedules used in pract...

Making the Last Iterate of SGD Information Theoretically Optimal
Stochastic gradient descent (SGD) is one of the most widely used algorit...

Online Non-Convex Learning: Following the Perturbed Leader is Optimal
We study the problem of online learning with nonconvex losses, where th...

SGD without Replacement: Sharper Rates for General Smooth Convex Functions
We study stochastic gradient descent without replacement (SGDo) for smooth ...

Stochastic Gradient Descent Escapes Saddle Points Efficiently
This paper considers the perturbed stochastic gradient descent algorithm...

A Short Note on Concentration Inequalities for Random Vectors with Sub-Gaussian Norm
In this note, we derive concentration inequalities for random vectors wi...

Minmax Optimization: Stable Limit Points of Gradient Descent Ascent are Locally Optimal
Minmax optimization, especially in its general nonconvex-nonconcave form...

On the insufficiency of existing momentum schemes for Stochastic Optimization
Momentum based stochastic gradient methods such as heavy ball (HB) and N...

Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
Semidefinite programs (SDP) are important in learning and combinatorial ...

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
Nesterov's accelerated gradient descent (AGD), an instance of the genera...

Leverage Score Sampling for Faster Accelerated Regression and ERM
Given a matrix A ∈ R^{n×d} and a vector b ∈ R^d, we show how to compute an ϵ...

A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)
This work provides a simplified proof of the statistical minimax optimal...

Accelerating Stochastic Gradient Descent
There is widespread sentiment that it is not possible to effectively uti...

How to Escape Saddle Points Efficiently
This paper shows that a perturbed form of gradient descent converges to ...

Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging
This work characterizes the benefits of averaging techniques widely used...

Provable Efficient Online Matrix Completion via Nonconvex Stochastic Gradient Descent
Matrix completion, where we wish to recover a low rank matrix by observi...

Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis
This paper considers the problem of canonical-correlation analysis (CCA)...

Streaming PCA: Matching Matrix Bernstein and Near-Optimal Finite Sample Guarantees for Oja's Algorithm
This work provides improved guarantees for streaming principal component...

Convergence Rates of Active Learning for Maximum Likelihood Estimation
An active learner is given a class of models, a large set of unlabeled e...

Learning Planar Ising Models
Inference and learning of graphical models are both well-studied problem...

Fast Exact Matrix Completion with Finite Samples
Matrix completion is the problem of recovering a low rank matrix by obse...

Nonconvex Robust PCA
We propose a new method for robust PCA, the task of recovering a low-r...

A Clustering Approach to Learn Sparsely-Used Overcomplete Dictionaries
We consider the problem of learning overcomplete dictionaries in the con...

Phase Retrieval using Alternating Minimization
Phase retrieval problems involve solving linear equations, but with miss...

Low-rank Matrix Completion using Alternating Minimization
Alternating minimization represents a widely applicable and empirically ...

Greedy Learning of Markov Network Structure
We propose a new yet natural algorithm for learning the graph structure ...

Finding the Graph of Epidemic Cascades
We consider the problem of finding the graph on which an epidemic cascad...