Recent research shows that when Gradient Descent (GD) is applied to neur...
Large multimodal datasets have been instrumental in recent breakthroughs...
We introduce a new tool for stochastic convex optimization (SCO): a Rewe...
Learned classifiers should often possess certain invariance properties m...
The accelerated proximal point algorithm (APPA), also known as "Catalyst...
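The core outer loop that APPA/Catalyst builds on is the classical proximal point method: repeatedly minimize the objective plus a quadratic penalty centered at the current iterate. A minimal sketch under illustrative assumptions (a differentiable convex objective accessed through a gradient callable `f_grad`; all names and parameters here are hypothetical, and the Nesterov-style extrapolation that makes the method "accelerated" is omitted):

```python
import numpy as np

def prox_point(f_grad, x0, lam=1.0, outer_iters=20, inner_iters=100, inner_lr=0.05):
    """Proximal point outer loop: x_{k+1} ~ argmin_y f(y) + (lam/2)||y - x_k||^2.
    Each regularized subproblem is solved approximately by gradient descent;
    APPA/Catalyst wraps such a loop with acceleration, which is omitted here."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(outer_iters):
        center = x.copy()
        y = x.copy()
        for _ in range(inner_iters):
            # gradient of the proximal subproblem f(y) + (lam/2)||y - center||^2
            y -= inner_lr * (f_grad(y) + lam * (y - center))
        x = y
    return x

# usage: minimize the convex quadratic f(x) = 0.5 * ||x||^2, whose gradient is x
x_star = prox_point(lambda x: x, np.ones(3))
```

The quadratic term makes each subproblem strongly convex, so the inner solver converges quickly even when the original objective is only weakly convex in that region.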
We develop a variant of the Monteiro-Svaiter (MS) acceleration framework...
We develop an algorithm for parameter-free stochastic convex optimizatio...
We develop and analyze algorithms for distributionally robust optimizati...
The conventional recipe for maximizing model accuracy is to (1) train mu...
Neural scaling laws define a predictable relationship between a model's ...
For machine learning systems to be reliable, we must understand their pe...
We study the generalization performance of full-batch optimization algor...
We develop a new primitive for stochastic optimization: a low-bias, low-...
We characterize the complexity of minimizing max_{i∈[N]} f_i(x) for convex...
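For illustration, the problem of minimizing max_{i∈[N]} f_i(x) over convex f_i can be attacked with the textbook subgradient method: a gradient of any active (argmax) component is a valid subgradient of the pointwise maximum. This is a generic baseline, not the algorithm whose complexity the paper characterizes; all names are illustrative:

```python
import numpy as np

def minimize_max(fs, grads, x0, steps=2000):
    """Subgradient descent on F(x) = max_i f_i(x) for convex f_i,
    with a 1/sqrt(t) step size and best-iterate tracking."""
    x = np.asarray(x0, dtype=float).copy()
    best, best_val = x.copy(), np.inf
    for t in range(1, steps + 1):
        vals = [f(x) for f in fs]
        i = int(np.argmax(vals))      # an active component of the max
        if vals[i] < best_val:
            best, best_val = x.copy(), vals[i]
        x -= (1.0 / np.sqrt(t)) * grads[i](x)
    return best, best_val

# usage: F(x) = max(|x - 1|, |x + 1|) = |x| + 1 is minimized at x = 0, value 1
fs = [lambda x: abs(x[0] - 1.0), lambda x: abs(x[0] + 1.0)]
gs = [lambda x: np.array([np.sign(x[0] - 1.0)]),
      lambda x: np.array([np.sign(x[0] + 1.0)])]
xb, vb = minimize_max(fs, gs, np.array([3.0]))
```

Tracking the best iterate matters here: the subgradient method does not descend monotonically, and its guarantees are for the best (or averaged) iterate.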
We propose and analyze algorithms for distributionally robust optimizati...
We develop primal-dual coordinate methods for solving bilinear saddle-po...
We design an algorithm which finds an ϵ-approximate stationary point (wi...
Consider an oracle which takes a point x and returns the minimizer of a ...
We lower bound the complexity of finding ϵ-stationary points (with gradi...
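To make the notion concrete: a point x is ϵ-stationary when ||∇f(x)|| ≤ ϵ, and for L-smooth (possibly nonconvex) f, plain gradient descent with step size 1/L reaches one in O((f(x_0) − f*)/ϵ²) iterations. A minimal sketch of that baseline, not of the hard instances behind the lower bound (function and parameter names are illustrative):

```python
import numpy as np

def find_stationary(grad, x0, eps=1e-3, lr=0.1, max_iters=100000):
    """Gradient descent run until an eps-stationary point: ||grad f(x)|| <= eps.
    For L-smooth f with lr = 1/L, this takes O((f(x0) - f*) / eps^2) steps."""
    x = np.asarray(x0, dtype=float).copy()
    for t in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            return x, t
        x -= lr * g
    return x, max_iters

# usage: the nonconvex f(x) = x^4/4 - x^2/2 has stationary points at -1, 0, 1;
# its gradient is x^3 - x, and from x0 = 2 descent settles near x = 1
x, iters = find_stationary(lambda x: x**3 - x, np.array([2.0]))
```

Note that the stopping criterion only certifies stationarity, not optimality: on nonconvex objectives the returned point may be a local minimum or a saddle.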
We present a randomized primal-dual algorithm that solves the problem min_x...
We demonstrate, theoretically and empirically, that adversarial robustne...
We show that a simple randomized sketch of the matrix multiplicative wei...
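For reference, the exact (unsketched) matrix multiplicative weights update maintains a unit-trace positive semidefinite iterate proportional to the matrix exponential of the negated accumulated losses. A minimal dense implementation under the assumption of symmetric loss matrices (names illustrative; the point of a randomized sketch is precisely to avoid forming this exponential exactly):

```python
import numpy as np

def mmw_iterates(losses, eta=0.1):
    """Matrix multiplicative weights: X_t proportional to exp(-eta * sum_{s<=t} L_s),
    normalized to unit trace (a density matrix). Computed here exactly via the
    symmetric eigendecomposition; a sketch would approximate this step."""
    d = losses[0].shape[0]
    S = np.zeros((d, d))
    out = []
    for L in losses:
        S += L
        w, V = np.linalg.eigh(-eta * S)   # eigendecomposition of a symmetric matrix
        E = (V * np.exp(w)) @ V.T         # matrix exponential exp(-eta * S)
        out.append(E / np.trace(E))
    return out

# usage: repeated loss diag(1, 0) shifts weight toward the low-loss direction
Xs = mmw_iterates([np.diag([1.0, 0.0])] * 10)
```

This is the matrix analogue of the scalar multiplicative weights / Hedge update, and the iterate is exactly the softmax of the eigenvalues of the accumulated loss.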
We use smoothed analysis techniques to provide guarantees on the trainin...