Ohad Shamir

research

∙ 07/10/2023

An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization

We study the complexity of producing (δ,ϵ)-stationary points of Lipschit...

0 Guy Kornowski, et al. ∙

research

∙ 05/25/2023

Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks

We provide several new results on the sample complexity of vector-valued...

0 Roey Magen, et al. ∙

research

∙ 05/24/2023

From Tempered to Benign Overfitting in ReLU Neural Networks

Overparameterized neural networks (NNs) are observed to generalize well ...

0 Guy Kornowski, et al. ∙

research

∙ 02/16/2023

Deterministic Nonsmooth Nonconvex Optimization

We study the complexity of optimizing nonsmooth nonconvex Lipschitz func...

0 Michael I. Jordan, et al. ∙

research

∙ 09/21/2022

On the Complexity of Finding Small Subgradients in Nonsmooth Optimization

We study the oracle complexity of producing (δ,ϵ)-stationary points of L...

0 Guy Kornowski, et al. ∙

research

∙ 06/15/2022

Reconstructing Training Data from Trained Neural Networks

Understanding to what extent neural networks memorize training data is a...

160 Niv Haim, et al. ∙

research

∙ 02/13/2022

The Sample Complexity of One-Hidden-Layer Neural Networks

We study norm-based uniform convergence bounds for neural networks, aimi...

0 Gal Vardi, et al. ∙

research

∙ 02/09/2022

Gradient Methods Provably Converge to Non-Robust Networks

Despite a great deal of research, it is still unclear why neural network...

0 Gal Vardi, et al. ∙

research

∙ 02/08/2022

Width is Less Important than Depth in ReLU Neural Networks

We solve an open question from Lu et al. (2017), by showing that any tar...

0 Gal Vardi, et al. ∙

research

∙ 01/30/2022

Implicit Regularization Towards Rank Minimization in ReLU Networks

We study the conjectured relationship between the implicit regularizatio...

0 Nadav Timor, et al. ∙

research

∙ 01/27/2022

The Implicit Bias of Benign Overfitting

The phenomenon of benign overfitting, where a predictor perfectly fits n...

0 Ohad Shamir, et al. ∙

research

∙ 12/08/2021

Replay For Safety

Experience replay is a widely used technique to achieve efficient...

0 Liran Szlak, et al. ∙

research

∙ 12/08/2021

Convergence Results For Q-Learning With Experience Replay

A commonly used heuristic in RL is experience replay, in w...

0 Liran Szlak, et al. ∙

research

∙ 10/07/2021

A Stochastic Newton Algorithm for Distributed Convex Optimization

We propose and analyze a stochastic Newton algorithm for homogeneous dis...

0 Brian Bullins, et al. ∙

research

∙ 10/07/2021

On the Optimal Memorization Power of ReLU Neural Networks

We study the memorization power of feedforward ReLU neural networks. We ...

0 Gal Vardi, et al. ∙

research

∙ 10/06/2021

On Margin Maximization in Linear and ReLU Networks

The implicit bias of neural networks has been extensively studied in rec...

0 Gal Vardi, et al. ∙

research

∙ 06/12/2021

Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems

Recently, there has been much interest in studying the convergence rates...

0 Itay Safran, et al. ∙

research

∙ 06/02/2021

Learning a Single Neuron with Bias Using Gradient Descent

We theoretically study the fundamental problem of learning a single neur...

0 Gal Vardi, et al. ∙

research

∙ 04/14/2021

Oracle Complexity in Nonsmooth Nonconvex Optimization

It is well-known that given a smooth, bounded-from-below, and possibly n...

0 Guy Kornowski, et al. ∙

research

∙ 02/02/2021

The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication

We resolve the min-max complexity of distributed stochastic convex optim...

0 Blake Woodworth, et al. ∙

research

∙ 01/31/2021

The Connection Between Approximation, Depth Separation and Learnability in Neural Networks

Several recent works have shown separation results between deep neural n...

0 Eran Malach, et al. ∙

research

∙ 01/30/2021

Size and Depth Separation in Approximating Natural Functions with Neural Networks

When studying the expressive power of neural networks, a main challenge ...

5 Gal Vardi, et al. ∙

research

∙ 12/09/2020

Implicit Regularization in ReLU Networks with the Square Loss

Understanding the implicit regularization (or implicit bias) of gradient...

0 Gal Vardi, et al. ∙

research

∙ 10/13/2020

High-Order Oracle Complexity of Smooth and Strongly Convex Optimization

In this note, we consider the complexity of optimizing a highly smooth (...

1 Guy Kornowski, et al. ∙

research

∙ 06/30/2020

Gradient Methods Never Overfit On Separable Data

A line of recent works established that when training linear predictors ...

0 Ohad Shamir, et al. ∙

research

∙ 06/01/2020

The Effects of Mild Over-parameterization on the Optimization Landscape of Shallow ReLU Neural Networks

We study the effects of mild over-parameterization on the optimization l...

29 Itay Safran, et al. ∙

research

∙ 05/31/2020

Neural Networks with Small Weights and Depth-Separation Barriers

In studying the expressiveness of neural networks, an important question...

0 Gal Vardi, et al. ∙

research

∙ 02/27/2020

Can We Find Near-Approximately-Stationary Points of Nonsmooth Nonconvex Functions?

It is well-known that given a bounded, smooth nonconvex function, standa...

0 Ohad Shamir, et al. ∙

research

∙ 02/18/2020

Is Local SGD Better than Minibatch SGD?

We study local SGD (also known as parallel SGD and federated averaging),...

5 Blake Woodworth, et al. ∙

research

∙ 02/03/2020

Proving the Lottery Ticket Hypothesis: Pruning is All You Need

The lottery ticket hypothesis (Frankle and Carbin, 2018) states that a ...

7 Eran Malach, et al. ∙

research

∙ 01/15/2020

Learning a Single Neuron with Gradient Methods

We consider the fundamental problem of learning a single neuron x ↦ σ(w^⊤ x...

0 Gilad Yehudai, et al. ∙

research

∙ 10/04/2019

The Complexity of Finding Stationary Points with Stochastic Gradient Descent

We study the iteration complexity of stochastic gradient descent (SGD) f...

0 Yoel Drori, et al. ∙

research

∙ 07/31/2019

How Good is SGD with Random Shuffling?

We study the performance of stochastic gradient descent (SGD) on smooth ...

4 Itay Safran, et al. ∙

research

∙ 04/15/2019

Depth Separations in Neural Networks: What is Actually Being Separated?

Existing depth separation results for constant-depth networks essentiall...

10 Itay Safran, et al. ∙

research

∙ 04/01/2019

On the Power and Limitations of Random Features for Understanding Neural Networks

Recently, a spate of papers have provided positive theoretical results f...

0 Gilad Yehudai, et al. ∙

research

∙ 02/13/2019

The Complexity of Making the Gradient Small in Stochastic Convex Optimization

We give nearly matching upper and lower bounds on the oracle complexity ...

0 Dylan Foster, et al. ∙

research

∙ 02/09/2019

Space lower bounds for linear prediction

We show that fundamental learning tasks, such as finding an approximate ...

0 Yuval Dagan, et al. ∙

research

∙ 10/29/2018

Global Non-convex Optimization with Discretized Diffusions

An Euler discretization of the Langevin diffusion is known to converge t...

0 Murat A. Erdogdu, et al. ∙

research

∙ 09/23/2018

Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks

In this note, we study the dynamics of gradient descent on objective fun...

0 Ohad Shamir, et al. ∙

research

∙ 06/26/2018

A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates

We provide tight finite-time convergence bounds for gradient descent and...

0 Yossi Arjevani, et al. ∙

research

∙ 04/18/2018

Are ResNets Provably Better than Linear Predictors?

A residual network (or ResNet) is a standard deep neural net architectur...

0 Ohad Shamir, et al. ∙

research

∙ 03/04/2018

Detecting Correlations with Little Memory and Communication

We study the problem of identifying correlations in multivariate data, u...

0 Yuval Dagan, et al. ∙

research

∙ 12/24/2017

Spurious Local Minima are Common in Two-Layer ReLU Neural Networks

We consider the optimization problem associated with training simple ReL...

0 Itay Safran, et al. ∙

research

∙ 12/18/2017

Size-Independent Sample Complexity of Neural Networks

We study the sample complexity of learning neural networks, by providing...

0 Noah Golowich, et al. ∙

research

∙ 05/15/2017

Bandit Regret Scaling with the Effective Loss Range

We study how the regret guarantees of nonstochastic multi-armed bandits ...

0 Nicolò Cesa-Bianchi, et al. ∙

research

∙ 03/23/2017

Failures of Gradient-Based Deep Learning

In recent years, Deep Learning has become the go-to solution for a broad...

0 Shai Shalev-Shwartz, et al. ∙

research

∙ 11/15/2016

Oracle Complexity of Second-Order Methods for Finite-Sum Problems

Finite-sum optimization problems are ubiquitous in machine learning, and...

0 Yossi Arjevani, et al. ∙

research

∙ 10/31/2016

Depth-Width Tradeoffs in Approximating Natural Functions with Neural Networks

We provide several new depth-based separation results for feed-forward n...

0 Itay Safran, et al. ∙

research

∙ 09/05/2016

Distribution-Specific Hardness of Learning Neural Networks

Although neural networks are routinely and successfully trained in pract...

0 Ohad Shamir, et al. ∙

research

∙ 03/02/2016

Without-Replacement Sampling for Stochastic Gradient Methods: Convergence Results and Application to Distributed Optimization

Stochastic gradient methods for machine learning and optimization proble...

0 Ohad Shamir, et al. ∙
