
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
The lottery ticket hypothesis (Frankle and Carbin, 2018) states that a ...

Discriminative Active Learning
We propose a new batch mode active learning algorithm designed for neura...

On a Formal Model of Safe and Scalable Self-driving Cars
In recent years, car makers and tech companies have been racing towards ...

Failures of Gradient-Based Deep Learning
In recent years, Deep Learning has become the go-to solution for a broad...

Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving
Autonomous driving is a multi-agent setting where the host vehicle must ...

Learning a Metric Embedding for Face Recognition using the Multibatch Method
This work is motivated by the engineering task of achieving a near state...

SelfieBoost: A Boosting Algorithm for Deep Learning
We describe and analyze a new boosting algorithm for deep learning calle...

On the Computational Efficiency of Training Neural Networks
It is well-known that neural networks are computationally hard to train....

Learning the Experts for Online Sequence Prediction
Online sequence prediction is the problem of predicting the next element...

Subspace Learning with Partial Information
The goal of subspace learning is to find a k-dimensional subspace of R^d...

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization
We introduce a proximal version of the stochastic dual coordinate ascent...

Accelerated Mini-Batch Stochastic Dual Coordinate Ascent
Stochastic dual coordinate ascent (SDCA) is an effective technique for s...

An Algorithm for Training Polynomial Networks
We consider deep neural networks, in which the output of each node is a ...

Learning Sparse Low-Threshold Linear Classifiers
We consider the problem of learning a non-negative linear classifier wit...

Proximal Stochastic Dual Coordinate Ascent
We introduce a proximal version of the dual coordinate ascent method. We dem...

Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization
Stochastic Gradient Descent (SGD) has become popular for solving large s...

Active Learning of Halfspaces under a Margin Assumption
We derive and analyze a new, efficient, pool-based active learning algor...

Large-Scale Convex Minimization with a Low-Rank Constraint
We address the problem of minimizing a convex function over the space of...

Using More Data to Speed-up Training Time
In many recent applications, data is plentiful. By now, we have a rather...

Regularization Techniques for Learning with Matrices
There is a growing body of learning problems for which it is natural to or...

ShareBoost: Efficient Multiclass Learning with Feature Sharing
Multiclass prediction is the problem of classifying an object into a rel...

A Provably Correct Algorithm for Deep Learning that Actually Works
We describe a layer-by-layer algorithm for training deep convolutional n...

Vision Zero: on a Provable Method for Eliminating Roadway Accidents without Compromising Traffic Throughput
We propose an economical, viable approach to eliminate almost all car a...

Is Deeper Better only when Shallow is Good?
Understanding the power of depth in feed-forward neural networks is an o...

Decoupling Gating from Linearity
ReLU neural networks have been in the focus of many recent theoretical w...

SenseBERT: Driving Some Sense into BERT
Self-supervision techniques have allowed neural language models to advan...

The Implicit Bias of Depth: How Incremental Learning Drives Generalization
A leading hypothesis for the surprising generalization of neural network...

Learning Boolean Circuits with Neural Networks
Training neural networks is computationally hard. However, in practice t...
Shai Shalev-Shwartz
Professor at the School of Computer Science and Engineering at The Hebrew University of Jerusalem