
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
The lottery ticket hypothesis (Frankle and Carbin, 2018) states that a ...
Discriminative Active Learning
We propose a new batch mode active learning algorithm designed for neura...
On a Formal Model of Safe and Scalable Self-driving Cars
In recent years, car makers and tech companies have been racing towards ...
Failures of Gradient-Based Deep Learning
In recent years, Deep Learning has become the go-to solution for a broad...
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving
Autonomous driving is a multi-agent setting where the host vehicle must ...
Learning a Metric Embedding for Face Recognition using the Multibatch Method
This work is motivated by the engineering task of achieving a near state...
SelfieBoost: A Boosting Algorithm for Deep Learning
We describe and analyze a new boosting algorithm for deep learning calle...
On the Computational Efficiency of Training Neural Networks
It is well-known that neural networks are computationally hard to train....
Learning the Experts for Online Sequence Prediction
Online sequence prediction is the problem of predicting the next element...
Subspace Learning with Partial Information
The goal of subspace learning is to find a k-dimensional subspace of R^d...
Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization
We introduce a proximal version of the stochastic dual coordinate ascent...
Accelerated Mini-Batch Stochastic Dual Coordinate Ascent
Stochastic dual coordinate ascent (SDCA) is an effective technique for s...
An Algorithm for Training Polynomial Networks
We consider deep neural networks, in which the output of each node is a ...
Learning Sparse Low-Threshold Linear Classifiers
We consider the problem of learning a nonnegative linear classifier wit...
Proximal Stochastic Dual Coordinate Ascent
We introduce a proximal version of the dual coordinate ascent method. We dem...
Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization
Stochastic Gradient Descent (SGD) has become popular for solving large s...
Active Learning of Halfspaces under a Margin Assumption
We derive and analyze a new, efficient, pool-based active learning algor...
Large-Scale Convex Minimization with a Low-Rank Constraint
We address the problem of minimizing a convex function over the space of...
Using More Data to Speed-up Training Time
In many recent applications, data is plentiful. By now, we have a rather...
Regularization Techniques for Learning with Matrices
There is a growing body of learning problems for which it is natural to or...
ShareBoost: Efficient Multiclass Learning with Feature Sharing
Multiclass prediction is the problem of classifying an object into a rel...
A Provably Correct Algorithm for Deep Learning that Actually Works
We describe a layer-by-layer algorithm for training deep convolutional n...
Vision Zero: on a Provable Method for Eliminating Roadway Accidents without Compromising Traffic Throughput
We propose an economical, viable approach to eliminate almost all car a...
Is Deeper Better only when Shallow is Good?
Understanding the power of depth in feedforward neural networks is an o...
Decoupling Gating from Linearity
ReLU neural networks have been the focus of many recent theoretical w...
SenseBERT: Driving Some Sense into BERT
Self-supervision techniques have allowed neural language models to advan...
The Implicit Bias of Depth: How Incremental Learning Drives Generalization
A leading hypothesis for the surprising generalization of neural network...
Learning Boolean Circuits with Neural Networks
Training neural networks is computationally hard. However, in practice t...
Shai Shalev-Shwartz
Professor at the School of Computer Science and Engineering at The Hebrew University of Jerusalem