
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
We study the eigenvalues of the Hessian of a loss function before and after training. The eigenvalue distribution is seen to be composed of two parts: the bulk, which is concentrated around zero, and the edges, which are scattered away from zero. We present empirical evidence that the bulk indicates how overparametrized the system is, while the edges depend on the input data.
11/22/2016 ∙ by Levent Sagun, et al.
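The bulk-at-zero phenomenon has a simple analogue for overparametrized linear least squares, where the Hessian is available in closed form. A minimal sketch (an illustration, not the paper's experiments):

```python
import numpy as np

# For the squared loss (1/2m)||Xw - y||^2 the Hessian is X^T X / m.
# With more parameters d than samples m, the Hessian has rank at most m,
# so at least d - m eigenvalues sit exactly at zero: a toy version of
# the "bulk" concentrated around zero, with the rest forming the "edges".
rng = np.random.default_rng(0)
m, d = 20, 100                       # 20 samples, 100 parameters
X = rng.standard_normal((m, d))
H = X.T @ X / m                      # Hessian of the quadratic loss
eigs = np.linalg.eigvalsh(H)
bulk = int(np.sum(np.abs(eigs) < 1e-10))   # eigenvalues in the zero bulk
print(bulk)                          # at least d - m = 80
```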

Diagonal Rescaling For Neural Networks
We define a second-order neural network stochastic gradient training algorithm whose block-diagonal structure effectively amounts to normalizing the unit activations. Investigating why this algorithm lacks robustness then reveals two interesting insights. The first insight suggests a new way to scale the step sizes, clarifying popular algorithms such as RMSProp as well as old neural network tricks such as fan-in step-size scaling. The second insight stresses the practical importance of dealing with fast changes of the curvature of the cost.
05/25/2017 ∙ by Jean Lafond, et al.
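As a concrete instance of the per-coordinate step-size scaling the first insight refers to, here is a minimal RMSProp-style update on a badly scaled diagonal quadratic (a standard sketch under assumed settings, not the paper's block-diagonal algorithm):

```python
import numpy as np

# RMSProp-style diagonal rescaling: each coordinate's step is divided by
# a running root-mean-square of its own gradients, so well-conditioned
# and ill-conditioned directions make comparable progress.
def rmsprop_step(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    cache = decay * cache + (1 - decay) * grad**2   # running second moment
    w = w - lr * grad / (np.sqrt(cache) + eps)      # per-coordinate scaling
    return w, cache

# Minimize f(w) = 0.5 * w^T D w with curvatures spanning four decades.
D = np.array([100.0, 1.0, 0.01])
w = np.ones(3)
cache = np.zeros(3)
for _ in range(500):
    w, cache = rmsprop_step(w, D * w, cache)        # gradient of f is D w
```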

Wasserstein GAN
We introduce a new algorithm named WGAN, an alternative to traditional GAN training. In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches. Furthermore, we show that the corresponding optimization problem is sound, and provide extensive theoretical work highlighting the deep connections to other distances between distributions.
01/26/2017 ∙ by Martin Arjovsky, et al.
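The critic objective at the heart of WGAN can be illustrated in one dimension with a linear critic and weight clipping to enforce the Lipschitz constraint (a toy sketch with assumed data, not the paper's experiments):

```python
import numpy as np

# WGAN trains a critic f to maximize E[f(real)] - E[f(fake)] subject to
# f being 1-Lipschitz, here enforced by clipping the critic weight.
# With a linear critic f(x) = w * x, the gradient of the objective in w
# is just the difference of sample means.
rng = np.random.default_rng(1)
real = rng.normal(3.0, 1.0, 1000)   # "data" distribution
fake = rng.normal(0.0, 1.0, 1000)   # "generator" distribution
w, c, lr = 0.0, 1.0, 0.1            # critic weight, clip bound, step size
for _ in range(100):
    grad = real.mean() - fake.mean()      # d/dw of E[w*real] - E[w*fake]
    w = np.clip(w + lr * grad, -c, c)     # gradient ascent + weight clipping
distance = w * (real.mean() - fake.mean())  # critic's distance estimate
```

Once the weight saturates at the clip bound, the estimate is proportional to the gap between the two means, which is what makes the resulting learning curve meaningful as the fake distribution approaches the real one.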

Towards Principled Methods for Training Generative Adversarial Networks
The goal of this paper is not to introduce a single algorithm or method, but to make theoretical steps towards fully understanding the training dynamics of generative adversarial networks. In order to substantiate our theoretical analysis, we perform targeted experiments to verify our assumptions, illustrate our claims, and quantify the phenomena. This paper is divided into three sections. The first section introduces the problem at hand. The second section is dedicated to rigorously studying and proving the problems, including instability and saturation, that arise when training generative adversarial networks. The third section examines a practical and theoretically grounded direction towards solving these problems, while introducing new tools to study them.
01/17/2017 ∙ by Martin Arjovsky, et al.

Optimization Methods for Large-Scale Machine Learning
This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications. Through case studies on text classification and the training of deep neural networks, we discuss how optimization problems arise in machine learning and what makes them challenging. A major theme of our study is that large-scale machine learning represents a distinctive setting in which the stochastic gradient (SG) method has traditionally played a central role while conventional gradient-based nonlinear optimization techniques typically falter. Based on this viewpoint, we present a comprehensive theory of a straightforward, yet versatile SG algorithm, discuss its practical behavior, and highlight opportunities for designing algorithms with improved performance. This leads to a discussion about the next generation of optimization methods for large-scale machine learning, including an investigation of two main streams of research on techniques that diminish noise in the stochastic directions and methods that make use of second-order derivative approximations.
06/15/2016 ∙ by Leon Bottou, et al.
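A minimal sketch of the basic SG iteration the survey analyzes, on a noiseless least-squares problem (an illustrative setup, not from the paper):

```python
import numpy as np

# Stochastic gradient: at each step, update with the gradient of a single
# randomly drawn example's loss rather than the gradient of the full sum.
rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true                        # noiseless targets

w = np.zeros(d)
lr = 0.01
for _ in range(5000):
    i = rng.integers(n)               # sample one example uniformly
    grad = (X[i] @ w - y[i]) * X[i]   # gradient of 0.5 * (x_i . w - y_i)^2
    w -= lr * grad
```

Each iteration costs O(d) instead of O(nd) for the full gradient, which is the computational trade-off that makes SG dominant at scale.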

ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems
Quick interaction between a human teacher and a learning machine presents numerous benefits and challenges when working with web-scale data. The human teacher guides the machine towards accomplishing the task of interest. The learning machine leverages big data to find examples that maximize the training value of its interaction with the teacher. When the teacher is restricted to labeling examples selected by the machine, this problem is an instance of active learning. When the teacher can provide additional information to the machine (e.g., suggestions on what examples or predictive features should be used) as the learning task progresses, then the problem becomes one of interactive learning. To accommodate the two-way communication channel needed for efficient interactive learning, the teacher and the machine need an environment that supports an interaction language. The machine can access, process, and summarize more examples than the teacher can see in a lifetime. Based on the machine's output, the teacher can revise the definition of the task or make it more precise. Both the teacher and the machine continuously learn and benefit from the interaction. We have built a platform to (1) produce valuable and deployable models and (2) support research on both the machine learning and user interface challenges of the interactive learning problem. The platform relies on a dedicated, low-latency, distributed, in-memory architecture that allows us to construct web-scale learning machines with quick interaction speed. The purpose of this paper is to describe this architecture and demonstrate how it supports our research efforts. Preliminary results are presented as illustrations of the architecture but are not the primary focus of the paper.
09/16/2014 ∙ by Patrice Simard, et al.

Unifying distillation and privileged information
Distillation (Hinton et al., 2015) and privileged information (Vapnik & Izmailov, 2015) are two techniques that enable machines to learn from other machines. This paper unifies these two techniques into generalized distillation, a framework to learn from multiple machines and data representations. We provide theoretical and causal insight about the inner workings of generalized distillation, extend it to unsupervised, semi-supervised and multi-task learning scenarios, and illustrate its efficacy on a variety of numerical simulations on both synthetic and real-world data.
11/11/2015 ∙ by David Lopez-Paz, et al.
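The distillation half of the framework can be sketched in a few lines: the student matches the teacher's temperature-softened output distribution rather than hard labels (the logit values and temperature below are made up for illustration):

```python
import numpy as np

# Distillation (Hinton et al., 2015): divide the teacher's logits by a
# temperature T > 1 before the softmax, so the "dark knowledge" in the
# small probabilities becomes visible to the student.
def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

teacher_logits = np.array([5.0, 2.0, 0.1])   # hypothetical teacher output
T = 4.0                                      # assumed temperature
soft_targets = softmax(teacher_logits, T)    # smoother than the T=1 output

student_logits = np.array([1.0, 0.5, 0.2])   # hypothetical student output
# Cross-entropy of the student against the teacher's soft targets:
loss = -np.sum(soft_targets * np.log(softmax(student_logits, T)))
```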

No Regret Bound for Extreme Bandits
Algorithms for hyperparameter optimization abound, all of which work well under different and often unverifiable assumptions. Motivated by the general challenge of sequentially choosing which algorithm to use, we study the more specific task of choosing among distributions to use for random hyperparameter optimization. This work is naturally framed in the extreme bandit setting, which deals with sequentially choosing which distribution from a collection to sample in order to minimize (maximize) the single best cost (reward). Whereas the distributions in the standard bandit setting are primarily characterized by their means, a number of subtleties arise when we care about the minimal cost as opposed to the average cost. For example, there may not be a well-defined "best" distribution as there is in the standard bandit setting. The best distribution depends on the rewards that have been obtained and on the remaining time horizon. Whereas in the standard bandit setting, it is sensible to compare policies with an oracle which plays the single best arm, in the extreme bandit setting, there are multiple sensible oracle models. We define a sensible notion of "extreme regret" in the extreme bandit setting, which parallels the concept of regret in the standard bandit setting. We then prove that no policy can asymptotically achieve no extreme regret.
08/12/2015 ∙ by Robert Nishihara, et al.
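The extreme-bandit criterion is easy to state in code: a policy is scored by the single smallest cost it ever observes, not by its average cost (a toy round-robin policy over two assumed cost distributions, purely for illustration):

```python
import numpy as np

# Extreme bandits: at each round, pick one distribution to sample, and
# judge the whole run by the minimum cost seen so far. Note the arm with
# the better *mean* need not be the arm with the better *minimum*.
rng = np.random.default_rng(2)
arms = [
    lambda: rng.exponential(1.0),    # mean 1, but left tail reaches 0
    lambda: rng.uniform(0.5, 1.5),   # mean 1, bounded away from 0
]
best_cost = np.inf
for t in range(1000):
    k = t % len(arms)                # naive round-robin policy
    cost = arms[k]()
    best_cost = min(best_cost, cost) # extreme (min-cost) criterion
```

Here both arms have the same mean, yet only the exponential arm can produce costs near zero, so the minimum-cost objective eventually favors it.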

Discovering Causal Signals in Images
This paper establishes the existence of observable footprints that reveal the "causal dispositions" of the object categories appearing in collections of images. We achieve this goal in two steps. First, we take a learning approach to observational causal discovery, and build a classifier that achieves state-of-the-art performance on finding the causal direction between pairs of random variables, given samples from their joint distribution. Second, we use our causal direction classifier to effectively distinguish between features of objects and features of their contexts in collections of static images. Our experiments demonstrate the existence of a relation between the direction of causality and the difference between objects and their contexts, and by the same token, the existence of observable signals that reveal the causal dispositions of objects.
05/26/2016 ∙ by David Lopez-Paz, et al.

Counterfactual Reasoning and Learning Systems
This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select changes that improve both the short-term and long-term performance of such systems. This work is illustrated by experiments carried out on the ad placement system associated with the Bing search engine.
09/11/2012 ∙ by Leon Bottou, et al.

A Lower Bound for the Optimization of Finite Sums
This paper presents a lower bound for optimizing a finite sum of n functions, where each function is L-smooth and the sum is μ-strongly convex. We show that no algorithm can reach an error ϵ in minimizing all functions from this class in fewer than Ω(n + √(n(κ−1)) log(1/ϵ)) iterations, where κ=L/μ is a surrogate condition number. We then compare this lower bound to upper bounds for recently developed methods specializing to this setting. When the functions involved in this sum are not arbitrary, but based on i.i.d. random data, then we further contrast these complexity results with those for optimal first-order methods to directly optimize the sum. The conclusion we draw is that a lot of caution is necessary for an accurate comparison, and we identify machine learning scenarios where the new methods help computationally.
10/02/2014 ∙ by Alekh Agarwal, et al.
Leon Bottou
After receiving the Diplôme d'Ingénieur from the École Polytechnique (X84) in 1987, the Master of Mathematics, Applied Mathematics and Computer Science from the École Normale Supérieure in 1988, and a PhD in computer science from the University of Paris-Sud in 1991, I went to AT&T Bell Laboratories, AT&T Labs, NEC Labs America, and Microsoft Research. I joined Facebook AI Research in March 2015.