
Learning with Random Learning Rates
Hyperparameter tuning is a bothersome step in the training of deep learn...

Making Deep Q-learning methods robust to time discretization
Despite remarkable successes, Deep Reinforcement Learning (DRL) is not r...

White-box vs Black-box: Bayes Optimal Strategies for Membership Inference
Membership inference determines, given a sample and trained parameters o...

Unbiasing Truncated Backpropagation Through Time
Truncated Backpropagation Through Time (truncated BPTT) is a widespread ...

Unbiased Online Recurrent Optimization
The novel Unbiased Online Recurrent Optimization (UORO) algorithm allows...

Training recurrent networks online without backtracking
We introduce the "NoBackTrack" algorithm to train the parameters of dyna...

Autoencoders: reconstruction versus compression
We discuss the similarities and differences between training an autoenc...

Riemannian metrics for neural networks II: recurrent networks and learning symbolic data sequences
Recurrent neural networks are powerful models for sequential data, able ...

Riemannian metrics for neural networks I: feedforward networks
We describe four algorithms for neural network training, each adapted to...

Online Natural Gradient as a Kalman Filter
We establish a full relationship between Kalman filtering and Amari's na...

Practical Riemannian Neural Networks
We provide the first experimental results on non-synthetic datasets for ...

Speed learning on the fly
The practical performance of online stochastic gradient descent algorith...

Layer-wise learning of deep generative models
When using deep, multilayered architectures to build generative models ...

Objective Improvement in Information-Geometric Optimization
Information-Geometric Optimization (IGO) is a unified framework of stoch...

True Asymptotic Natural Gradient Optimization
We introduce a simple algorithm, True Asymptotic Natural Gradient Optimi...

Adversarial Vulnerability of Neural Networks Increases With Input Dimension
Over the past four years, neural networks have proven vulnerable to adve...

Do Deep Learning Models Have Too Many Parameters? An Information Theory Viewpoint
Deep learning models often have more parameters than observations, and s...

Natural Langevin Dynamics for Neural Networks
One way to avoid overfitting in machine learning is to use model paramet...

Approximate Temporal Difference Learning is a Gradient Descent for Reversible Policies
In reinforcement learning, temporal difference (TD) is the most direct a...

Can recurrent neural networks warp time?
Successful recurrent models such as long short-term memories (LSTMs) and...

Mixed batches and symmetric discriminators for GAN training
Generative adversarial networks (GANs) are powerful generative models ...

The Extended Kalman Filter is a Natural Gradient Descent in Trajectory Space
The extended Kalman filter is perhaps the most standard tool to estimate...

Separating value functions across timescales
In many finite horizon episodic reinforcement learning (RL) settings, it...

Interpreting a Penalty as the Influence of a Bayesian Prior
In machine learning, it is common to optimize the parameters of a probab...

Convergence of Online Adaptive and Recurrent Optimization Algorithms
We prove local convergence of several notable gradient descent algorithms...