
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization
Standard dynamics models for continuous control make use of feedforward ...

Image Super-Resolution via Iterative Refinement
We present SR3, an approach to image Super-Resolution via Repeated Refin...

SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network
We present SpeechStew, a speech recognition model that is trained on a c...

Benchmarks for Deep Off-Policy Evaluation
Off-policy evaluation (OPE) holds the promise of being able to leverage ...

Big Self-Supervised Models Advance Medical Image Classification
Self-supervised pretraining followed by supervised fine-tuning has seen ...

What's in a Loss Function for Image Classification?
It is common to use the softmax cross-entropy loss to train neural netwo...

No MCMC for me: Amortized sampling for fast and stable training of energy-based models
Energy-Based Models (EBMs) present a flexible and appealing way to repre...

Mastering Atari with Discrete World Models
Intelligent agents need to generalize from past experience to achieve go...

WaveGrad: Estimating Gradients for Waveform Generation
This paper introduces WaveGrad, a conditional model for waveform generat...

RL Unplugged: Benchmarks for Offline Reinforcement Learning
Offline methods for reinforcement learning have the potential to help br...

Big Self-Supervised Models are Strong Semi-Supervised Learners
One paradigm for learning from few labeled examples while making best us...

Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
This paper introduces Dynamic Programming Encoding (DPE), a new segmenta...

Non-Autoregressive Machine Translation with Latent Alignments
This paper investigates two latent alignment models for non-autoregressi...

Exemplar VAEs for Exemplar based Generation and Data Augmentation
This paper presents a framework for exemplar based generative modeling, ...

NiLBS: Neural Inverse Linear Blend Skinning
In this technical report, we investigate efficient representations of ar...

SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models
Standard variational lower bounds used to train latent variable models p...

Imputer: Sequence Modelling via Imputation and Dynamic Programming
This paper presents the Imputer, a neural sequence model that generates ...

A Simple Framework for Contrastive Learning of Visual Representations
This paper presents SimCLR: a simple framework for contrastive learning ...

Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One
We propose to reinterpret a standard discriminative classifier of p(y|x)...

NASA: Neural Articulated Shape Approximation
Efficient representation of articulated objects such as human bodies is ...

Dream to Control: Learning Behaviors by Latent Imagination
Learned world models summarize an agent's experience to facilitate learn...

Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse
Posterior collapse in Variational Autoencoders (VAEs) arises when the va...

Efficient Exploration with Self-Imitation Learning via Trajectory-Conditioned Policy
This paper proposes a method for learning a trajectory-conditioned polic...

Striving for Simplicity in Off-policy Deep Reinforcement Learning
Reflecting on the advances of off-policy deep reinforcement learning (RL...

Similarity of Neural Network Representations Revisited
Recent work has sought to understand the behavior of neural networks by ...

Learning to Generalize from Sparse and Underspecified Rewards
We consider the problem of learning from sparse and underspecified rewar...

Understanding the impact of entropy on policy optimization
Entropy regularization is commonly used to improve policy optimization i...

Understanding the impact of entropy in policy learning
Entropy regularization is commonly used to improve policy optimization i...

Contingency-Aware Exploration in Reinforcement Learning
This paper investigates whether learning contingency-awareness and contr...

Sequence to Sequence Mixture Model for Diverse Machine Translation
Sequence to sequence (SEQ2SEQ) models often lack diversity in their gene...

Optimal Completion Distillation for Sequence Learning
We present Optimal Completion Distillation (OCD), a training procedure f...

The Importance of Generation Order in Language Modeling
Neural language models are a critical component of state-of-the-art syst...

Memory Augmented Policy Optimization for Program Synthesis with Generalization
This paper presents Memory Augmented Policy Optimization (MAPO): a novel...

Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning
This paper presents KeypointNet, an end-to-end geometric reasoning frame...

Embedding Text in Hyperbolic Spaces
Natural language text exhibits hierarchical structure in a variety of re...

Parallel Architecture and Hyperparameter Search via Successive Halving and Classification
We present a simple and powerful algorithm for parallel black box optimi...

QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
Current end-to-end machine reading and question answering (Q&A) models a...

Smoothed Action Value Functions for Learning Gaussian Policies
State-action value functions (i.e., Q-values) are ubiquitous in reinforc...

Neural Program Synthesis with Priority Queue Training
We consider the task of program synthesis in the presence of a reward fu...

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Trust region methods, such as TRPO, are often used to stabilize policy o...

Device Placement Optimization with Reinforcement Learning
The past few years have witnessed a growth in size and computational req...

Filtering Variational Objectives
When used as a surrogate objective for maximum likelihood estimation in ...

PixColor: Pixel Recursive Colorization
We propose a novel approach to automatically produce multiple colorized ...

Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Generative models in vision have seen rapid progress due to algorithmic ...

N-gram Language Modeling using Recurrent Neural Network Estimation
We investigate the effective memory depth of RNN models by using them fo...

Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs
We approach structured output prediction by optimizing a deep value netw...

Detecting Cancer Metastases on Gigapixel Pathology Images
Each year, the treatment decisions for more than 230,000 breast cancer p...

Bridging the Gap Between Value and Policy Based Reinforcement Learning
We establish a new connection between value and policy based reinforceme...

Pixel Recursive Super Resolution
We present a pixel recursive super resolution model that synthesizes rea...

Neural Combinatorial Optimization with Reinforcement Learning
This paper presents a framework to tackle combinatorial optimization pro...