
RL Unplugged: Benchmarks for Offline Reinforcement Learning
Offline methods for reinforcement learning have the potential to help br...
Big Self-Supervised Models are Strong Semi-Supervised Learners
One paradigm for learning from few labeled examples while making best us...
Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
This paper introduces Dynamic Programming Encoding (DPE), a new segmenta...
Non-Autoregressive Machine Translation with Latent Alignments
This paper investigates two latent alignment models for non-autoregressi...
Exemplar VAEs for Exemplar based Generation and Data Augmentation
This paper presents a framework for exemplar based generative modeling, ...
NiLBS: Neural Inverse Linear Blend Skinning
In this technical report, we investigate efficient representations of ar...
SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models
Standard variational lower bounds used to train latent variable models p...
Imputer: Sequence Modelling via Imputation and Dynamic Programming
This paper presents the Imputer, a neural sequence model that generates ...
A Simple Framework for Contrastive Learning of Visual Representations
This paper presents SimCLR: a simple framework for contrastive learning ...
Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One
We propose to reinterpret a standard discriminative classifier of p(y|x)...
NASA: Neural Articulated Shape Approximation
Efficient representation of articulated objects such as human bodies is ...
Dream to Control: Learning Behaviors by Latent Imagination
Learned world models summarize an agent's experience to facilitate learn...
Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse
Posterior collapse in Variational Autoencoders (VAEs) arises when the va...
Efficient Exploration with Self-Imitation Learning via Trajectory-Conditioned Policy
This paper proposes a method for learning a trajectory-conditioned polic...
Striving for Simplicity in Off-policy Deep Reinforcement Learning
Reflecting on the advances of off-policy deep reinforcement learning (RL...
Similarity of Neural Network Representations Revisited
Recent work has sought to understand the behavior of neural networks by ...
Learning to Generalize from Sparse and Underspecified Rewards
We consider the problem of learning from sparse and underspecified rewar...
Understanding the impact of entropy on policy optimization
Entropy regularization is commonly used to improve policy optimization i...
Contingency-Aware Exploration in Reinforcement Learning
This paper investigates whether learning contingency-awareness and contr...
Sequence to Sequence Mixture Model for Diverse Machine Translation
Sequence to sequence (SEQ2SEQ) models often lack diversity in their gene...
Optimal Completion Distillation for Sequence Learning
We present Optimal Completion Distillation (OCD), a training procedure f...
The Importance of Generation Order in Language Modeling
Neural language models are a critical component of state-of-the-art syst...
Memory Augmented Policy Optimization for Program Synthesis with Generalization
This paper presents Memory Augmented Policy Optimization (MAPO): a novel...
Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning
This paper presents KeypointNet, an end-to-end geometric reasoning frame...
Embedding Text in Hyperbolic Spaces
Natural language text exhibits hierarchical structure in a variety of re...
Parallel Architecture and Hyperparameter Search via Successive Halving and Classification
We present a simple and powerful algorithm for parallel black box optimi...
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
Current end-to-end machine reading and question answering (Q&A) models a...
Smoothed Action Value Functions for Learning Gaussian Policies
State-action value functions (i.e., Q-values) are ubiquitous in reinforc...
Neural Program Synthesis with Priority Queue Training
We consider the task of program synthesis in the presence of a reward fu...
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Trust region methods, such as TRPO, are often used to stabilize policy o...
Device Placement Optimization with Reinforcement Learning
The past few years have witnessed a growth in size and computational req...
Filtering Variational Objectives
When used as a surrogate objective for maximum likelihood estimation in ...
PixColor: Pixel Recursive Colorization
We propose a novel approach to automatically produce multiple colorized ...
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Generative models in vision have seen rapid progress due to algorithmic ...
N-gram Language Modeling using Recurrent Neural Network Estimation
We investigate the effective memory depth of RNN models by using them fo...
Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs
We approach structured output prediction by optimizing a deep value netw...
Detecting Cancer Metastases on Gigapixel Pathology Images
Each year, the treatment decisions for more than 230,000 breast cancer p...
Bridging the Gap Between Value and Policy Based Reinforcement Learning
We establish a new connection between value and policy based reinforceme...
Pixel Recursive Super Resolution
We present a pixel recursive super resolution model that synthesizes rea...
Neural Combinatorial Optimization with Reinforcement Learning
This paper presents a framework to tackle combinatorial optimization pro...
Improving Policy Gradient by Exploring Underappreciated Rewards
This paper presents a novel form of policy gradient for model-free reinf...
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Neural Machine Translation (NMT) is an end-to-end learning approach for ...
Efficient non-greedy optimization of decision trees
Decision trees and randomized forests are widely used in computer vision...
CO2 Forest: Improved Random Forest by Continuous Optimization of Oblique Splits
We propose a novel algorithm for optimizing multivariate linear threshol...
Fast Exact Search in Hamming Space with Multi-Index Hashing
There is growing interest in representing image data and feature descrip...
