
StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling
There are two major classes of natural language grammars – the dependenc...
Gradient Starvation: A Learning Proclivity in Neural Networks
We identify and formalize a fundamental gradient descent phenomenon resu...
Unsupervised Learning of Dense Visual Representations
Contrastive self-supervised learning has emerged as a promising approach...
NU-GAN: High resolution neural upsampling with GAN
In this paper, we propose NU-GAN, a new method for resampling audio from...
Neural Approximate Sufficient Statistics for Implicit Models
We consider the fundamental problem of how to automatically construct su...
Recursive Top-Down Production for Sentence Generation with Latent Trees
We model the recursive production property of context-free grammars for ...
Supervised Seeded Iterated Learning for Interactive Language Learning
Language drift has been one of the major obstacles to train language mod...
Integrating Categorical Semantics into Unsupervised Domain Translation
While unsupervised domain translation (UDT) has seen a lot of success re...
Data-Efficient Reinforcement Learning with Momentum Predictive Representations
While deep reinforcement learning excels at solving tasks where large am...
Generative Graph Perturbations for Scene Graph Prediction
Inferring objects and their relationships from an image is useful in man...
AR-DAE: Towards Unbiased Neural Entropy Gradient Estimation
Entropy is ubiquitous in machine learning, but it is in general intracta...
Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation
Scene graph generation (SGG) aims to predict graph-structured descriptio...
A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM
We present Korbit, a large-scale, open-domain, mixed-interface, dialogue...
Countering Language Drift with Seeded Iterated Learning
Supervised learning methods excel at capturing statistical properties of...
Pix2Shape – Towards Unsupervised Learning of 3D Scenes from Images using a View-based Representation
We infer and generate three-dimensional (3D) scene information from a si...
Out-of-Distribution Generalization via Risk Extrapolation (REx)
Generalizing outside of the training distribution is an open challenge f...
Augmented Normalizing Flows: Bridging the Gap Between Generative Flows and Latent Variable Models
In this work, we propose a new family of generative flows on an augmente...
CLOSURE: Assessing Systematic Generalization of CLEVR Models
The CLEVR dataset of natural-looking questions about 3D-rendered scenes ...
Selective Brain Damage: Measuring the Disparate Impact of Model Pruning
Neural network pruning techniques have demonstrated it is possible to re...
Ordered Memory
Stack-augmented recurrent neural networks (RNNs) have been of interest t...
Icentia11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery
We release the largest public ECG dataset of continuous raw signals for ...
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Previous works have found that generating coherent raw audio wave...
No Press Diplomacy: Modeling Multi-Agent Gameplay
Diplomacy is a seven-player non-stochastic, non-cooperative game, where ...
VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering
Embodied Question Answering (EQA) is a recently proposed task, where an ...
Detecting semantic anomalies
We critically appraise the recent interest in out-of-distribution (OOD) ...
Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment
This paper provides an empirical evaluation of recently developed explor...
Adversarial Computation of Optimal Transport Maps
Computing optimal transport maps between high-dimensional and continuous...
Investigating Biases in Textual Entailment Datasets
The ability to understand logical relationships between sentences is an ...
Stochastic Neural Network with Kronecker Flow
Recent advances in variational inference enable the modelling of highly ...
Note on the bias and variance of variational inference
In this note, we study the relationship between the variational gap and ...
Batch weight for domain adaptation with mass shift
Unsupervised domain transfer is the task of transferring or translating ...
Hierarchical Importance Weighted Autoencoders
Importance weighted variational inference (Burda et al., 2015) uses mult...
Improved Conditional VRNNs for Video Prediction
Predicting future frames for a video sequence is a challenging generativ...
Counterpoint by Convolution
Machine learning models of music typically break up the task of composit...
Maximum Entropy Generators for Energy-Based Models
Unsupervised learning is about capturing dependencies between variables ...
Deep Generative Modeling of LiDAR Data
Building models capable of generating structured output is a key challen...
Systematic Generalization: What Is Required and Can It Be Learned?
Numerous models for grounded language understanding have been recently p...
Planning in Dynamic Environments with Conditional Autoregressive Models
We demonstrate the use of conditional autoregressive generative models (...
Harmonic Recomposition using Conditional Autoregressive Modeling
We demonstrate a conditional autoregressive pipeline for efficient music...
Representation Mixing for TTS Synthesis
Recent character and phoneme-based parametric TTS systems using deep lea...
Blindfold Baselines for Embodied QA
We explore blindfold (question-only) baselines for Embodied Question Ans...
Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
Recurrent neural network (RNN) models are widely used for processing seq...
On the Learning Dynamics of Deep Neural Networks
While a lot of progress has been made in recent years, the dynamics of l...
Improving Explorability in Variational Inference with Annealed Variational Objectives
Despite the advances in the representational capacity of approximate dis...
Approximate Exploration through State Abstraction
Although exploration in reinforcement learning is well understood from a...
Visual Reasoning with Multi-hop Feature Modulation
Recent breakthroughs in computer vision and natural language processing ...
On the Spectral Bias of Deep Neural Networks
It is well known that over-parametrized deep neural networks (DNNs) are ...
Learning Distributed Representations from Reviews for Collaborative Filtering
Recent work has shown that collaborative filter-based recommender system...
Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer
Deep networks often perform well on the data manifold on which they are ...
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
In this work, we propose a novel constituency parsing scheme. The model ...
