
ReMP: Rectified Metric Propagation for Few-Shot Learning
Few-shot learning features the capability of generalizing from a few exa...
Improving Text Generation with Student-Forcing Optimal Transport
Neural language models are often trained with maximum likelihood estimat...
Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference
The neural attention mechanism plays an important role in many natural l...
Robust Conversational AI with Grounded Text Generation
This article presents a hybrid approach based on a Grounded Text Generat...
Weakly supervised cross-domain alignment with optimal transport
Cross-domain alignment between image objects and text sequences is key t...
Structure-Aware Human-Action Generation
Generating long-range skeleton-based human actions has been a challengin...
SOLOIST: Few-shot Task-Oriented Dialog with A Single Pretrained Autoregressive Model
This paper presents a new method SOLOIST, which uses transfer learning t...
POINTER: Constrained Text Generation via Insertion-based Generative Pretraining
Large-scale pretrained language models, such as BERT and GPT-2, have ac...
Oscar: Object-Semantics Aligned Pretraining for Vision-Language Tasks
Large-scale pretraining methods of learning cross-modal representations...
Optimus: Organizing Sentences via Pretrained Modeling of a Latent Space
When trained effectively, the Variational Autoencoder (VAE) can be both ...
Feature Quantization Improves GAN Training
The instability in GAN training has been a longstanding problem despite...
Multi-View Learning for Vision-and-Language Navigation
Learning to navigate in a visual environment following natural language ...
Survival Cluster Analysis
Conventional survival analysis approaches estimate risk scores or indivi...
Few-shot Natural Language Generation for Task-Oriented Dialog
As a crucial component in task-oriented dialog systems, the Natural Lang...
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pretraining
Learning to navigate in a visual environment following natural-language ...
Straight-Through Estimator as Projected Wasserstein Gradient Flow
The Straight-Through (ST) estimator is a widely used technique for back...
Robust Navigation with Language Pretraining and Stochastic Sampling
Core to the vision-and-language navigation (VLN) challenge is building r...
Implicit Deep Latent Variable Models for Text Generation
Deep latent variable models (LVM) such as variational autoencoder (VAE)...
Twin Auxiliary Classifiers GAN
Conditional generative models enjoy remarkable progress over the past fe...
DoubleTransfer at MEDIQA 2019: Multi-Source Transfer Learning for Natural Language Understanding in the Medical Domain
This paper describes our competing system to enter the MEDIQA-2019 compe...
Towards Amortized Ranking-Critical Training for Collaborative Filtering
Collaborative filtering is widely used in modern recommender systems. Re...
Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing
Variational autoencoders (VAEs) with an autoregressive decoder have bee...
Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
The posteriors over neural network weights are high dimensional and mult...
Adversarial Learning of a Sampler Based on an Unnormalized Distribution
We investigate adversarial learning in the case when only an unnormalize...
Generative Adversarial Network Training is a Continual Learning Problem
Generative Adversarial Networks (GANs) have proven to be a powerful fram...
Policy Optimization as Wasserstein Gradient Flows
Policy optimization is a core component of reinforcement learning (RL), ...
Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms
Many deep learning architectures have been proposed to model the composi...
Joint Embedding of Words and Labels for Text Classification
Word embeddings are effective intermediate representations for capturing...
Measuring the Intrinsic Dimension of Objective Landscapes
Many recently trained neural networks employ large numbers of parameters...
Adversarial Time-to-Event Modeling
Modern health data science applications leverage abundant molecular and ...
Learning Structural Weight Uncertainty for Sequential Decision-Making
Learning probability distributions on the weights of neural networks (NN...
Triangle Generative Adversarial Networks
A Triangle Generative Adversarial Network (Δ-GAN) is developed for semi...
Symmetric Variational Autoencoder and Connections to Adversarial Learning
A new form of the variational autoencoder (VAE) is proposed, based on th...
ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching
We investigate the non-identifiability issues associated with bidirectio...
Continuous-Time Flows for Deep Generative Models
Normalizing flows have been developed recently as a method for drawing s...
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
Recurrent neural networks (RNNs) have shown promising performance for la...
Learning Generic Sentence Representations Using Convolutional Neural Networks
We propose a new encoder-decoder approach to learn distributed sentence ...
Unsupervised Learning with Truncated Gaussian Graphical Models
Gaussian graphical models (GGMs) are widely used for statistical modelin...
Stochastic Gradient MCMC with Stale Gradients
Stochastic gradient MCMC (SG-MCMC) has played an important role in large...
Variational Autoencoder for Deep Learning of Images, Labels and Captions
A novel variational autoencoder is developed to model images, as well as...
Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization
Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayes...
Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks
Effective training of deep neural networks suffers from two main issues....
High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models
Learning in deep models using Bayesian methods has generated significant...
A Deep Generative Deconvolutional Image Model
A deep generative model is developed for representation and analysis of ...
Deep Temporal Sigmoid Belief Networks for Sequence Modeling
Deep dynamic generative models are developed to learn sequential depende...
Chunyuan Li
Research Assistant at Duke University since 2014, Research Intern at Uber 2017, Research Intern at Adobe 2016, Guest Researcher at National Institute of Standards and Technology from 2013 to 2014, Research Intern at INRIA 2013.