
-
Meta Back-translation
Back-translation is an effective strategy to improve the performance of ...
-
Handling Noisy Labels via One-Step Abductive Multi-Target Learning
Learning from noisy labels is an important concern because of the lack o...
-
Rethinking Transformer-based Set Prediction for Object Detection
DETR is a recently proposed Transformer-based method which views object ...
-
On the Sentence Embeddings from Pre-trained Language Models
Pre-trained contextual representations like BERT have achieved great suc...
-
Neural Language Modeling for Contextualized Temporal Graph Generation
This paper presents the first study on using large-scale pre-trained lan...
-
JAKET: Joint Pre-training of Knowledge Graph and Language Understanding
Knowledge graphs (KGs) contain rich information about world knowledge, e...
-
Unsupervised Parallel Corpus Mining on Web Data
With a large amount of parallel data, neural machine translation systems...
-
Kernel Stein Generative Modeling
We are interested in gradient-based Explicit Generative Modeling where s...
-
An EM Approach to Non-autoregressive Conditional Sequence Generation
Autoregressive (AR) models have been the dominating approach to conditio...
-
Generalized Multi-Relational Graph Convolution Network
Graph Convolutional Networks (GCNs) have received increasing attention i...
-
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
With the success of language pretraining, it is highly desirable to deve...
-
Predicting Performance for Natural Language Processing Tasks
Given the complexity of combinations of tasks, languages, and domains in...
-
Politeness Transfer: A Tag and Generate Approach
This paper introduces a new task of politeness transfer which involves c...
-
Practical Comparable Data Collection for Low-Resource Languages via Images
We propose a method of curating high-quality comparable training data fo...
-
Explainable Unsupervised Change-point Detection via Graph Neural Networks
Change-point detection (CPD) aims at detecting the abrupt property chang...
-
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
Natural Language Processing (NLP) has recently achieved great success by...
-
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
We introduce a new task, Video-and-Language Inference, for joint multimo...
-
An Algorithm for Computing a Minimal Comprehensive Gröbner Basis of a Parametric Polynomial System
An algorithm to generate a minimal comprehensive Gröbner basis of a par...
-
Pre-training Tasks for Embedding-based Large-scale Retrieval
We consider the large-scale query-document retrieval problem: given a qu...
-
Graph-Revised Convolutional Network
Graph Convolutional Networks (GCNs) have received increasing attention i...
-
A Re-evaluation of Knowledge Graph Completion Methods
Knowledge Graph Completion (KGC) aims at automatically predicting missin...
-
XL-Editor: Post-editing Sentences with XLNet
While neural sequence generation models achieve initial success for many...
-
Active Learning for Graph Neural Networks via Node Feature Propagation
Graph Neural Networks (GNNs) for prediction tasks like node classificati...
-
Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework
Learning multilingual representations of text has proven a successful me...
-
A Surprisingly Effective Fix for Deep Latent Variable Modeling of Text
When trained effectively, the Variational Autoencoder (VAE) is both a po...
-
XLNet: Generalized Autoregressive Pretraining for Language Understanding
With the capability of modeling bidirectional contexts, denoising autoen...
-
A Modular Deep Learning Approach for Extreme Multi-label Text Classification
Extreme multi-label classification (XMC) aims to assign to an instance t...
-
Implicit Kernel Learning
Kernels are powerful and versatile tools in machine learning and statist...
-
The ARIEL-CMU Systems for LoReHLT18
This paper describes the ARIEL-CMU submissions to the Low Resource Human...
-
Re-examination of the Role of Latent Variables in Sequence Modeling
With latent variables, stochastic recurrent models have achieved state-o...
-
An Adversarial Approach to High-Quality, Sentiment-Controlled Neural Dialogue Generation
In this work, we propose a method for neural dialogue response generatio...
-
Kernel Change-point Detection with Auxiliary Deep Generative Models
Detecting the emergence of abrupt property changes in time series is a c...
-
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Transformer networks have a potential of learning longer-term dependency...
-
Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning
Training task-completion dialogue agents with reinforcement learning usu...
-
Unsupervised Cross-lingual Transfer of Word Embedding Spaces
Cross-lingual transfer of word embeddings aims to establish the semantic...
-
DARTS: Differentiable Architecture Search
This paper addresses the scalability challenge of architecture search by...
-
Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data
How to model distribution of sequential data, including but not limited ...
-
Likelihood Almost Free Inference Networks
Variational inference for latent variable models is prevalent in various...
-
Learning Graph Convolution Filters from Data Manifold
Convolutional Neural Network (CNN) has gained tremendous success in comput...
-
MMD GAN: Towards Deeper Understanding of Moment Matching Network
Generative moment matching network (GMMN) is a deep generative model tha...
-
Data-driven Random Fourier Features using Stein Effect
Large-scale kernel approximation is an important problem in machine lear...
-
Analogical Inference for Multi-Relational Embeddings
Large-scale multi-relational embedding refers to the task of learning th...
-
Cross-lingual Distillation for Text Classification
Cross-lingual text classification (CLTC) is the task of classifying docum...
-
RACE: Large-scale ReAding Comprehension Dataset From Examinations
We present RACE, a new dataset for benchmark evaluation of methods in th...
-
Co-Clustering for Multitask Learning
This paper presents a new multitask learning framework that learns a sha...