
- Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding
  Transformer has become ubiquitous in the deep learning field. One of the...
- Large-Scale Adversarial Training for Vision-and-Language Representation Learning
  We present VILLA, the first known effort on large-scale adversarial trai...
- Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
  Recent Transformer-based large-scale pre-trained models have revolutioni...
- HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
  We present HERO, a Hierarchical EncodeR for Omni-representation learning...
- Distilling the Knowledge of BERT for Text Generation
  Large-scale pre-trained language models, such as BERT, have recently achie...
- DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
  We present a large, tunable neural conversational response generation mo...
- UNITER: Learning UNiversal Image-TExt Representations
  Joint image-text embedding is the bedrock for most Vision-and-Language (...
- Explore, Propose, and Assemble: An Interpretable Model for Multi-Hop Reading Comprehension
  Multi-hop reading comprehension requires the model to explore and connec...
- Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
  Inspired by how humans summarize long documents, we propose an accurate ...