
- NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned
  We review the EfficientQA competition from NeurIPS 2020. The competition...
- A Memory Efficient Baseline for Open Domain Question Answering
  Recently, retrieval systems based on dense representations have led to i...
- Distilling Knowledge from Reader to Retriever for Question Answering
  The task of information retrieval is an important component of many natu...
- Beyond English-Centric Multilingual Machine Translation
  Existing work in translation demonstrated the potential of massively mul...
- Self-training Improves Pre-training for Natural Language Understanding
  Unsupervised pre-training has led to much recent progress in natural lan...
- Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
  Generative models for open domain question answering have proven to be c...
- Training with Quantization Noise for Extreme Model Compression
  We tackle the problem of producing compact models, maximizing their accu...
- Training with Quantization Noise for Extreme Fixed-Point Compression
  We tackle the problem of producing compact models, maximizing their accu...
- Accessing Higher-level Representations in Sequential Transformers with Feedback Memory
  Transformers are feedforward networks that can process input tokens in p...
- End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
  We study ResNet-, Time-Depth Separable ConvNets-, and Transformer-based ...
- CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB
  We show that margin-based bitext mining in a multilingual sentence space...
- Unsupervised Cross-lingual Representation Learning at Scale
  This paper shows that pretraining multilingual language models at scale ...
- CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
  Pre-training of text representations has led to significant improvements i...
- Depth-Adaptive Transformer
  State of the art sequence-to-sequence models perform a fixed number of c...
- Updating Pre-trained Word Vectors and Text Classifiers using Monolingual Alignment
  In this paper, we focus on the problem of adapting word vector-based mod...
- Reducing Transformer Depth on Demand with Structured Dropout
  Overparameterized transformer networks have obtained state of the art re...
- Don't Forget the Long Tail! A Comprehensive Analysis of Morphological Generalization in Bilingual Lexicon Induction
  Human translators routinely have to translate rare inflections of words ...
- Augmenting Self-attention with Persistent Memory
  Transformer networks have led to important progress in language modelin...
- Misspelling Oblivious Word Embeddings
  In this paper, we present a method to learn word embeddings that are resi...
- Adaptive Attention Span in Transformers
  We propose a novel self-attention mechanism that can learn its optimal a...
- Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling
  Work on the problem of contextualized word representation -- the develop...
- Unsupervised Hyperalignment for Multilingual Word Embeddings
  We consider the problem of aligning continuous word representations, lea...
- Unsupervised Alignment of Embeddings with Wasserstein Procrustes
  We consider the task of aligning two sets of points in high dimension, w...
- Improving Supervised Bilingual Mapping of Word Embeddings
  Continuous word representations, learned on different languages, can be ...
- Lightweight Adaptive Mixture of Neural and N-gram Language Models
  It is often the case that the best performing language model is an ensem...
- Colorless green recurrent networks dream hierarchically
  Recurrent neural networks (RNNs) have achieved impressive results in a v...
- Learning Word Vectors for 157 Languages
  Distributed word representations, or word vectors, have recently been ap...
- Advances in Pre-Training Distributed Word Representations
  Many Natural Language Processing applications nowadays rely on pre-train...
- Unbounded cache model for online language modeling with open vocabulary
  Recently, continuous cache models were proposed as extensions to recurre...
- Fast Linear Model for Knowledge Graph Embeddings
  This paper shows that a simple baseline based on a Bag-of-Words (BoW) re...
- Parseval Networks: Improving Robustness to Adversarial Examples
  We introduce Parseval networks, a form of deep neural networks in which ...
- Improving Neural Language Models with a Continuous Cache
  We propose an extension to neural network language models to adapt their...
- FastText.zip: Compressing text classification models
  We consider the problem of producing compact architectures for text clas...
- Variable Computation in Recurrent Neural Networks
  Recurrent neural networks (RNNs) have been used extensively and with inc...
- Efficient softmax approximation for GPUs
  We propose an approximate strategy to efficiently train neural network b...
- Enriching Word Vectors with Subword Information
  Continuous word representations, trained on large unlabeled corpora, are ...
- Bag of Tricks for Efficient Text Classification
  This paper explores a simple and efficient baseline for text classificat...
- Longitudinal Analysis of Discussion Topics in an Online Breast Cancer Community using Convolutional Neural Networks
  Identifying topics of discussions in online health communities (OHC) is ...
- Trace Lasso: a trace norm regularization for correlated designs
  Using the ℓ_1-norm to regularize the estimation of the parameter vector ...