
- NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned
  We review the EfficientQA competition from NeurIPS 2020. The competition...
- A Memory Efficient Baseline for Open Domain Question Answering
  Recently, retrieval systems based on dense representations have led to i...
- Distilling Knowledge from Reader to Retriever for Question Answering
  The task of information retrieval is an important component of many natu...
- Beyond English-Centric Multilingual Machine Translation
  Existing work in translation demonstrated the potential of massively mul...
- Self-training Improves Pre-training for Natural Language Understanding
  Unsupervised pre-training has led to much recent progress in natural lan...
- Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
  Generative models for open domain question answering have proven to be c...
- Training with Quantization Noise for Extreme Model Compression
  We tackle the problem of producing compact models, maximizing their accu...
- Training with Quantization Noise for Extreme Fixed-Point Compression
  We tackle the problem of producing compact models, maximizing their accu...
- Accessing Higher-level Representations in Sequential Transformers with Feedback Memory
  Transformers are feedforward networks that can process input tokens in p...
- End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
  We study ResNet-, Time-Depth Separable ConvNets-, and Transformer-based ...
- CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB
  We show that margin-based bitext mining in a multilingual sentence space...
- Unsupervised Cross-lingual Representation Learning at Scale
  This paper shows that pretraining multilingual language models at scale ...
- CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
  Pre-training of text representations has led to significant improvements i...
- Depth-Adaptive Transformer
  State of the art sequence-to-sequence models perform a fixed number of c...
- Updating Pre-trained Word Vectors and Text Classifiers using Monolingual Alignment
  In this paper, we focus on the problem of adapting word vector-based mod...
- Reducing Transformer Depth on Demand with Structured Dropout
  Overparameterized transformer networks have obtained state of the art re...
- Don't Forget the Long Tail! A Comprehensive Analysis of Morphological Generalization in Bilingual Lexicon Induction
  Human translators routinely have to translate rare inflections of words ...
- Augmenting Self-attention with Persistent Memory
  Transformer networks have led to important progress in language modelin...
- Misspelling Oblivious Word Embeddings
  In this paper, we present a method to learn word embeddings that are resi...
- Adaptive Attention Span in Transformers
  We propose a novel self-attention mechanism that can learn its optimal a...
- Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling
  Work on the problem of contextualized word representation -- the develop...
- Unsupervised Hyperalignment for Multilingual Word Embeddings
  We consider the problem of aligning continuous word representations, lea...
- Unsupervised Alignment of Embeddings with Wasserstein Procrustes
  We consider the task of aligning two sets of points in high dimension, w...
- Improving Supervised Bilingual Mapping of Word Embeddings
  Continuous word representations, learned on different languages, can be ...
- Lightweight Adaptive Mixture of Neural and N-gram Language Models
  It is often the case that the best performing language model is an ensem...
- Colorless green recurrent networks dream hierarchically
  Recurrent neural networks (RNNs) have achieved impressive results in a v...
- Learning Word Vectors for 157 Languages
  Distributed word representations, or word vectors, have recently been ap...
- Advances in Pre-Training Distributed Word Representations
  Many Natural Language Processing applications nowadays rely on pre-train...
- Unbounded cache model for online language modeling with open vocabulary
  Recently, continuous cache models were proposed as extensions to recurre...
- Fast Linear Model for Knowledge Graph Embeddings
  This paper shows that a simple baseline based on a Bag-of-Words (BoW) re...
- Parseval Networks: Improving Robustness to Adversarial Examples
  We introduce Parseval networks, a form of deep neural networks in which ...
- Improving Neural Language Models with a Continuous Cache
  We propose an extension to neural network language models to adapt their...
- FastText.zip: Compressing text classification models
  We consider the problem of producing compact architectures for text clas...
- Variable Computation in Recurrent Neural Networks
  Recurrent neural networks (RNNs) have been used extensively and with inc...
- Efficient softmax approximation for GPUs
  We propose an approximate strategy to efficiently train neural network b...
- Enriching Word Vectors with Subword Information
  Continuous word representations, trained on large unlabeled corpora, are ...
- Bag of Tricks for Efficient Text Classification
  This paper explores a simple and efficient baseline for text classificat...
- Longitudinal Analysis of Discussion Topics in an Online Breast Cancer Community using Convolutional Neural Networks
  Identifying topics of discussions in online health communities (OHC) is ...
- Trace Lasso: a trace norm regularization for correlated designs
  Using the ℓ_1-norm to regularize the estimation of the parameter vector ...