We present Belebele, a multiple-choice machine reading comprehension (MR...
In this work, we develop and release Llama 2, a collection of pretrained...
We present a theory for the previously unexplained divergent behavior no...
We introduce LLaMA, a collection of foundation language models ranging f...
Large multilingual language models typically rely on a single vocabulary...
Generative language models define distributions over sequences of tokens...
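For readers skimming this list, the factorization such generative models share is the standard autoregressive chain rule (a textbook identity, not a claim specific to this abstract):

$$p(x_1, \dots, x_T) = \prod_{t=1}^{T} p(x_t \mid x_{<t})$$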
Recently, self-supervised learning has seen explosive growth and use in v...
We present BlenderBot 3, a 175B parameter dialogue model capable of open...
Prior work on language model pre-training has explored different archite...
Multilingual pre-trained models are known to suffer from the curse of mu...
Large language models, which are often trained for hundreds of thousands...
A multilingual tokenizer is a fundamental component of multilingual neur...
In this paper, we evaluate the performance of graph neural networks...
We introduce CM3, a family of causally masked generative models trained ...
Mixture of Experts layers (MoEs) enable efficient scaling of language mo...
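As a quick illustration of why MoE layers scale parameter count without scaling per-token compute, here is a minimal top-1-routed MoE feed-forward layer in PyTorch; the class name, sizes, and routing details are illustrative assumptions, not this paper's implementation:

```python
# Minimal sketch of a Mixture-of-Experts feed-forward layer with top-1 routing.
# Each token is processed by a single expert, so adding experts grows capacity
# while per-token compute stays roughly that of one feed-forward block.
# All names and dimensions here are hypothetical, for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # per-token gating scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); send each token to its highest-scoring expert
        gates = F.softmax(self.router(x), dim=-1)   # (tokens, n_experts)
        weight, idx = gates.max(dim=-1)             # top-1 gate value and expert id
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                # scale by the gate so the routing decision stays differentiable
                out[mask] = weight[mask, None] * expert(x[mask])
        return out

tokens = torch.randn(8, 16)
moe = Top1MoE(d_model=16, d_hidden=64, n_experts=4)
print(moe(tokens).shape)  # torch.Size([8, 16])
```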
Large-scale autoregressive language models such as GPT-3 are few-shot le...
This paper presents XLS-R, a large-scale model for cross-lingual speech...
In this paper, we describe our end-to-end multilingual speech translatio...
One of the biggest challenges hindering progress in low-resource and mul...
The scarcity of parallel data is a major obstacle for training high-qual...
Recent work has demonstrated the effectiveness of cross-lingual language...
We introduce a new balanced assignment of experts (BASE) layer for large...
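To make the phrase "balanced assignment" concrete, here is a minimal sketch in the spirit of a BASE layer: token-to-expert routing is posed as a linear assignment problem so every expert receives exactly the same number of tokens. The function name and the use of scipy's assignment solver are assumptions for illustration (the paper itself may use a different solver, e.g. an auction-style algorithm):

```python
# Sketch: balanced token-to-expert assignment via linear assignment.
# Instead of letting each token greedily pick its top expert (which can
# overload a few experts), assign tokens so that load is perfectly equal.
import numpy as np
from scipy.optimize import linear_sum_assignment

def balanced_assign(scores: np.ndarray) -> np.ndarray:
    """scores: (n_tokens, n_experts) token-expert affinities; assumes
    n_tokens is divisible by n_experts. Returns one expert id per token
    under an equal-load constraint. Hypothetical helper, not the paper's API."""
    n_tokens, n_experts = scores.shape
    cap = n_tokens // n_experts
    # Replicate each expert column `cap` times so the problem becomes a
    # one-to-one assignment; maximize affinity by minimizing its negative.
    cost = -np.repeat(scores, cap, axis=1)        # (n_tokens, n_tokens)
    rows, cols = linear_sum_assignment(cost)
    assignment = np.empty(n_tokens, dtype=int)
    assignment[rows] = cols // cap                 # replicated slot -> expert id
    return assignment

scores = np.random.rand(8, 4)
print(balanced_assign(scores))  # each of the 4 experts appears exactly twice
```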
We present mGENRE, a sequence-to-sequence system for the Multilingual En...
This paper describes Facebook AI's submission to WMT20 shared news trans...
Existing work in translation has demonstrated the potential of massively mul...
Although widely adopted, existing approaches for fine-tuning pre-trained...
Recent work demonstrates the potential of multilingual pretraining of cr...
Large pre-trained language models have been shown to store factual knowl...
Building open-domain chatbots is a challenging area for machine learning...
This paper demonstrates that multilingual denoising pre-training produce...
This paper shows that pretraining multilingual language models at scale ...
We present BART, a denoising autoencoder for pretraining sequence-to-seq...
Language model pretraining has led to significant performance gains but...
Language change is a complex social phenomenon, revealing pathways of co...