Visual document understanding is a complex task that involves analyzing ...
A big convergence of language, multimodal perception, action, and world ...
Language models have steadily increased in size over the past few years....
Position modeling plays a critical role in Transformers. In this paper, ...
Large Transformers have achieved state-of-the-art performance across man...
Language models (LMs) are becoming the foundation for almost all major l...
In this paper, we elaborate upon recipes for building multilingual repre...
A critical component of a successful language generation pipeline is the...
A big convergence of model architectures across language, vision, speech...
A multilingual tokenizer is a fundamental component of multilingual neur...
Neural Machine Translation (NMT) models are typically trained on heterog...
We aim to investigate the performance of current OCR systems on low reso...
Large-scale autoregressive language models such as GPT-3 are few-shot le...
Recent work in multilingual machine translation (MMT) has focused on the...
Sentence-level quality estimation (QE) of machine translation is traditi...
Cross-lingual document representations enable language understanding in ...
One of the biggest challenges hindering progress in low-resource and mul...
The scarcity of parallel data is a major obstacle for training high-qual...
Pretrained multilingual models are able to perform cross-lingual transfe...
Quality estimation aims to measure the quality of translated content wit...
Existing work in translation demonstrated the potential of massively mul...
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality E...
Unsupervised pre-training has led to much recent progress in natural lan...
Recent work demonstrates the potential of multilingual pretraining of cr...
Quality Estimation (QE) is an important component in making Machine Tran...
Cross-lingual document alignment aims to identify pairs of documents in ...
This paper shows that pretraining multilingual language models at scale ...
Pre-training text representations has led to significant improvements i...
This paper describes Facebook AI's submission to the WAT 2019 Myanmar-En...
We present an approach based on multilingual sentence embeddings to auto...
In this paper, we describe our submission to the WMT19 low-resource para...
The vast majority of language pairs in the world are low-resource becaus...