The BigScience Workshop was a value-driven initiative that spanned one a...
Extractive summarization systems are known to produce poorly coherent an...
What are the units of text that we want to model? From bytes to multi-wo...
Softmax is the de facto standard in modern neural networks for language
...
The power of natural language generation models has provoked a flurry of...
We consider the problem of multilingual unsupervised machine translation...
As neural machine translation (NMT) systems become an important part of
...
Summarization systems face the core challenge of identifying and selecti...
There is an ongoing debate in the NLP community whether modern language
...
We release a multilingual neural machine translation model, which can be...
We address the problem of unsupervised abstractive summarization of
coll...
Character-based translation has several appealing advantages, but its
pe...
We present a solution to scale spectral algorithms for learning sequence...
More and more processes governing our lives use in some part an automati...
The Smallest Grammar Problem -- the problem of finding the smallest
cont...
Identifying the language of social media messages is an important first ...
Clustering web documents has numerous applications, such as aggregating ...