The legality of training language models (LMs) on copyrighted or otherwi...
In today's machine learning (ML) models, any part of the training data c...
Large language models are typically trained densely: all parameters are ...
Changing how pre-trained models behave – e.g., improving their performan...
When fine-tuning large neural networks, it is common to use multiple nod...
We present M2D2, a fine-grained, massively multi-domain corpus for study...
We present Branch-Train-Merge (BTM), a communication-efficient algorithm...
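A minimal sketch of one way independently trained expert LMs could be merged by parameter averaging; the uniform weighting and the `state_dict`-style interface are illustrative assumptions, not the paper's implementation.

```python
import torch

def merge_expert_lms(expert_state_dicts, weights=None):
    """Average the parameters of several independently trained expert LMs.

    Uniform weights are an assumption here; the weighting scheme (and whether
    to average parameters or ensemble predictions) is left as a design choice.
    """
    if weights is None:
        weights = [1.0 / len(expert_state_dicts)] * len(expert_state_dicts)
    merged = {}
    for name, param in expert_state_dicts[0].items():
        if not torch.is_floating_point(param):
            merged[name] = param.clone()  # copy integer buffers unchanged
            continue
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, expert_state_dicts))
    return merged
```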
We introduce kNN-Prompt, a simple and effective technique to use k-neare...
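A minimal sketch of the k-nearest-neighbor idea referenced above, assuming a kNN-LM-style setup: retrieve the closest context vectors from a datastore and interpolate the resulting next-token distribution with the LM's own distribution. The function names, L2 distance, and interpolation weight are assumptions for illustration.

```python
import numpy as np

def knn_next_token_probs(query_vec, keys, values, vocab_size, k=8, temperature=1.0):
    """Build a next-token distribution from the k nearest datastore entries.

    keys: (N, d) context vectors from a frozen LM; values: (N,) next-token ids.
    """
    dists = np.linalg.norm(keys - query_vec, axis=1)   # distance to every stored context
    nearest = np.argsort(dists)[:k]                    # indices of the k closest contexts
    weights = np.exp(-dists[nearest] / temperature)    # closer neighbors get more mass
    weights /= weights.sum()
    probs = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        probs[values[idx]] += w                        # mass on each neighbor's next token
    return probs

def interpolate(lm_probs, knn_probs, lam=0.3):
    """Mix the parametric LM distribution with the retrieval-based one."""
    return (1 - lam) * lm_probs + lam * knn_probs
```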
Language models increasingly rely on massive web dumps for diverse text ...
When an NLP model is trained on text data from one time period and teste...
Research in NLP is often supported by experimental results, and improved...
We introduce a new domain expert mixture (DEMix) layer that enables cond...
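A minimal sketch of a domain-expert feedforward layer, assuming one expert per training domain that is selected by a domain label rather than a learned router; the layer sizes and interface are illustrative assumptions.

```python
import torch.nn as nn

class DomainExpertFFN(nn.Module):
    """Feedforward block with one expert per domain; the input's domain label
    picks which expert runs (no learned routing in this sketch)."""

    def __init__(self, d_model=512, d_hidden=2048, num_domains=8):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_domains)
        )

    def forward(self, hidden_states, domain_id):
        # hidden_states: (batch, seq_len, d_model); domain_id indexes the expert.
        return self.experts[domain_id](hidden_states)
```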
Human evaluations are typically considered the gold standard in natural ...
Language models (LMs) must be both safe and equitable to be responsibly ...
Pretrained neural language models (LMs) are prone to generating racist, ...
Language models pretrained on text from a wide variety of sources form t...
Research in natural language processing proceeds, in part, by demonstrat...
We introduce VAMPIRE, a lightweight pretraining framework for effective ...
Large-scale datasets for natural language inference are created by prese...