Currently, most machine learning models are trained by centralized teams...
Sparsely activated neural networks with conditional computation learn to...
Transfer learning - i.e., further fine-tuning a pre-trained model on a d...
The current trend of scaling language models involves increasing both pa...
Research on neural networks has largely focused on understanding a singl...
Pretraining has been shown to scale well with compute, data size and dat...
While large language models (LLMs) have proven to be effective on a larg...
The internet contains a wealth of knowledge – from the birthdays of hist...
Multitask prompted finetuning (MTF) has been shown to help large languag...
The crystallization of modeling methods around the Transformer architect...
Deep learning models struggle with compositional generalization, i.e. th...
The NP-hard problem of optimizing a shallow ReLU network can be characte...
Large language models such as GPT-3 (Brown et al., 2020) can perform arb...
Many NLP tasks benefit from using large language models (LLMs) that ofte...
Getting the most out of limited resources allows advances in natural lan...
Few-shot in-context learning (ICL) enables pre-trained language models t...
Large pretrained Transformer language models have been shown to exhibit...
Recent neural network-based language models have benefited greatly from...
Past work has shown that large language models are susceptible to privac...
PromptSource is a system for creating, sharing, and using natural langua...
What are the units of text that we want to model? From bytes to multi-wo...
During typical gradient-based training of deep neural networks, all of t...
Transfer learning provides a way of leveraging knowledge from one task w...
Large language models have recently been shown to attain reasonable zero...
NLP has achieved great progress in the past decade through the use of ne...
Many recent developments on generative models for natural images have re...
Most widely-used pre-trained language models operate on sequences of tok...
Recently, pre-trained language models (LMs) have achieved strong perform...
The research community has proposed copious modifications to the Transfo...
We review the EfficientQA competition from NeurIPS 2020. The competition...
It has become common to publish large (billion parameter) language model...
The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified
...
Neural networks have recently achieved human-level performance on variou...
There has been an ongoing cycle where stronger defenses against adversar...
We introduce a simple (one line of code) modification to the Generative...
It has recently been observed that neural language models trained on uns...
Semi-supervised learning (SSL) provides an effective means of leveraging...
For many evaluation metrics commonly used as benchmarks for unconditiona...
We improve the recently-proposed "MixMatch" semi-supervised learning alg...
Transfer learning, where a model is first pre-trained on a data-rich tas...
Adversarial examples raise questions about whether neural network models...
Simultaneous machine translation begins to translate each source sentenc...
Semi-supervised learning has proven to be a powerful paradigm for levera...
Adversarial examples are inputs to machine learning models designed by a...
Lingvo is a Tensorflow framework offering a complete solution for collab...
Autoencoders provide a powerful framework for learning compressed repres...
Discovering and exploring the underlying structure of multi-instrumental...
Semi-supervised learning (SSL) provides a powerful framework for leverag...
The Variational Autoencoder (VAE) has proven to be an effective model fo...
Recent work (Pennington et al., 2017) suggests that controlling the entir...