Previous research observed accuracy degradation when replacing the atten...
We introduce OpenFlamingo, a family of autoregressive vision-language mo...
Large multimodal datasets have been instrumental in recent breakthroughs...
We introduce new methods for 1) accelerating and 2) stabilizing training...
The transfer learning paradigm of model pre-training and subsequent fine...
Scaling up neural networks has led to remarkable performance across a wi...
Changing how pre-trained models behave – e.g., improving their performan...
We conduct a large empirical evaluation to investigate the landscape of...
When fine-tuning large neural networks, it is common to use multiple nod...
Groundbreaking language-vision architectures like CLIP and DALL-E proved...
Open-vocabulary models like CLIP achieve high accuracy across many image...
Web-crawled datasets have enabled remarkable generalization capabilities...
Contrastively trained image-text models such as CLIP, ALIGN, and BASIC h...
Households across the world contain arbitrary objects: from mate gourds ...
The conventional recipe for maximizing model accuracy is to (1) train mu...
Large pre-trained models such as CLIP offer consistent accuracy across a...
Recent observations have advanced our understanding of the neural networ...
Although sparse neural networks have been studied extensively, the focus...
We present the Supermasks in Superposition (SupSup) model, capable of se...
Sparsity in Deep Neural Networks (DNNs) is studied extensively with the ...
Training a neural network is synonymous with learning the values of the...
The success of neural networks has driven a shift in focus from feature...
Learning is an inherently continuous phenomenon. When humans learn a new...