Scaling laws have been recently employed to derive compute-optimal model...
The observation and description of collective excitations in solids is a...
There has been a recent explosion of computer vision models which perfor...
We propose a simple pairwise sigmoid loss for image-text pre-training. U...
Misalignment between model predictions and intended usage can be detrime...
Vision Transformers convert images to sequences by slicing them into pat...
Effective scaling and a flexible task interface enable large language mo...
We introduce UViM, a unified approach capable of modeling a wide range o...
It is commonly accepted that the Vision Transformer model requires
sophi...
We consider the problem of revenue-maximizing Bayesian auction design wi...
This paper presents contrastive-tuning, a simple method employing contra...
Vision Transformers (ViT) have been shown to attain highly competitive
p...
There is a growing discrepancy in computer vision between large-scale mo...
Attention-based neural networks such as the Vision Transformer (ViT) hav...
Convolutional Neural Networks (CNNs) are the go-to model for computer vi...
Before deploying machine learning models it is critical to assess their
...
While the Transformer architecture has become the de-facto standard for
...
Modern deep convolutional networks (CNNs) are often criticized for not
g...
Yes, and no. We ask whether recent progress on the ImageNet classificati...
Transfer of pre-trained representations improves sample efficiency and
s...
This work tackles the problem of semi-supervised learning of image
class...
Unsupervised visual representation learning remains a largely unsolved
p...
In this paper we propose a new model for detecting visual relationships....
We develop a probabilistic technique for colorizing grayscale natural im...
We study probabilistic models of natural images and extend the autoregre...
A major open problem on the road to artificial intelligence is the
devel...
We introduce a new loss function for the weakly-supervised training of
s...
Challenging computer vision tasks, in particular semantic image segmenta...