DreamFusion has recently demonstrated the utility of a pre-trained
text-...
This paper summarizes model improvements and inference-time optimization...
Although neural radiance fields (NeRF) have shown impressive advances fo...
While language tasks are naturally expressed in a single, unified, model...
Thin, reflective objects such as forks and whisks are common in our dail...
We design an open-vocabulary image segmentation model to organize an ima...
This work presents a simple vision transformer design as a strong baseli...
Despite the fast progress in training specialized models for various tas...
3D perception of object shapes from RGB image input is fundamental towar...
Object proposals have become an integral preprocessing steps of many vis...
Does having visual priors (e.g. the ability to detect objects) facilitat...
The speed-accuracy Pareto curve of object detection systems have advance...
Recent advances in image synthesis enables one to translate images by
le...
Zero-shot image classification has made promising progress by training t...
Novel computer vision architectures monopolize the spotlight, but the im...
We present BoTNet, a conceptually simple yet powerful backbone architect...
Building instance segmentation models that are data-efficient and can ha...
We present iNeRF, a framework that performs pose estimation by "invertin...
Recently, SpineNet has demonstrated promising results on object detectio...
Object recognition has seen significant progress in the image domain, wi...
Pre-training is a dominant paradigm in computer vision. For example,
sup...
We leverage unsupervised learning of depth, egomotion, and camera intrin...
Convolutional neural networks typically encode an input image into a ser...
Despite the blooming success of architecture search for vision tasks in
...
Data augmentation is a critical component of training deep learning mode...
Current state-of-the-art convolutional architectures for object detectio...
Instance segmentation aims to detect and segment individual objects in a...
With the rapid increase of large-scale, real-world datasets, it becomes
...
Deep neural networks often work well when they are over-parameterized an...
The highest accuracy object detectors to date are based on a two-stage
a...
Feature pyramids are a basic component in recognition systems for detect...
The recent COCO object detection dataset presents several new challenges...
Object segmentation requires both object-level information and low-level...
In this paper we describe the Microsoft COCO Caption dataset and evaluat...
We present a new dataset with the goal of advancing the state-of-the-art...