
-
Distribution-Free, Risk-Controlling Prediction Sets
To communicate instance-wise uncertainty for prediction tasks, we show h...
read it
-
Reconstructing Hand-Object Interactions in the Wild
In this work we explore reconstructing hand-object interactions in the w...
read it
-
Human Mesh Recovery from Multiple Shots
Videos from edited media like movies are a useful, yet under-explored so...
read it
-
From Goals, Waypoints Paths To Long Term Human Trajectory Forecasting
Human trajectory forecasting is an inherently multi-modal problem. Uncer...
read it
-
Better Knowledge Retention through Metric Learning
In continual learning, new categories may be introduced over time, and a...
read it
-
Robust Policies via Mid-Level Visual Representations: An Experimental Study in Manipulation and Navigation
Vision-based robotics often separates the control loop into one module f...
read it
-
Rearrangement: A Challenge for Embodied AI
We describe a framework for research and evaluation in Embodied AI. Our ...
read it
-
Shape, Illumination, and Reflectance from Shading
A fundamental problem in computer vision is that of inferring the intrin...
read it
-
Uncertainty Sets for Image Classifiers using Conformal Prediction
Convolutional image classifiers can achieve high predictive accuracy, bu...
read it
-
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks
Learning long-term dynamics models is the key to understanding physical ...
read it
-
Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild
We present a method that infers spatial arrangements and shapes of human...
read it
-
Shape and Viewpoint without Keypoints
We present a learning framework that learns to recover the 3D shape, pos...
read it
-
3D Shape Reconstruction from Vision and Touch
When a toddler is presented a new toy, their instinctual behaviour is to...
read it
-
Long-term Human Motion Prediction with Scene Context
Human movement is goal-directed and influenced by the spatial layout of ...
read it
-
Deep Isometric Learning for Visual Recognition
Initialization, normalization, and skip connections are believed to be t...
read it
-
Robust Learning Through Cross-Task Consistency
Visual perception entails solving a wide set of tasks, e.g., object dete...
read it
-
State-Only Imitation Learning for Dexterous Manipulation
Dexterous manipulation has been a long-standing challenge in robotics. R...
read it
-
Inclusive GAN: Improving Data and Minority Coverage in Generative Models
Generative Adversarial Networks (GANs) have brought about rapid progress...
read it
-
Multimodal Image Synthesis with Conditional Implicit Maximum Likelihood Estimation
Many tasks in computer vision and graphics fall within the framework of ...
read it
-
Audiovisual SlowFast Networks for Video Recognition
We present Audiovisual SlowFast Networks, an architecture for integrated...
read it
-
Side-Tuning: Network Adaptation via Additive Side Networks
When training a neural network for a desired task, one may prefer to ada...
read it
-
Learning to Navigate Using Mid-Level Visual Priors
How much does having visual priors about the world (e.g. the fact that t...
read it
-
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera
A comprehensive semantic understanding of a scene is important for many ...
read it
-
Predicting 3D Human Dynamics from Video
Given a video of a person in action, we can easily guess the 3D future m...
read it
-
Learning Individual Styles of Conversational Gesture
Human speech is often accompanied by hand and arm gestures. Given audio ...
read it
-
Mesh R-CNN
Rapid advances in 2D perception have led to systems that accurately dete...
read it
-
Learning Navigation Subroutines by Watching Videos
Hierarchies are an effective way to boost sample efficiency in reinforce...
read it
-
Which Tasks Should Be Learned Together in Multi-task Learning?
Many computer vision applications require solving multiple tasks in real...
read it
-
ShapeMask: Learning to Segment Novel Objects by Refining Shape Priors
Instance segmentation aims to detect and segment individual objects in a...
read it
-
Habitat: A Platform for Embodied AI Research
We present Habitat, a new platform for research in embodied artificial i...
read it
-
Combining Optimal Control and Learning for Visual Navigation in Novel Environments
Model-based control is a popular paradigm for robot navigation because i...
read it
-
Trajectory Normalized Gradients for Distributed Optimization
Recently, researchers proposed various low-precision gradient compressio...
read it
-
Learning Independent Object Motion from Unlabelled Stereoscopic Videos
We present a system for learning motion of independently moving objects ...
read it
-
Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Active Tasks
One of the ultimate promises of computer vision is to help robotic agent...
read it
-
Non-Adversarial Image Synthesis with Generative Latent Nearest Neighbors
Unconditional image generation has recently been dominated by generative...
read it
-
SlowFast Networks for Video Recognition
We present SlowFast networks for video recognition. Our model involves (...
read it
-
Learning 3D Human Dynamics from Video
From an image of a person in action, we can easily guess the 3D motion o...
read it
-
Visual Memory for Robust Path Following
Humans routinely retrace paths in a novel environment both forwards and ...
read it
-
Are All Training Examples Created Equal? An Empirical Study
Modern computer vision algorithms often rely on very large training data...
read it
-
On the Implicit Assumptions of GANs
Generative adversarial nets (GANs) have generated a lot of excitement. D...
read it
-
Diverse Image Synthesis from Semantic Layouts via Conditional IMLE
Most existing methods for conditional image synthesis are only able to g...
read it
-
SFV: Reinforcement Learning of Physical Skills from Videos
Data-driven character animation based on motion capture can produce high...
read it
-
Super-Resolution via Conditional Implicit Maximum Likelihood Estimation
Single-image super-resolution (SISR) is a canonical problem with diverse...
read it
-
Implicit Maximum Likelihood Estimation
Implicit probabilistic models are models defined naturally in terms of a...
read it
-
Cost-Sensitive Active Learning for Intracranial Hemorrhage Detection
Deep learning for clinical applications is subject to stringent performa...
read it
-
Gibson Env: Real-World Perception for Embodied Agents
Developing visual perception models for active agents and sensorimotor c...
read it
-
On Evaluation of Embodied Navigation Agents
Skillful mobile operation in three-dimensional environments is a primary...
read it
-
Learning Instance Segmentation by Interaction
We present an approach for building an active agent that learns to segme...
read it
-
PatchFCN for Intracranial Hemorrhage Detection
This paper studies the problem of detecting acute intracranial hemorrhag...
read it
-
More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch
For humans, the process of grasping an object relies heavily on rich tac...
read it