
-
Instance-Aware Predictive Navigation in Multi-Agent Environments
In this work, we aim to achieve efficient end-to-end learning of driving...
-
Minimax Active Learning
Active learning aims to develop label-efficient algorithms by querying t...
-
Temporal Action Detection with Multi-level Supervision
Training temporal action detection in videos requires large amounts of l...
-
Modularity Improves Out-of-Domain Instruction Following
We propose a modular architecture for following natural language instruc...
-
Auxiliary Task Reweighting for Minimum-data Learning
Supervised learning requires a large amount of training data, limiting i...
-
Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation
The success of supervised learning hinges on the assumption that the tra...
-
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting
The goal of continual learning (CL) is to learn a sequence of tasks with...
-
Evaluating Self-Supervised Pretraining Without Using Labels
A common practice in unsupervised representation learning is to use labe...
-
Identity-Aware Multi-Sentence Video Description
Standard video and movie description tasks abstract away from person ide...
-
What Should Not Be Contrastive in Contrastive Learning
Recent self-supervised contrastive methods have been able to produce imp...
-
Body2Hands: Learning to Infer 3D Hands from Conversational Gesture Body Dynamics
We propose a novel learned deep prior of body motion for 3D hand shape s...
-
Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
We introduce a learning-based approach for room navigation using semanti...
-
Video Prediction via Example Guidance
In video prediction tasks, one major challenge is to capture the multi-m...
-
Compositional Video Synthesis with Action Graphs
Videos of actions are complex spatio-temporal signals, containing rich c...
-
Fully Test-time Adaptation by Entropy Minimization
Faced with new and different data during testing, a model must adapt its...
-
Quasi-Dense Instance Similarity Learning
Similarity metrics for instances have drawn much attention, due to their...
-
Reducing Class Collapse in Metric Learning with Easy Positive Sampling
Metric learning seeks perceptual embeddings where visually similar insta...
-
ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots
We investigate the problem of predicting driver behavior in parking lots...
-
Contrastive Examples for Addressing the Tyranny of the Majority
Computer vision algorithms, e.g. for face recognition, favour groups of ...
-
Spatio-Temporal Action Detection with Multi-Object Interaction
Spatio-temporal action detection in videos requires localizing the actio...
-
Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning
The weakly-supervised action localization problem requires training a model ...
-
Revisiting Few-shot Activity Detection with Class Similarity Control
Many interesting events in the real world are rare, making preannotated m...
-
Adversarial Continual Learning
Continual learning aims to learn new tasks without forgetting previously...
-
Frustratingly Simple Few-Shot Object Detection
Detecting rare objects from a few examples is an emerging problem. Prior...
-
Rethinking Image Mixture for Unsupervised Visual Representation Learning
In supervised learning, smoothing label/prediction distribution in neura...
-
A New Meta-Baseline for Few-Shot Learning
Meta-learning has become a popular framework for few-shot learning in re...
-
Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning
Learning robotic manipulation tasks using reinforcement learning with sp...
-
Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks
Human action is naturally compositional: humans can easily recognize and...
-
Learning Canonical Representations for Scene Graph to Image Generation
Generating realistic images of complex visual scenes becomes very challe...
-
Semantic Bottleneck Scene Generation
Coupling the high-fidelity generation capabilities of label-conditional ...
-
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Many visual scenes contain text that carries crucial information, and it...
-
Plan Arithmetic: Compositional Plan Vectors for Multi-Task Control
Autonomous agents situated in real-world environments must be able to ma...
-
Regularization Matters in Policy Optimization
Deep Reinforcement Learning (Deep RL) has been receiving increasingly mo...
-
Transferable Recognition-Aware Image Processing
Recent progress in image recognition has stimulated the deployment of vi...
-
Scoring-Aggregating-Planning: Learning task-agnostic priors from interactions and sparse rewards for zero-shot generalization
Humans can learn task-agnostic priors from interactive experience and ut...
-
Unsupervised Domain Adaptation through Self-Supervision
This paper addresses unsupervised domain adaptation, the setting where l...
-
Dynamic Scale Inference by Entropy Minimization
Given the variety of the visual world there is not one true scale for re...
-
Task-Aware Deep Sampling for Feature Generation
The human ability to imagine the variety of appearances of novel objects...
-
Uncertainty-guided Continual Learning with Bayesian Neural Networks
Continual learning aims to learn new tasks without forgetting previously...
-
Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation
Vision-and-Language Navigation (VLN) requires grounding instructions, su...
-
Monocular Plan View Networks for Autonomous Driving
Convolutions on monocular dash cam videos capture spatial invariances in...
-
Language-Conditioned Graph Networks for Relational Reasoning
Solving grounded language tasks often requires reasoning about relations...
-
Accurate Visual Localization for Automotive Applications
Accurate vehicle localization is a crucial step towards building effecti...
-
Blurring the Line Between Structure and Learning to Optimize and Adapt Receptive Fields
The visual world is vast and varied, but its variations divide into stru...
-
Semi-supervised Domain Adaptation via Minimax Entropy
Contemporary domain adaptation methods are very effective at aligning fe...
-
TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning
Learning good feature embeddings for images often requires substantial t...
-
Variational Adversarial Active Learning
Active learning aims to develop label-efficient algorithms by sampling t...
-
Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
Contemporary sensorimotor learning approaches typically start with an ex...
-
Viewpoint Invariant Change Captioning
The ability to detect that something has changed in an environment is va...
-
Similarity R-C3D for Few-shot Temporal Activity Detection
Many activities of interest are rare events, with only a few labeled exa...