
-
LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering
Although significant progress has been made in room layout estimation, m...
-
Toward Robust Long Range Policy Transfer
Humans can master a new task within a few trials by drawing upon skills ...
-
Interactive Radiotherapy Target Delineation with 3D-Fused Context Propagation
Gross tumor volume (GTV) delineation on tomography medical imaging is cr...
-
HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features
We present HoHoNet, a versatile and efficient framework for holistic und...
-
Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network
Determining the spread of GTV_LN is essential in defining the respective...
-
LayoutMP3D: Layout Annotation of Matterport3D
Inferring the information of 3D layout from a single equirectangular pan...
-
Visual Question Answering on 360° Images
In this work, we introduce VQA 360, a novel task of visual question answ...
-
Bias-Aware Heapified Policy for Active Learning
The data efficiency of learning-based algorithms is more and more import...
-
360SD-Net: 360° Stereo Depth Estimation with Learnable Cost Volume
Recently, end-to-end trainable deep neural networks have significantly i...
-
Autonomous UAV Landing System Based on Visual Navigation
In this paper, we present an autonomous unmanned aerial vehicle (UAV) la...
-
360-Indoor: Towards Learning Real-World Objects in 360° Indoor Equirectangular Images
While there are several widely used object detection datasets, current c...
-
Flat2Layout: Flat Representation for Estimating Layout of General Room Types
This paper proposes a new approach, Flat2Layout, for estimating general ...
-
Radiotherapy Target Contouring with Convolutional Gated Graph Neural Network
Tomography medical imaging is essential in the clinical workflow of mode...
-
3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization
The complementary characteristics of active and passive depth sensing te...
-
Point-to-Point Video Generation
While image manipulation achieves tremendous breakthroughs (e.g., genera...
-
Learning a Multi-Modal Policy via Imitating Demonstrations with Mixed Behaviors
We propose a novel approach to train a multi-modal policy from mixed dem...
-
HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation
We present a new approach to the problem of estimating 3D room layout fr...
-
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama
We present a deep learning framework, called DuLa-Net, to predict Manhat...
-
Joint Monocular 3D Vehicle Detection and Tracking
3D vehicle detection and tracking from a monocular camera requires detec...
-
InstaNAS: Instance-aware Neural Architecture Search
Neural Architecture Search (NAS) aims at finding one "single" architectu...
-
Self-Supervised Learning of Depth and Camera Motion from 360° Videos
As 360 cameras become prevalent in many autonomous systems (e.g., self-d...
-
Unsupervised Stylish Image Description Generation via Domain Layer Norm
Most of the existing works on image description focus on generating expr...
-
Searching Toward Pareto-Optimal Device-Aware Neural Architectures
Recent breakthroughs in Neural Architectural Search (NAS) have achieved ...
-
Liquid Pouring Monitoring via Rich Sensory Inputs
Humans have the amazing ability to perform very subtle manipulation task...
-
Leveraging Motion Priors in Videos for Improving Human Segmentation
Despite many advances in deep-learning based semantic segmentation, perf...
-
Efficient Uncertainty Estimation for Semantic Segmentation in Videos
Uncertainty estimation in deep learning becomes more important recently....
-
DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures
Recent breakthroughs in Neural Architectural Search (NAS) have achieved ...
-
Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos
Automatic saliency prediction in 360 videos is critical for viewpoint gu...
-
A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss
We propose a unified model combining the strength of extractive and abst...
-
Omnidirectional CNN for Visual Place Recognition and Navigation
Visual place recognition is challenging, especially when only a few pla...
-
Compatibility Family Learning for Item Recommendation and Generation
Compatibility between items, such as clothes and shoes, is a major facto...
-
Self-view Grounding Given a Narrated 360° Video
Narrated 360 videos are typically provided in many touring scenarios to ...
-
Anticipating Daily Intention using On-Wrist Motion Triggered Sensing
Anticipating human intention by observing one's actions has many applica...
-
Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight
Deep reinforcement learning has shown promising results in learning cont...
-
Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization
For survival, a living agent must have the ability to assess risk (1) by...
-
Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Video
Watching a 360 sports video requires a viewer to continuously select a v...
-
Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner
Impressive image captioning results are achieved in domains with plenty ...
-
No More Discrimination: Cross City Adaptation of Road Scene Segmenters
Despite the recent success of deep-learning based semantic segmentation,...
-
Tactics of Adversarial Attack on Deep Reinforcement Learning Agents
We introduce two tactics to attack agents trained by deep reinforcement ...
-
Learning to Compose with Professional Photographs on the Web
Photo composition is an important factor affecting the aesthetics in pho...
-
Leveraging Video Descriptions to Learn Video Question Answering
We propose a scalable approach to learn video-based question answering (...
-
Title Generation for User Generated Videos
A great video title describes the most salient event compactly and captu...
-
Recognition from Hand Cameras
We revisit the study of a wrist-mounted camera system (referred to as Ha...