
-
Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation
In semi-supervised domain adaptation, a few labeled samples per class in...
-
Deep Transformers for Fast Small Intestine Grounding in Capsule Endoscope Video
Capsule endoscopy is an evolutional technique for examining and diagnosi...
-
Scene-Intuitive Agent for Remote Embodied Visual Grounding
Humans learn from life events to form intuitions towards the understandi...
-
Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting
Crowd counting is a fundamental yet challenging problem, which desires r...
-
Human-centric Spatio-Temporal Video Grounding With Visual Transformers
In this work, we introduce a novel task - Human-centric Spatio-Temporal V...
-
A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning
Although deep convolutional neural networks (CNNs) have demonstrated rem...
-
Contralaterally Enhanced Networks for Thoracic Disease Detection
Identifying and locating diseases in chest X-rays are very challenging, ...
-
Linguistic Structure Guided Context Modeling for Referring Image Segmentation
Referring image segmentation aims to predict the foreground mask of the ...
-
Referring Image Segmentation via Cross-Modal Progressive Comprehension
Referring image segmentation aims at segmenting the foreground masks of ...
-
Reinforcement Learning for Weakly Supervised Temporal Grounding of Natural Language in Untrimmed Videos
Temporal grounding of natural language in untrimmed videos is a fundamen...
-
Collaborative Training between Region Proposal Localization and Classification for Domain Adaptive Object Detection
Object detectors are usually trained with a large amount of labeled data, ...
-
Online Alternate Generator against Adversarial Attacks
The field of computer vision has witnessed phenomenal progress in recent...
-
Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition
Existing vision-based action recognition is susceptible to occlusion and...
-
Active Object Search
In this work, we investigate an Active Object Search (AOS) task that is ...
-
Peeking into occluded joints: A novel framework for crowd pose estimation
Although occlusion widely exists in nature and remains a fundamental cha...
-
Efficient Crowd Counting via Structured Knowledge Transfer
Crowd counting is an application-oriented task and its inference efficie...
-
Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread
Recently deep convolutional neural networks have achieved significant su...
-
Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video
Temporally language grounding in untrimmed videos is a newly-raised task...
-
Physical-Virtual Collaboration Graph Network for Station-Level Metro Ridership Prediction
Due to the widespread applications in real-world scenarios, metro riders...
-
An Adversarial Perturbation Oriented Domain Adaptation Approach for Semantic Segmentation
We focus on Unsupervised Domain Adaptation (UDA) for the task of semanti...
-
Globally Guided Progressive Fusion Network for 3D Pancreas Segmentation
Recently 3D volumetric organ segmentation attracts much research interes...
-
Self-Enhanced Convolutional Network for Facial Video Hallucination
As a domain-specific super-resolution problem, facial image hallucinatio...
-
Knowledge Graph Transfer Network for Few-Shot Recognition
Few-shot learning aims to learn novel categories from very few samples g...
-
Generalizing Energy-based Generative ConvNets from Particle Evolution Perspective
Compared with Generative Adversarial Networks (GAN), the Energy-Based ge...
-
Dynamic Graph Attention for Referring Expression Comprehension
Referring expression comprehension aims to locate the object instance de...
-
A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension
Referring expression comprehension aims to localize the object instance ...
-
Motion Guided Attention for Video Salient Object Detection
Video salient object detection aims at discovering the most visually dis...
-
ACFM: A Dynamic Spatial-Temporal Network for Traffic Prediction
As a crucial component in intelligent transportation systems, crowd flow...
-
Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid
Matching clothing images from customers and online shopping stores has r...
-
Crowd Counting with Deep Structured Scale Integration Network
Automatic estimation of the number of people in unconstrained crowded sc...
-
Semi-Supervised Video Salient Object Detection Using Pseudo-Labels
Deep learning-based video salient object detection has recently achieved...
-
Semi-supervised Skin Detection by Network with Mutual Guidance
In this paper we present a new data-driven method for robust skin detect...
-
Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching
A broad range of cross-m-domain generation researches boil down to match...
-
Cross-Modal Relationship Inference for Grounding Referring Expressions
Grounding referring expressions is a fundamental yet challenging task fa...
-
Contextualized Spatial-Temporal Network for Taxi Origin-Destination Demand Prediction
Taxi demand prediction has recently attracted increasing research intere...
-
ROSA: Robust Salient Object Detection against Adversarial Attacks
Recently salient object detection has witnessed remarkable improvement o...
-
Face Hallucination by Attentive Sequence Optimization with Reinforcement Learning
Face hallucination is a domain-specific super-resolution problem that ai...
-
Non-Local Context Encoder: Robust Biomedical Image Segmentation against Adversarial Attacks
Recent progress in biomedical image segmentation based on deep convoluti...
-
Semantic Relationships Guided Representation Learning for Facial Action Unit Recognition
Facial action unit (AU) recognition is a crucial task for facial express...
-
Harvesting Visual Objects from Internet Images via Deep Learning Based Objectness Assessment
The collection of internet images has been growing in an astonishing spe...
-
Deep RBFNet: Point Cloud Feature Learning using Radial Basis Functions
Three-dimensional object recognition has recently achieved great progres...
-
Facial Landmark Machines: A Backbone-Branches Architecture with Progressive Representation Learning
Facial landmark localization plays a critical role in face recognition a...
-
FRAME Revisited: An Interpretation View Based on Particle Evolution
FRAME (Filters, Random fields, And Maximum Entropy) is an energy-based d...
-
Unsupervised Domain Adaptation: An Adaptive Feature Norm Approach
Unsupervised domain adaptation aims to mitigate the domain shift when tr...
-
Cross-Modal Attentional Context Learning for RGB-D Object Detection
Recognizing objects from simultaneously sensed photometric (RGB) and dep...
-
Learning Deep Representations for Semantic Image Parsing: a Comprehensive Overview
Semantic image parsing, which refers to the process of decomposing image...
-
Attentive Crowd Flow Machines
Traffic flow prediction is crucial for urban traffic management and publ...
-
Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining
Single image rain streaks removal has recently witnessed substantial pro...
-
Crowd Counting using Deep Recurrent Spatial-Aware Network
Crowd counting from unconstrained scene images is a crucial task in many...
-
Visual Question Reasoning on General Dependency Tree
The collaborative reasoning for understanding each image-question pair i...