
-
What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space
Deep neural networks (DNNs) have been widely adopted in different applic...
-
WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection
In recent years, the abuse of a face swap technique called deepfake Deep...
-
Colonoscopy Polyp Detection: Domain Adaptation From Medical Report Images to Real-time Videos
Automatic colorectal polyp detection in colonoscopy video is a fundament...
-
Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition
Humans can easily recognize actions with only a few examples given, whil...
-
Multi-modal Cooking Workflow Construction for Food Recipes
Understanding food recipes requires anticipating the implicit causal effe...
-
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Automatically generating sentences to describe events and temporally loc...
-
Imbalanced Gradients: A New Cause of Overestimated Adversarial Robustness
Evaluating the robustness of a defense model is a challenging task in ad...
-
Long-Term Cloth-Changing Person Re-identification
Person re-identification (Re-ID) aims to match a target person across ca...
-
Sketch-BERT: Learning Sketch Bidirectional Encoder Representation from Transformers by Self-supervised Learning of Sketch Gestalt
Previous research on sketches often considered sketches in pixel forma...
-
Clean-Label Backdoor Attacks on Video Recognition Models
Deep neural networks (DNNs) are vulnerable to backdoor attacks which can...
-
Learning to Augment Expressions for Few-shot Fine-grained Facial Expression Recognition
Affective computing and cognitive theory are widely used in modern human...
-
LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition
This paper presents LiteEval, a simple yet effective coarse-to-fine fram...
-
Heuristic Black-box Adversarial Attacks on Video Recognition Models
We study the problem of attacking video recognition models in the black-...
-
Black-box Adversarial Attacks on Video Recognition Models
Deep neural networks (DNNs) are known for their vulnerability to adversa...
-
A Multi-task Neural Approach for Emotion Attribution, Classification and Summarization
Emotional content is a crucial ingredient in user-generated videos. Howe...
-
Instance-level Sketch-based Retrieval by Deep Triplet Classification Siamese Network
Sketch has been employed as an effective communicative tool to express t...
-
Composite Binary Decomposition Networks
Binary neural networks have great resource and computing efficiency, whi...
-
Learning to Separate Domains in Generalized Zero-Shot and Open Set Learning: a probabilistic perspective
This paper studies the domain division problem, which aims to ...
-
Non-local NetVLAD Encoding for Video Classification
This paper describes our solution for the 2nd YouTube-8M video understa...
-
Object Detection from Scratch with Deep Supervision
We propose Deeply Supervised Object Detectors (DSOD), an object detectio...
-
NAIS: Neural Attentive Item Similarity Model for Recommendation
Item-to-item collaborative filtering (a.k.a. item-based CF) has been long ...
-
Recurrent Fusion Network for Image Captioning
Recently, much progress has been made in image captioning, and an encoder...
-
Unsupervised Image-to-Image Translation with Stacked Cycle-Consistent Adversarial Networks
Recent studies on unsupervised image-to-image translation have made rema...
-
Semantic Feature Augmentation in Few-shot Learning
A fundamental problem with few-shot learning is the scarcity of data in ...
-
Social Anchor-Unit Graph Regularized Tensor Completion for Large-Scale Image Retagging
Image retagging aims to improve tag quality of social images by refining...
-
Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
We propose an end-to-end deep learning architecture that produces a 3D s...
-
Learning to score the figure skating sports videos
This paper aims at learning to score the figure skating sports videos...
-
Learning to score and summarize figure skating sport videos
This paper focuses on fully understanding the figure skating sport video...
-
Pose-Normalized Image Generation for Person Re-identification
Person Re-identification (re-id) faces two major challenges: the lack of...
-
Left-Right Skip-DenseNets for Coarse-to-Fine Object Categorization
Inspired by the recent neuroscience studies on the left-right asymmetry ...
-
Recent Advances in Zero-shot Recognition
With the recent renaissance of deep convolutional neural networks, encoura...
-
Multi-scale Deep Learning Architectures for Person Re-identification
Person Re-identification (re-id) aims to match people across non-overlap...
-
DSOD: Learning Deeply Supervised Object Detectors from Scratch
We present Deeply Supervised Object Detector (DSOD), a framework that ca...
-
Learning Fashion Compatibility with Bidirectional LSTMs
The ubiquity of online fashion shopping demands effective recommendation...
-
Aggregating Frame-level Features for Large-Scale Video Classification
This paper introduces the system we developed for the Google Cloud & You...
-
Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification
Videos are inherently multimodal. This paper studies the problem of how ...
-
Weakly Supervised Dense Video Captioning
This paper focuses on a novel and challenging vision task, dense video c...
-
Iterative Object and Part Transfer for Fine-Grained Recognition
The aim of fine-grained recognition is to identify subordinate categori...
-
Deep Learning for Video Classification and Captioning
Accelerated by the tremendous increase in Internet bandwidth and storage...
-
The THUMOS Challenge on Action Recognition for Videos "in the Wild"
Automatically recognizing and localizing wide ranges of human actions ha...
-
Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization
Emotional content is a key element in user-generated videos. However, it...
-
Evaluating Two-Stream CNN for Video Classification
Videos contain very rich semantic information. Traditional hand-crafted ...
-
Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification
Classifying videos according to content semantics is an important proble...
-
Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks
In this paper, we study the challenging problem of categorizing videos a...