
-
Reasoning over Vision and Language: Exploring the Benefits of Supplemental Knowledge
The limits of applicability of vision-and-language models are defined by...
-
Deep Multi-task Learning for Depression Detection and Prediction in Longitudinal Data
Depression is among the most prevalent mental disorders, affecting milli...
-
DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution
Previous top-performing approaches for point cloud instance segmentation...
-
Deep Reinforcement Learning for Unknown Anomaly Detection
We address a critical yet largely unsolved anomaly detection problem, in...
-
Object-and-Action Aware Model for Visual Language Navigation
Vision-and-Language Navigation (VLN) is unique in that it requires turni...
-
Deep Learning for Anomaly Detection: A Review
Anomaly detection, a.k.a. outlier detection, has been a lasting yet acti...
-
Structured Multimodal Attentions for TextVQA
Text based Visual Question Answering (TextVQA) is a recently raised chal...
-
Visual Question Answering with Prior Class Semantics
We present a novel mechanism to embed prior knowledge in a model for vis...
-
Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision
One of the primary challenges limiting the applicability of deep learnin...
-
Self-trained Deep Ordinal Regression for End-to-End Video Anomaly Detection
Video anomaly detection is of critical practical importance to a variety...
-
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
Visual Question Answering (VQA) methods have made incredible progress, b...
-
Learning to Zoom-in via Learning to Zoom-out: Real-world Super-resolution by Generating and Adapting Degradation
Most learning-based super-resolution (SR) methods aim to recover high-re...
-
Deep Anomaly Detection with Deviation Networks
Although deep learning has been applied to successfully address many dat...
-
Weakly-supervised Deep Anomaly Detection with Pairwise Relation Learning
This paper studies a rarely explored but critical anomaly detection prob...
-
On Incorporating Semantic Prior Knowlegde in Deep Learning Through Embedding-Space Constraints
The knowledge that humans hold about a problem often extends far beyond ...
-
V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices
One of the primary challenges faced by deep learning is the degree to wh...
-
An Effective Two-Branch Model-Based Deep Network for Single Image Deraining
Removing rain effects from an image automatically has many applications ...
-
Attention-guided Network for Ghost-free High Dynamic Range Imaging
Ghosting artifacts caused by moving objects or misalignments is a key ch...
-
RERERE: Remote Embodied Referring Expressions in Real indoor Environments
One of the long-term challenges of robotics is to enable humans to commu...
-
Reinforcement Learning with Attention that Works: A Self-Supervised Approach
Attention models have had a significant positive impact on deep learning...
-
Actively Seeking and Learning from Live Data
One of the key limitations of traditional machine learning methods is th...
-
Memorizing Normality to Detect Anomaly: Memory-augmented Deep Autoencoder for Unsupervised Anomaly Detection
Deep autoencoder has been extensively used for anomaly detection. Traini...
-
What's to know? Uncertainty as a Guide to Asking Goal-oriented Questions
One of the core challenges in Visual Dialogue problems is asking the que...
-
An Active Information Seeking Model for Goal-oriented Vision-and-Language Tasks
As Computer Vision algorithms move from passive analysis of pixels to ac...
-
Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks
The task in referring expression comprehension is to localise the object...
-
Visual Question Answering as Reading Comprehension
Visual question answering (VQA) demands simultaneous comprehension of bo...
-
End-to-End Diagnosis and Segmentation Learning from Cardiac Magnetic Resonance Imaging
Cardiac magnetic resonance (CMR) is used extensively in the diagnosis an...
-
MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution
Total variation (TV) regularization has proven effective for a range of ...
-
Towards Effective Deep Embedding for Zero-Shot Learning
Zero-shot learning (ZSL) attempts to recognize visual samples of unseen ...
-
Adaptive Importance Learning for Improving Lightweight Image Super-resolution Network
Deep neural networks have achieved remarkable success in single image su...
-
Learning an Optimizer for Image Deconvolution
As an integral component of blind image deblurring, non-blind deconvolut...
-
Real-time Semantic Image Segmentation via Spatial Sparsity
We propose an approach to semantic (image) segmentation that reduces the...
-
Visual Question Answering as a Meta Learning Task
The predominant approach to Visual Question Answering (VQA) demands that...
-
Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards
Despite significant progress in a variety of vision-and-language problem...
-
Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning
The Visual Dialogue task requires an agent to engage in a conversation a...
-
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
A robot that can carry out a natural-language instruction has been a dre...
-
Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
Recognising objects according to a pre-defined fixed set of class labels...
-
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
This paper presents a state-of-the-art model for visual question answeri...
-
Beyond Low Rank: A Data-Adaptive Tensor Completion Method
Low rank tensor representation underpins much of recent progress in tens...
-
When Unsupervised Domain Adaptation Meets Tensor Representations
Domain adaption (DA) allows machine learning methods trained on data sam...
-
Visually Aligned Word Embeddings for Improving Zero-shot Learning
Zero-shot learning (ZSL) highly depends on a good semantic embedding to ...
-
Visual Question Answering with Memory-Augmented Networks
This paper exploits a memory-augmented neural network to predict accurat...
-
Bayesian Conditional Generative Adverserial Networks
Traditional GANs use a deterministic generator function (typically a neu...
-
Care about you: towards large-scale human-centric visual relationship detection
Visual relationship detection aims to capture interactions between pairs...
-
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions
One of the most intriguing features of the Visual Question Answering (VQ...
-
From Motion Blur to Motion Flow: a Deep Learning Solution for Removing Heterogeneous Motion Blur
Removing pixel-wise heterogeneous motion blur is challenging due to the ...
-
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition
The trend towards increasingly deep neural networks has been driven by a...
-
Sequential Person Recognition in Photo Albums with a Recurrent Network
Recognizing the identities of people in everyday photos is still a very ...
-
Zero-Shot Visual Question Answering
Part of the appeal of Visual Question Answering (VQA) is its promise to ...
-
The Shallow End: Empowering Shallower Deep-Convolutional Networks through Auxiliary Outputs
The depth is one of the key factors behind the great success of convolut...