
-
RPVNet: A Deep and Efficient Range-Point-Voxel Fusion Network for LiDAR Point Cloud Segmentation
Point clouds can be represented in many forms (views), typically, point-...
-
Modulating Localization and Classification for Harmonized Object Detection
Object detection involves two sub-tasks, i.e. localizing objects in an i...
-
Ground-SLAM: Ground Constrained LiDAR SLAM for Structured Multi-Floor Environments
This paper proposes a 3D LiDAR SLAM algorithm named Ground-SLAM, which e...
-
Multi-Level Adaptive Region of Interest and Graph Learning for Facial Action Unit Recognition
In facial action unit (AU) recognition tasks, regional feature learning ...
-
Self-Domain Adaptation for Face Anti-Spoofing
Although current face anti-spoofing methods achieve promising results un...
-
Self-Supervised Noisy Label Learning for Source-Free Unsupervised Domain Adaptation
It is a strong prerequisite to access source data freely in many existin...
-
Box Re-Ranking: Unsupervised False Positive Suppression for Domain Adaptive Pedestrian Detection
False positive is one of the most serious problems brought by agnostic d...
-
A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data
Unsupervised domain adaptation (UDA) assumes that source and target doma...
-
MANGO: A Mask Attention Guided One-Stage Scene Text Spotter
Recently end-to-end scene text spotting has become a popular research to...
-
Learning Open Set Network with Discriminative Reciprocal Points
Open set recognition is an emerging research area that aims to simultane...
-
PolarDet: A Fast, More Precise Detector for Rotated Target in Aerial Images
Fast and precise object detection for high-resolution aerial images has ...
-
Joint Semantics and Data-Driven Path Representation for Knowledge Graph Inference
Inference on a large-scale knowledge graph (KG) is of great importance f...
-
AutoETER: Automated Entity Type Representation for Knowledge Graph Embedding
Recent advances in Knowledge Graph Embedding (KGE) allow for representi...
-
MAFF-Net: Filter False Positive for 3D Vehicle Detection with Multi-modal Adaptive Feature Fusion
3D vehicle detection based on multi-modal fusion is an important task of...
-
RangeRCNN: Towards Fast and Accurate 3D Object Detection with Range Image Representation
We present RangeRCNN, a novel and effective 3D object detection framewor...
-
Two Step Joint Model for Drug Drug Interaction Extraction
When patients need to take medicine, particularly taking more than one k...
-
Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling
Visual Storytelling (VIST) is a task to tell a narrative story about a c...
-
Dynamic GCN: Context-enriched Topology Learning for Skeleton-based Action Recognition
Graph Convolutional Networks (GCNs) have attracted increasing interest ...
-
Learning a Domain Classifier Bank for Unsupervised Adaptive Object Detection
In real applications, object detectors based on deep networks still face...
-
Text Recognition in Real Scenarios with a Few Labeled Samples
Scene text recognition (STR) is still a hot research topic in computer v...
-
Unsupervised Image Classification for Deep Representation Learning
Deep clustering against self-supervised learning is a very important and...
-
TRIE: End-to-End Text Reading and Information Extraction for Document Understanding
Since real-world ubiquitous documents (e.g., invoices, tickets, resumes ...
-
SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition
Arbitrary text appearance poses a great challenge in scene text recognit...
-
Object-QA: Towards High Reliable Object Quality Assessment
In object recognition applications, object images usually appear with di...
-
IROS 2019 Lifelong Robotic Vision Challenge – Lifelong Object Recognition Report
This report summarizes IROS 2019-Lifelong Robotic Vision Competition (Li...
-
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Although Visual Question Answering (VQA) has realized impressive progress...
-
Neural Inheritance Relation Guided One-Shot Layer Assignment Search
Layer assignment is seldom picked out as an independent research topic i...
-
Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units
Recurrent neural network (RNN) has been widely studied in sequence learn...
-
Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting
Many approaches have recently been proposed to detect irregular scene te...
-
An End-to-End Audio Classification System based on Raw Waveforms and Mix-Training Strategy
Audio classification can distinguish different kinds of sounds, which is...
-
Adversarial Seeded Sequence Growing for Weakly-Supervised Temporal Action Localization
Temporal action localization is an important yet challenging research to...
-
Learned Quality Enhancement via Multi-Frame Priors for HEVC Compliant Low-Delay Applications
Networked video applications, e.g., video conferencing, often suffer fro...
-
Posterior-regularized REINFORCE for Instance Selection in Distant Supervision
This paper provides a new way to improve the efficiency of the REINFORCE...
-
Extreme Image Compression via Multiscale Autoencoders With Generative Adversarial Optimization
We propose a MultiScale AutoEncoder(MSAE) based extreme image compressio...
-
All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
Shift operation is an efficient alternative to depthwise separable con...
-
Efficient Video Scene Text Spotting: Unifying Detection, Tracking, and Recognition
This paper proposes a unified framework for efficiently spotting scene ...
-
Collaborative Spatio-temporal Feature Learning for Video Action Recognition
Spatio-temporal feature learning is of central importance for action rec...
-
Cross-relation Cross-bag Attention for Distantly-supervised Relation Extraction
Distant supervision leverages knowledge bases to automatically label ins...
-
A Layer Decomposition-Recomposition Framework for Neuron Pruning towards Accurate Lightweight Networks
Neuron pruning is an efficient method to compress the network into a sli...
-
Learning Incremental Triplet Margin for Person Re-identification
Person re-identification (ReID) aims to match people across multiple non...
-
Scene Dynamics: Counterfactual Critic Multi-Agent Training for Scene Graph Generation
Scene graphs -- objects as nodes and visual relationships as edges -- de...
-
Segregated Temporal Assembly Recurrent Networks for Weakly Supervised Multiple Action Detection
This paper proposes a segregated temporal assembly recurrent (STAR) netw...
-
Extreme Network Compression via Filter Group Approximation
In this paper we propose a novel decomposition method based on filter gr...
-
Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation
A critical issue in pedestrian detection is to detect small-scale object...
-
A practical convolutional neural network as loop filter for intra frame
Loop filters are used in video coding to remove artifacts or improve per...
-
Edit Probability for Scene Text Recognition
We consider the scene text recognition problem under the attention-based...
-
Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation
Skeleton-based human action recognition has recently drawn increasing at...
-
Arbitrarily-Oriented Text Recognition
Recognizing text from natural images is still a hot research topic in co...
-
Cascade Region Proposal and Global Context for Deep Object Detection
Deep region-based object detector consists of a region proposal step and...
-
Focusing Attention: Towards Accurate Text Recognition in Natural Images
Scene text recognition has been a hot research topic in computer vision ...