
-
Semantic Scene Completion via Integrating Instances and Scene in-the-Loop
Semantic Scene Completion aims at reconstructing a complete 3D scene wit...
-
Fixing the Teacher-Student Knowledge Discrepancy in Distillation
Training a small student network with the guidance of a larger teacher n...
-
Learning Fine-Grained Segmentation of 3D Shapes without Part Labels
Learning-based 3D shape segmentation is usually formulated as a semantic...
-
DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network
Conditional generative adversarial networks (cGANs) target at synthesizi...
-
PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection
3D object detection is receiving increasing attention from both industry...
-
Fast Convergence of DETR with Spatially Modulated Co-Attention
The recently proposed Detection Transformer (DETR) model successfully ap...
-
Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction
Multi-scale representations deeply learned via convolutional neural netw...
-
A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection
Both high-level and high-resolution feature representations are of great...
-
End-to-End Object Detection with Adaptive Clustering Transformer
End-to-end Object Detection with Transformer (DETR) proposes to perform o...
-
A Self-supervised Cascaded Refinement Network for Point Cloud Completion
Point clouds are often sparse and incomplete, which imposes difficulties...
-
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation
We propose a general framework for searching surrogate losses for mainst...
-
Deformable DETR: Deformable Transformers for End-to-End Object Detection
DETR has been recently proposed to eliminate the need for many hand-desi...
-
Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions
We propose a novel algorithm, named Open-Edit, which is the first attemp...
-
Point Cloud Completion by Learning Shape Priors
In view of the difficulty in reconstructing object details in point clou...
-
Gradient Regularized Contrastive Learning for Continual Domain Adaptation
Human beings can quickly adapt to environmental changes by leveraging le...
-
Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation
Stereophonic audio is an indispensable ingredient to enhance human audit...
-
3D Human Mesh Regression with Dense Correspondence
Estimating 3D mesh of the human body from a single 2D image is an import...
-
StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching
Large-scale synthetic datasets are beneficial to stereo matching but usu...
-
Cascaded Refinement Network for Point Cloud Completion
Point clouds are often sparse and incomplete. Existing shape completion ...
-
1st Place Solutions for OpenImage2019 – Object Detection and Instance Segmentation
This article introduces the solutions of the two champion teams, `MMfrui...
-
KPNet: Towards Minimal Face Detector
The small receptive field and capacity of minimal neural networks limit ...
-
Revisiting the Sibling Head in Object Detector
The “shared head for classification and localization” (sibling head), fi...
-
Adapting Object Detectors with Conditional Domain Normalization
Real-world object detectors are often challenged by the domain gaps betw...
-
Channel Equilibrium Networks for Learning Deep Representation
Convolutional Neural Networks (CNNs) are typically constructed by stacki...
-
Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation
Monocular 3D object detection task aims to predict the 3D bounding boxes...
-
Single Image Dehazing Using Ranking Convolutional Neural Network
Single image dehazing, which aims to recover the clear image solely from...
-
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
We present a novel and high-performance 3D object detection framework, n...
-
Search to Distill: Pearls are Everywhere but not the Eyes
Standard Knowledge Distillation (KD) approaches distill the knowledge of...
-
Fetal cardiovascular decompensation during labor predicted from the individual heart rate: a prospective study in fetal sheep near term and the impact of low sampling rate
We present a novel computerized fetal heart rate intrapartum algorithm f...
-
Vision-Infused Deep Audio Inpainting
Multi-modality perception is essential to develop interactive intelligen...
-
Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis
Semantic image synthesis aims at generating photorealistic images from s...
-
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
Text-image cross-modal retrieval is a challenging task in the field of l...
-
Differentiable Learning-to-Group Channels via Groupable Convolutional Neural Networks
Group convolution, which divides the channels of ConvNets into groups, h...
-
Once a MAN: Towards Multi-Target Attack via Learning Multi-Target Adversarial Network Once
Modern deep neural networks are often vulnerable to adversarial samples....
-
Interpolated Convolutional Networks for 3D Point Cloud Understanding
Point cloud is an important type of 3D representation. However, directly...
-
Multi-modality Latent Interaction Network for Visual Question Answering
Exploiting relationships between visual regions and question words have ...
-
Deep Self-Learning From Noisy Labels
ConvNets achieve good results when training from clean data, but learnin...
-
Part-A^2 Net: 3D Part-Aware and Aggregation Neural Network for Object Detection from Point Cloud
In this paper, we propose the part-aware and aggregation neural network ...
-
Finding Task-Relevant Features for Few-Shot Learning by Category Traversal
Few-shot learning is an important area of research. Conceptually, humans...
-
P2SGrad: Refined Gradients for Optimizing Deep Face Models
Cosine-based softmax losses significantly improve the performance of dee...
-
PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph
Despite some exciting progress on high-quality image generation from str...
-
AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations
The cosine-based softmax losses and their variants achieve great success...
-
Disentangling Pose from Appearance in Monochrome Hand Images
Hand pose estimation from the monocular 2D image is challenging due to t...
-
Conditional Adversarial Generative Flow for Controllable Image Synthesis
Flow-based generative models show great potential in image synthesis due...
-
Semantics Disentangling for Text-to-Image Generation
Synthesizing photo-realistic images from text descriptions is a challeng...
-
Context and Attribute Grounded Dense Captioning
Dense captioning aims at simultaneously localizing semantic regions and ...
-
Feature Intertwiner for Object Detection
A well-trained model should classify objects with a unanimous score for ...
-
GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving
We present an efficient 3D object detection framework based on a single ...
-
Video Generation from Single Semantic Label Map
This paper proposes the novel task of video generation conditioned on a ...