Attention-based Context Aggregation Network for Monocular Depth Estimation

01/29/2019
by Yuru Chen, et al.

Depth estimation is a long-standing computer vision task that plays a crucial role in understanding 3D scene geometry. Recently, methods based on deep convolutional neural networks have achieved promising results in monocular depth estimation. In particular, frameworks that combine multi-scale features extracted by a dilated-convolution block (atrous spatial pyramid pooling, ASPP) have brought significant improvements in dense labeling tasks. However, discretized, predefined dilation rates cannot capture the continuous context information that varies across scenes, and they easily introduce grid artifacts in depth estimation. In this paper, we propose an attention-based context aggregation network (ACAN) to tackle these difficulties. Based on the self-attention model, ACAN adaptively learns task-specific similarities between pixels to model context information. First, we recast monocular depth estimation as a dense-labeling multi-class classification problem and propose a soft ordinal inference that transforms the predicted probabilities into continuous depth values, which reduces the discretization error (about 1% decrease in RMSE). Second, the proposed ACAN aggregates both image-level and pixel-level context information for depth estimation, where the former expresses the statistical characteristics of the whole image and the latter extracts the long-range spatial dependencies of each pixel. Third, to further reduce the inconsistency between the RGB image and the depth map, we construct an attention loss that minimizes their information entropy. We evaluate on public monocular depth estimation benchmarks, including NYU Depth V2 and KITTI. The experiments demonstrate the superiority of the proposed ACAN, which achieves results competitive with the state of the art.
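The idea of turning per-pixel class probabilities into a continuous depth value can be illustrated with a minimal sketch. The abstract does not give the formula for the soft ordinal inference, so the code below is only an assumption of how such a step might look: it uses a plain probability-weighted average over depth-bin centers, with bins spaced uniformly in log-depth. The function names (`log_spaced_bin_centers`, `softmax`, `hard_inference`, `soft_inference`) and the depth range are hypothetical and are not taken from the paper.

```python
import math


def log_spaced_bin_centers(d_min, d_max, num_bins):
    """Centers of num_bins depth intervals spaced uniformly in log-depth.

    The log spacing is an assumed discretization, chosen for illustration.
    """
    log_min, log_max = math.log(d_min), math.log(d_max)
    step = (log_max - log_min) / num_bins
    return [math.exp(log_min + (k + 0.5) * step) for k in range(num_bins)]


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_inference(probs, centers):
    """Depth of the single most probable bin (plain classification output)."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return centers[best]


def soft_inference(probs, centers):
    """Probability-weighted average of bin centers: a continuous depth value."""
    return sum(p * c for p, c in zip(probs, centers))


if __name__ == "__main__":
    # Hypothetical scores for one pixel over 8 depth classes.
    centers = log_spaced_bin_centers(0.5, 10.0, 8)
    probs = softmax([0.1, 0.3, 2.0, 2.2, 0.4, 0.0, -1.0, -2.0])
    print("hard:", hard_inference(probs, centers))
    print("soft:", soft_inference(probs, centers))
```

When two neighboring bins receive similar probability, the hard output snaps to one bin center, while the soft output falls between them. That is the sense in which a soft inference step can reduce discretization error.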

research
08/02/2017

Monocular Depth Estimation with Hierarchical Fusion of Dilated CNNs and Soft-Weighted-Sum Inference

Monocular depth estimation is a challenging task in complex compositions...
research
04/25/2023

Depth-Relative Self Attention for Monocular Depth Estimation

Monocular depth estimation is very challenging because clues to the exac...
research
05/12/2023

Learning Monocular Depth in Dynamic Environment via Context-aware Temporal Attention

The monocular depth estimation task has recently revealed encouraging pr...
research
06/06/2018

Deep Ordinal Regression Network for Monocular Depth Estimation

Monocular depth estimation, which plays a crucial role in understanding ...
research
08/28/2017

A Compromise Principle in Deep Monocular Depth Estimation

Monocular depth estimation, which plays a key role in understanding 3D s...
research
03/21/2023

HRDFuse: Monocular 360°Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions

Depth estimation from a monocular 360 image is a burgeoning problem owin...
research
07/11/2018

Deep attention-based classification network for robust depth prediction

In this paper, we present our deep attention-based classification (DABC)...
