Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation

11/23/2022
by Ning Zhang, et al.

Self-supervised monocular depth estimation, which does not require ground truth for training, has attracted attention in recent years. Lightweight but effective models are of particular interest because they can be deployed on edge devices. Many existing architectures gain accuracy from heavier backbones at the expense of model size. In this paper, we achieve comparable results with a lightweight architecture. Specifically, we investigate the efficient combination of CNNs and Transformers and design a hybrid architecture, Lite-Mono. We propose a Consecutive Dilated Convolutions (CDC) module and a Local-Global Features Interaction (LGFI) module. The former extracts rich multi-scale local features, and the latter uses the self-attention mechanism to encode long-range global information into the features. Experiments demonstrate that our full model outperforms Monodepth2 by a large margin in accuracy, with about 80
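
To illustrate why stacking dilated convolutions can gather multi-scale local context without adding parameters, the following minimal sketch computes the one-dimensional receptive field of a stack of stride-1 convolutions. The kernel size of 3 and the dilation rates (1, 2, 3) are assumptions chosen for illustration; they are not taken from the abstract, and the function name `receptive_field` is hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in pixels, along one axis) of stacked
    stride-1 convolutions with the given dilation rates."""
    rf = 1
    for d in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


# Three ordinary 3x3 convolutions (dilation 1 everywhere).
plain = receptive_field(3, [1, 1, 1])

# Three 3x3 convolutions with increasing dilation (assumed rates).
dilated = receptive_field(3, [1, 2, 3])

print(plain, dilated)
```

Both stacks have the same number of weights per layer, but the dilated stack covers a wider neighbourhood, which is the general property that dilated convolutions exploit.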


research
08/06/2022

MonoViT: Self-Supervised Monocular Depth Estimation with a Vision Transformer

Self-supervised monocular depth estimation is an attractive solution tha...
research
09/29/2022

Lightweight Monocular Depth Estimation with an Edge Guided Network

Monocular depth estimation is an important task that can be applied to m...
research
04/29/2022

SideRT: A Real-time Pure Transformer Architecture for Single Image Depth Estimation

Since context modeling is critical for estimating depth from a single im...
research
02/20/2023

GlocalFuse-Depth: Fusing Transformers and CNNs for All-day Self-supervised Monocular Depth Estimation

In recent years, self-supervised monocular depth estimation has drawn mu...
research
10/15/2021

Attention meets Geometry: Geometry Guided Spatial-Temporal Attention for Consistent Self-Supervised Monocular Depth Estimation

Inferring geometrically consistent dense 3D scenes across a tuple of tem...
research
03/06/2023

DwinFormer: Dual Window Transformers for End-to-End Monocular Depth Estimation

Depth estimation from a single image is of paramount importance in the r...
research
08/04/2023

Lightweight Endoscopic Depth Estimation with CNN-Transformer Encoder

In this study, we tackle the key challenges concerning accuracy and robu...
