FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions

10/04/2022
by Bruno Berenguel-Baeta, et al.

In this work we present FreDSNet, a deep learning solution that obtains a semantic 3D understanding of indoor environments from a single panorama. Omnidirectional images offer task-specific advantages for scene understanding because they provide 360-degree contextual information about the entire environment. However, the inherent characteristics of omnidirectional images make accurate object detection and segmentation, as well as good depth estimation, more difficult. To overcome these problems, we exploit convolutions in the frequency domain, obtaining a wider receptive field in each convolutional layer. These convolutions allow the network to leverage the whole context information of omnidirectional images. FreDSNet is the first network that jointly provides monocular depth estimation and semantic segmentation from a single panoramic image by exploiting fast Fourier convolutions. Our experiments show that FreDSNet performs comparably to task-specific state-of-the-art methods for semantic segmentation and depth estimation. FreDSNet code is publicly available at https://github.com/Sbrunoberenguel/FreDSNet
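The claim that frequency-domain convolutions widen the receptive field can be illustrated with a small sketch. This is not FreDSNet's layer or the code in the repository above; it is a toy 1D example built on the convolution theorem, and every name in it (`dft`, `idft`, `spectral_conv`, `circular_conv`, the sample `kernel`) is a hypothetical helper introduced here for illustration. It shows that a pointwise multiplication in the frequency domain is equivalent to a circular convolution whose kernel spans the whole signal, so one input sample can influence every output position in a single layer.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a 1D sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_conv(signal, spectral_weights):
    """Multiply pointwise in the frequency domain, then transform back."""
    mixed = [s * w for s, w in zip(dft(signal), spectral_weights)]
    return [v.real for v in idft(mixed)]


def circular_conv(signal, kernel):
    """Reference circular convolution computed directly in the signal domain."""
    n = len(signal)
    return [
        sum(signal[(t - s) % n] * kernel[s] for s in range(n))
        for t in range(n)
    ]


# Toy data (assumed values): a kernel as long as the signal itself.
kernel = [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.0078125]
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Frequency-domain weights are taken here as the transform of the kernel.
out = spectral_conv(impulse, dft(kernel))
reference = circular_conv(impulse, kernel)
```

In this sketch the single non-zero input sample produces a non-zero response at all eight output positions, whereas a spatial kernel of width 3 would reach only three. In a trained network the frequency-domain weights would be learned rather than derived from a fixed kernel, and the transform would be a 2D FFT over feature maps; both points are assumptions about typical practice, not details given in the abstract.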


