Towards Reliable Image Outpainting: Learning Structure-Aware Multimodal Fusion with Depth Guidance

04/12/2022
by Lei Zhang, et al.

Image outpainting generates visually plausible content regardless of its authenticity, which makes it unreliable for practical applications even when additional modalities, e.g., sketches, are introduced. Since sparse depth maps are widely captured together with RGB images in robotics and autonomous systems, we incorporate sparse depth into the image outpainting task to provide more reliable performance. Concretely, we propose a Depth-Guided Outpainting Network (DGONet) that models the feature representations of each modality in a modality-specific way and learns structure-aware cross-modal fusion. To this end, two components are designed: 1) the Multimodal Learning Module, which produces distinct depth and RGB feature representations according to the characteristics of each modality; and 2) the Depth Guidance Fusion Module, which leverages the complete depth modality to guide the generation of RGB content through progressive multimodal feature fusion. Furthermore, we design an additional constraint strategy consisting of a Cross-modal Loss and an Edge Loss to enhance ambiguous contours and expedite reliable content generation. Extensive experiments on KITTI demonstrate that our method outperforms state-of-the-art methods and generates more reliable content.
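
The abstract names an Edge Loss but does not define it. As a rough illustration only, the sketch below assumes one plausible form: the mean absolute difference between Sobel gradient magnitudes of a predicted and a reference image. The function names `sobel_magnitude` and `edge_loss`, the choice of the Sobel operator, and the L1 reduction are all assumptions made for this example and are not taken from the paper.

```python
# Hypothetical sketch: the abstract does not specify the Edge Loss.
# Assumed form: mean absolute difference between Sobel edge maps.
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(img):
    """Sobel gradient magnitude over the interior pixels of a 2-D grid.

    `img` is a list of equal-length rows of floats, at least 3x3.
    """
    h, w = len(img), len(img[0])
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    v = img[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * v
                    gy += SOBEL_Y[dr][dc] * v
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


def edge_loss(pred, target):
    """Mean absolute difference between the two edge maps (assumed form)."""
    ep, et = sobel_magnitude(pred), sobel_magnitude(target)
    diffs = [abs(a - b) for rp, rt in zip(ep, et) for a, b in zip(rp, rt)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    flat = [[0.0] * 4 for _ in range(4)]                 # no edges at all
    step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]      # one vertical edge
    print(edge_loss(step, step))  # identical contours give 0.0
    print(edge_loss(flat, step))  # a missing contour is penalized: 4.0
```

Under this assumed definition, a prediction that blurs or drops a contour present in the reference receives a larger penalty than one that preserves it, which matches the stated goal of enhancing ambiguous contours.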


