Deep 3D Pan via adaptive "t-shaped" convolutions with global and local adaptive dilations

Recent advances in deep learning have shown promising results in many low-level vision tasks. However, single-image-based view synthesis is still an open problem. In particular, generating new images at parallel camera views from a single input image is of great interest, as it enables 3D visualization of the 2D input scenery. We propose a novel network architecture, the monster-net, to perform stereoscopic view synthesis at arbitrary camera positions along the X-axis, which we call Deep 3D Pan. The monster-net is built around a novel "t-shaped" adaptive kernel with globally and locally adaptive dilations. The kernel efficiently incorporates the global camera shift and handles the local 3D geometry of the target image's pixels, so that natural-looking 3D panned views can be synthesized from a single 2D input image. Extensive experiments were performed on the KITTI, CityScapes, and our indoor VICLAB_STEREO datasets to demonstrate the efficacy of our method. Our monster-net outperforms the state-of-the-art (SOTA) method by a large margin in all three metrics: RMSE, PSNR, and SSIM. It reconstructs more reliable image structures in the synthesized images, with coherent geometry. Moreover, the disparity information that can be extracted from the "t-shaped" kernel is much more reliable than that of the SOTA method for the unsupervised monocular depth estimation task, confirming the effectiveness of our method.
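As a rough illustration of two of the metrics named above, the following sketch computes RMSE and PSNR between a reference view and a synthesized view using their standard definitions. It is not the paper's evaluation code: the function names `rmse` and `psnr`, the 8-bit value range (`max_value=255.0`), and the toy images are assumptions made for this example, and the abstract does not state the exact evaluation protocol (cropping, color space, or value range). SSIM is omitted because it needs a windowed implementation.

```python
import numpy as np


def rmse(reference, estimate):
    """Root-mean-square error over all pixels and channels."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit images."""
    err = rmse(reference, estimate)
    if err == 0.0:
        return float("inf")  # identical images
    return float(20.0 * np.log10(max_value / err))


# Toy data (hypothetical): a flat gray "reference view" and a
# "synthesized view" that differs in a single channel value.
reference = np.full((4, 4, 3), 100, dtype=np.uint8)
estimate = reference.copy()
estimate[0, 0, 0] = 110

toy_rmse = rmse(reference, estimate)
toy_psnr = psnr(reference, estimate)
print(f"RMSE: {toy_rmse:.4f}, PSNR: {toy_psnr:.2f} dB")
```

Lower RMSE and higher PSNR both indicate that the synthesized view is closer to the reference. The inputs are cast to floating point before subtraction so that unsigned 8-bit values do not wrap around.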


