UDFNet: Unsupervised Disparity Fusion with Adversarial Networks

by Can Pu, et al.

Existing disparity fusion methods based on deep learning achieve state-of-the-art performance, but they require ground truth disparity data for training. To the best of our knowledge, this is the first unsupervised disparity fusion method that does not use ground truth disparity data. In this paper, a mathematical model for disparity fusion is proposed to guide an adversarial network to train effectively without ground truth disparity data. The initial disparity maps, registered on the left view, are fed into the refiner along with auxiliary information (image gradients and the left and right intensity images), and the refiner is trained to output a refined disparity map registered on the left view. The refined left disparity map and the left intensity image are used to reconstruct a fake right intensity image. Finally, the fake and real right intensity images (the latter from the right stereo camera) are fed into the discriminator. In the model, the refiner is first trained to output a refined disparity value close to the weighted sum of the disparity inputs, for global initialisation. Then, three refinement principles are adopted to refine the results further: (1) the reconstruction error between the fake and real right intensity images is minimised; (2) the similarity between the fake and real right images is maximised over different receptive fields; (3) the refined disparity map is smoothed, guided by the corresponding intensity image. The adversarial architecture is effective for the fusion task and the fusion time is small: the network achieves 90 fps on an Nvidia GeForce GTX 1080 Ti on the KITTI 2015 dataset at the full input resolution of 1242 × 375 (width × height), without downsampling or cropping. Its accuracy equals or exceeds that of state-of-the-art supervised methods.
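The core unsupervised signal described above can be sketched in a few lines: warp the left intensity image into the right view using the refined disparity, then score the result with a photometric loss (principle 1) and an edge-aware smoothness loss (principle 3). This is a minimal NumPy sketch, not the paper's implementation; it assumes rectified stereo, a single-channel image, and treats the left-view disparity as a valid proxy at the sampled location. Principle (2), multi-scale similarity, is realised in the paper by the discriminator and is omitted here.

```python
import numpy as np

def reconstruct_right(left, disp):
    """Warp the left intensity image into the right view.

    For rectified stereo, a pixel at column x in the right image
    corresponds to column x + d in the left image. Sampling uses
    linear interpolation along the horizontal epipolar line.
    """
    h, w = left.shape
    xs = np.arange(w)[None, :] + disp              # source columns in the left image
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    frac = np.clip(xs - x0, 0.0, 1.0)
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * left[rows, x0] + frac * left[rows, x0 + 1]

def photometric_loss(fake_right, right):
    """Principle (1): mean absolute reconstruction error."""
    return float(np.mean(np.abs(fake_right - right)))

def smoothness_loss(disp, image):
    """Principle (3): penalise disparity gradients, down-weighted
    where the guiding intensity image itself has strong edges."""
    d_dx = np.abs(np.diff(disp, axis=1))
    i_dx = np.abs(np.diff(image, axis=1))
    return float(np.mean(d_dx * np.exp(-i_dx)))
```

For example, with a constant disparity of 2 the warp shifts every row of the left image two columns, the photometric loss against that reconstruction is zero, and the smoothness loss vanishes because the disparity field has no gradients.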




