End-to-end depth from motion with stabilized monocular videos

by Clément Pinard, et al.

We propose a depth map inference system for monocular videos, based on a novel navigation dataset that mimics aerial footage from a gimbal-stabilized monocular camera in rigid scenes. Unlike in most navigation datasets, the lack of rotation implies an easier structure-from-motion problem, which can be leveraged for tasks such as depth inference and obstacle avoidance. We also propose an architecture for end-to-end depth inference with a fully convolutional network. Results show that, although tied to the camera's intrinsic parameters, the problem is locally solvable and leads to good-quality depth prediction.
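To illustrate why the absence of rotation simplifies the problem, here is a minimal geometric sketch. It is not the network described in the abstract. It assumes an ideal pinhole camera, a focal length `f` in pixels, pixel coordinates measured from the principal point, and a known camera translation `t` between two frames. The function names `project` and `depth_from_flow` are hypothetical. Under these assumptions, the apparent motion of a static point depends only on its depth, the translation, and `f`, so depth can be recovered pixel by pixel.

```python
import math


def project(point, f):
    """Pinhole projection of a 3D point (X, Y, Z) in the camera frame.

    Returns pixel coordinates relative to the principal point.
    f is the focal length in pixels (an assumed, idealized intrinsic).
    """
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


def depth_from_flow(pixel, flow, t, f):
    """Depth Z (before the motion) of a static point, for a camera that
    translates by t = (tx, ty, tz) without rotating.

    For pure translation the flow of a static point satisfies
        (u, v) = (x * tz - f * tx, y * tz - f * ty) / (Z - tz),
    so Z - tz is the ratio between the numerator vector and the flow.
    """
    x, y = pixel
    u, v = flow
    tx, ty, tz = t
    nx, ny = x * tz - f * tx, y * tz - f * ty
    flow_sq = u * u + v * v
    if flow_sq == 0.0:
        # No apparent motion: depth is not observable at this pixel
        # (point at infinity, or located at the focus of expansion).
        return math.inf
    # Least-squares fit over both flow components, then add tz back.
    return (nx * u + ny * v) / flow_sq + tz


# Synthetic check: a point 10 units ahead, camera moves by t with no rotation.
f = 100.0
t = (0.2, 0.0, 0.5)
point = (1.0, -0.5, 10.0)
moved = tuple(c - d for c, d in zip(point, t))  # same point, new camera frame
p0, p1 = project(point, f), project(moved, f)
flow = (p1[0] - p0[0], p1[1] - p0[1])
print(depth_from_flow(p0, flow, t, f))  # close to 10.0
```

The formula uses `f` explicitly, which is one way to see why such a depth estimate is tied to the camera's parameters. It also needs only the flow at a single pixel, which matches the claim that the problem is locally solvable. With rotation present, the flow would contain an additional depth-independent term that would have to be estimated and removed first.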




N-QGN: Navigation Map from a Monocular Camera using Quadtree Generating Networks

Monocular depth estimation has been a popular area of research for sever...

MobileDepth: Efficient Monocular Depth Prediction on Mobile Devices

Depth prediction is fundamental for many useful applications on computer...

Learning structure-from-motion from motion

This work is based on a questioning of the quality metrics used by deep ...

DeMoN: Depth and Motion Network for Learning Monocular Stereo

In this paper we formulate structure from motion as a learning problem. ...

Instance-wise Depth and Motion Learning from Monocular Videos

We present an end-to-end joint training framework that explicitly models...

Code Repositories


PyTorch DepthNet Training on Still Box dataset

