Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture

11/18/2014
by David Eigen et al.

In this paper we address three different computer vision tasks using a single basic architecture: depth prediction, surface normal estimation, and semantic labeling. We use a multiscale convolutional network that is able to adapt easily to each task using only small modifications, regressing from the input image to the output map directly. Our method progressively refines predictions using a sequence of scales, and captures many image details without any superpixels or low-level segmentation. We achieve state-of-the-art performance on benchmarks for all three tasks.
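The coarse-to-fine refinement idea described above can be caricatured without any deep-learning framework. The toy NumPy sketch below is not the paper's network: the "coarse scale" is stood in for by average pooling plus identity, and the "fine scale" simply upsamples the coarse prediction and adds back local high-frequency detail. All function names here are hypothetical; the point is only the data flow of predicting globally at low resolution, then refining at full resolution.

```python
import numpy as np

# Toy stand-in for the paper's multi-scale pipeline (NOT the actual network):
# Scale 1 makes a global, low-resolution prediction; Scale 2 upsamples it and
# adds a residual correction computed from local image detail.

def avg_pool(img, k):
    """Downsample a 2-D array by averaging non-overlapping k x k blocks."""
    h, w = img.shape
    return img[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def upsample(x, k):
    """Nearest-neighbour upsampling by factor k."""
    return np.repeat(np.repeat(x, k, axis=0), k, axis=1)

def coarse_scale(img, k=4):
    # Stand-in for Scale 1: a coarse prediction over the whole image.
    return avg_pool(img, k)

def fine_scale(img, coarse_pred, k=4):
    # Stand-in for Scale 2: upsample the coarse prediction, then refine it
    # with high-frequency detail (image minus its own low-pass version).
    up = upsample(coarse_pred, k)
    residual = img - upsample(avg_pool(img, k), k)
    return up + 0.5 * residual

img = np.random.rand(16, 16)
coarse = coarse_scale(img)           # 4 x 4 global prediction
refined = fine_scale(img, coarse)    # 16 x 16 refined prediction
print(coarse.shape, refined.shape)   # (4, 4) (16, 16)
```

Note the design property this preserves: block-averaging the refined map recovers the coarse map exactly, so the fine stage only adds detail, never contradicts the global estimate. In the actual paper both stages are learned convolutional networks and the refinement spans a sequence of scales rather than one step.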


Related research:

- Designing Deep Networks for Surface Normal Estimation (11/18/2014)
- Improving the Robustness of Deep Neural Networks via Stability Training (04/15/2016)
- PixelNet: Representation of the pixels, by the pixels, and for the pixels (02/21/2017)
- Pixel-level Encoding and Depth Layering for Instance-level Semantic Labeling (04/18/2016)
- Pixel-wise Attentional Gating for Parsimonious Pixel Labeling (05/03/2018)
- Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans (10/11/2021)
- Fast Object Localization Using a CNN Feature Map Based Multi-Scale Search (04/12/2016)
