Mix3D: Out-of-Context Data Augmentation for 3D Scenes

10/05/2021
by Alexey Nekrasov, et al.

We present Mix3D, a data augmentation technique for segmenting large-scale 3D scenes. Since scene context helps in reasoning about object semantics, current works focus on models with large capacity and receptive fields that can fully capture the global context of an input 3D scene. However, strong contextual priors can have detrimental implications, such as mistaking a pedestrian crossing the street for a car. In this work, we focus on the importance of balancing global scene context and local geometry, with the goal of generalizing beyond the contextual priors in the training set. In particular, we propose a "mixing" technique which creates new training samples by combining two augmented scenes. By doing so, object instances are implicitly placed into novel out-of-context environments, making it harder for models to rely on scene context alone and encouraging them to infer semantics from local structure as well. We perform a detailed analysis to understand the importance of global context, local structures, and the effect of mixing scenes. In experiments, we show that models trained with Mix3D profit from a significant performance boost on indoor (ScanNet, S3DIS) and outdoor (SemanticKITTI) datasets. Mix3D can be trivially used with any existing method; e.g., trained with Mix3D, MinkowskiNet outperforms all prior state-of-the-art methods by a significant margin on the ScanNet test benchmark (78.1 mIoU). Code is available at: https://nekrasov.dev/mix3d/
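
The "mixing" step described in the abstract can be illustrated with a minimal sketch. The abstract only states that two augmented scenes are combined; the function name `mix_scenes`, the centering of each scene at the origin before taking the union, and the random toy data below are assumptions made for illustration, not the authors' released implementation.

```python
import numpy as np


def mix_scenes(points_a, labels_a, points_b, labels_b):
    """Combine two labeled point clouds into one training sample.

    Hypothetical helper: each scene is centered at the origin (an assumption)
    so that the two point sets overlap, then points and labels are concatenated.
    """
    centered_a = points_a - points_a.mean(axis=0, keepdims=True)
    centered_b = points_b - points_b.mean(axis=0, keepdims=True)
    mixed_points = np.concatenate([centered_a, centered_b], axis=0)
    mixed_labels = np.concatenate([labels_a, labels_b], axis=0)
    return mixed_points, mixed_labels


# Toy stand-ins for two already-augmented scenes (random data, not real scans).
rng = np.random.default_rng(0)
scene_a = rng.uniform(0.0, 5.0, size=(1000, 3))
scene_b = rng.uniform(10.0, 14.0, size=(500, 3))
labels_a = rng.integers(0, 20, size=1000)
labels_b = rng.integers(0, 20, size=500)

mixed_points, mixed_labels = mix_scenes(scene_a, labels_a, scene_b, labels_b)
```

In this sketch every point keeps its original label, so a segmentation model trained on `mixed_points` sees objects from one scene surrounded by geometry from another, which is the out-of-context effect the abstract describes.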


