Language-Assisted 3D Feature Learning for Semantic Scene Understanding

11/25/2022
by Junbo Zhang, et al.

Learning descriptive 3D features is crucial for understanding 3D scenes with diverse objects and complex structures. However, it is usually unclear whether important geometric attributes and scene context receive enough emphasis in a 3D scene understanding network trained end to end. To guide 3D feature learning toward these attributes and this context, we explore the help of textual scene descriptions. Given free-form descriptions paired with 3D scenes, we extract knowledge about object relationships and object attributes. We then inject this knowledge into 3D feature learning through three classification-based auxiliary tasks. This language-assisted training can be combined with modern object detection and instance segmentation methods to improve 3D semantic scene understanding, especially in a label-deficient regime. Moreover, the 3D features learned with language assistance are better aligned with language features, which can benefit various 3D-language multimodal tasks. Experiments on several benchmarks of 3D-only and 3D-language tasks demonstrate the effectiveness of our language-assisted 3D feature learning. Code is available at https://github.com/Asterisci/Language-Assisted-3D.
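
As a rough illustration of how classification-based auxiliary tasks can be attached to a main training objective, the sketch below adds a weighted sum of auxiliary cross-entropy terms to a main loss. It is not taken from the paper or its repository: the abstract does not name the three tasks, so the auxiliary inputs here are generic placeholders, and the function names `cross_entropy` and `total_loss` as well as the weight `aux_weight` are assumptions made for this example.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample.

    logits: list of raw class scores; target: index of the true class.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def total_loss(main_loss, aux_outputs, aux_weight=0.1):
    """Combine a main task loss with auxiliary classification losses.

    main_loss: scalar loss of the detection or segmentation network.
    aux_outputs: list of (logits, target) pairs, one per auxiliary task
        (hypothetical placeholders for the language-derived tasks).
    aux_weight: hypothetical weight balancing the auxiliary terms.
    """
    aux = sum(cross_entropy(logits, target) for logits, target in aux_outputs)
    return main_loss + aux_weight * aux


# Example: one main loss plus three auxiliary classification heads.
aux_heads = [([2.0, 0.5, -1.0], 0), ([0.1, 0.3], 1), ([0.0, 0.0, 0.0, 0.0], 2)]
loss = total_loss(main_loss=1.25, aux_outputs=aux_heads)
```

In a setup like this the auxiliary heads only shape the shared 3D features during training, so they could be dropped at inference time without changing the main network's outputs.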
