Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings

03/22/2018
by Kevin Chen, et al.

We present a method for generating colored 3D shapes from natural language. To this end, we first learn joint embeddings of freeform text descriptions and colored 3D shapes. Our model combines and extends learning by association and metric learning approaches to learn implicit cross-modal connections, and produces a joint representation that captures the many-to-many relations between language and physical properties of 3D shapes such as color and shape. To evaluate our approach, we collect a large dataset of natural language descriptions for physical 3D objects in the ShapeNet dataset. With this learned joint embedding we demonstrate text-to-shape retrieval that outperforms baseline approaches. Using our embeddings with a novel conditional Wasserstein GAN framework, we generate colored 3D shapes from text. Our method is the first to connect natural language text with realistic 3D objects exhibiting rich variations in color, texture, and shape detail. See video at https://youtu.be/zraPvRdl13Q
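The text-to-shape retrieval step mentioned in the abstract can be pictured as a nearest-neighbor lookup in the joint embedding space. The following is a minimal sketch under stated assumptions, not the authors' implementation: the embeddings are hand-written toy vectors rather than outputs of learned encoders, cosine similarity is an assumed choice of metric, and the names `cosine_similarity`, `retrieve_shapes`, and the shape IDs are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_shapes(text_embedding, shape_embeddings, k=2):
    """Return the IDs of the k shapes whose embeddings are closest to the text.

    Assumes text and shapes were already mapped into the same joint space
    (in the paper, by learned encoders; here the vectors are toy values).
    """
    ranked = sorted(
        shape_embeddings.items(),
        key=lambda item: cosine_similarity(text_embedding, item[1]),
        reverse=True,
    )
    return [shape_id for shape_id, _ in ranked[:k]]


# Toy joint-space embeddings for three hypothetical shapes.
shape_embeddings = {
    "brown_wooden_chair": [0.9, 0.1, 0.0],
    "red_office_chair": [0.1, 0.9, 0.1],
    "glass_round_table": [0.0, 0.2, 0.9],
}

# Toy embedding of a description such as "a brown wooden chair".
query = [0.8, 0.2, 0.1]

top = retrieve_shapes(query, shape_embeddings, k=2)
print(top)
```

Because both modalities live in one space, the same lookup could be run in the other direction (shape to text) by swapping the roles of the query and the candidates.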


