Looking at words and points with attention: a benchmark for text-to-shape coherence

09/14/2023
by Andrea Amaduzzi, et al.

While text-conditional 3D object generation and manipulation have seen rapid progress, the evaluation of coherence between generated 3D shapes and input textual descriptions lacks a clear benchmark. The reason is twofold: (a) the low quality of the textual descriptions in the only publicly available dataset of text-shape pairs; (b) the limited effectiveness of the metrics used to quantitatively assess such coherence. In this paper, we propose a comprehensive solution that addresses both weaknesses. First, we employ large language models to automatically refine the textual descriptions associated with shapes. Second, we propose a quantitative metric that assesses text-to-shape coherence through cross-attention mechanisms. To validate our approach, we conduct a user study and quantitatively compare our metric with existing ones. The refined dataset, the new metric, and a set of text-shape pairs validated by the user study constitute a novel, fine-grained benchmark that we publicly release to foster research on the text-to-shape coherence of text-conditioned 3D generative models. The benchmark is available at https://cvlab-unibo.github.io/CrossCoherence-Web/.


