SnakeVoxFormer: Transformer-based Single Image Voxel Reconstruction with Run Length Encoding

03/28/2023
by Jae Joong Lee, et al.

Deep learning-based 3D object reconstruction has achieved unprecedented results, and the transformer in particular has shown outstanding performance in many computer vision applications. We introduce SnakeVoxFormer, a novel transformer-based method for reconstructing a 3D object in voxel space from a single image. The input to SnakeVoxFormer is a 2D image, and the output is a 3D voxel model. The key novelty of our approach is a run-length encoding (RLE) that traverses the voxel space like a snake and encodes wide spatial differences into a 1D structure suitable for transformer encoding. We then use dictionary encoding to convert the discovered RLE blocks into tokens for the transformer. The 1D representation is a lossless 3D shape compression method that uses only about 1% of the original data size. We show how different voxel traversal strategies affect encoding and reconstruction. We compare our method with state-of-the-art methods for 3D voxel reconstruction from images, and it improves on them by at least 2.8% and up to 19.8%.
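
The encoding steps named in the abstract (snake traversal, run-length encoding, dictionary tokens) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the row-alternating traversal order, the function names, and the choice to treat each (value, run length) pair as one dictionary entry are all hypothetical.

```python
# Hypothetical sketch: snake traversal + RLE + dictionary tokens for a voxel grid.
# The grid is a nested list vox[depth][height][width] of 0/1 occupancy values.

def snake_flatten(vox):
    """Flatten the grid row by row, reversing every other row (assumed order)."""
    flat = []
    row_index = 0
    for plane in vox:
        for row in plane:
            flat.extend(row if row_index % 2 == 0 else reversed(row))
            row_index += 1
    return flat

def snake_unflatten(flat, depth, height, width):
    """Invert snake_flatten by undoing the reversal of odd-numbered rows."""
    vox = []
    row_index = 0
    for _ in range(depth):
        plane = []
        for _ in range(height):
            start = row_index * width
            row = list(flat[start:start + width])
            if row_index % 2 == 1:
                row.reverse()
            plane.append(row)
            row_index += 1
        vox.append(plane)
    return vox

def rle_encode(seq):
    """Collapse a 1D sequence into (value, run_length) pairs."""
    runs = []
    for value in seq:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the 1D sequence."""
    seq = []
    for value, length in runs:
        seq.extend([value] * length)
    return seq

def build_dictionary(runs):
    """Assign an integer token to each distinct run (assumed tokenization)."""
    vocab = {}
    for run in runs:
        vocab.setdefault(run, len(vocab))
    return vocab

# Example: a 4x4x4 grid containing a 2x2x2 solid cube.
D = H = W = 4
grid = [[[1 if 1 <= d <= 2 and 1 <= h <= 2 and 1 <= w <= 2 else 0
          for w in range(W)] for h in range(H)] for d in range(D)]
runs = rle_encode(snake_flatten(grid))
vocab = build_dictionary(runs)
tokens = [vocab[run] for run in runs]  # 1D token sequence for a transformer
restored = snake_unflatten(rle_decode(runs), D, H, W)
```

Because both the traversal and the run-length step are invertible, `restored` equals `grid`, which is what makes the 1D representation lossless in this sketch.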

research
11/04/2021

Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image

Inferring 3D locations and shapes of multiple objects from a single 2D i...
research
10/17/2021

3D-RETR: End-to-End Single and Multi-View 3D Reconstruction with Transformers

3D reconstruction aims to reconstruct 3D objects from 2D views. Previous...
research
06/01/2022

Extreme Floorplan Reconstruction by Structure-Hallucinating Transformer Cascades

This paper presents an extreme floorplan reconstruction task, a new benc...
research
05/17/2023

Variable Length Embeddings

In this work, we introduce a novel deep learning architecture, Variable ...
research
10/02/2020

Weight Encode Reconstruction Network for Computed Tomography in a Semi-Case-Wise and Learning-Based Way

Classic algebraic reconstruction technology (ART) for computed tomograph...
research
08/15/2016

Generative and Discriminative Voxel Modeling with Convolutional Neural Networks

When working with three-dimensional data, choice of representation is ke...
research
03/12/2020

SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting 1D Occupancy Segments From 2D Coordinates

Structure learning for 3D shapes is vital for 3D computer vision. State-...
