AccSS3D: Accelerator for Spatially Sparse 3D DNNs

11/25/2020
by Om Ji Omer, et al.

Semantic understanding and completion of real-world scenes is a foundational primitive of 3D visual perception, widely used in high-level applications such as robotics, medical imaging, autonomous driving, and navigation. Due to the curse of dimensionality, the compute and memory requirements of 3D scene understanding grow cubically with voxel resolution, posing a major impediment to real-time, energy-efficient deployment. The spatial sparsity inherent in the 3D world, which arises from free space, is fundamentally different from the channel-wise sparsity that has been studied extensively. We present Accelerator for Spatially Sparse 3D DNNs (AccSS3D), the first end-to-end solution for accelerating 3D scene understanding by exploiting this ample spatial sparsity. As an algorithm-dataflow-architecture co-designed system specialized for spatially sparse 3D scene understanding, AccSS3D includes novel spatial-locality-aware metadata structures, a near-zero-latency, spatial-sparsity-aware dataflow optimizer, a surface-orientation-aware point cloud reordering algorithm, and a co-designed hardware accelerator for spatial sparsity that exploits data reuse through systolic and multicast interconnects. The SSpNNA accelerator core, together with 64 KB of L1 memory, requires 0.92 mm² of area in a 16 nm process at 1 GHz. Overall, AccSS3D achieves a 16.8x speedup and a 2232x energy efficiency improvement for 3D sparse convolution compared to an Intel-i7-8700K 4-core CPU, which translates to an 11.8x end-to-end 3D semantic segmentation speedup and a 24.8x energy efficiency improvement (iso technology node).
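
The contrast between cubic growth of the dense voxel grid and the much smaller set of occupied voxels can be illustrated with a toy sketch. This is not taken from the paper: it assumes a unit sphere as a stand-in for a scanned surface, and the names `occupied_voxels` and `occupancy_report` are hypothetical. It voxelizes surface samples at several resolutions and compares the dense voxel count with the number of voxels that actually contain geometry.

```python
import math


def occupied_voxels(resolution, samples=50000):
    """Voxelize points spread over a unit sphere (a toy stand-in for a
    scanned surface) and return the set of occupied voxel coordinates."""
    occupied = set()
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    for i in range(samples):
        # Fibonacci-spiral sampling gives roughly uniform surface coverage.
        z = 1.0 - 2.0 * (i + 0.5) / samples
        radius = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        x = radius * math.cos(theta)
        y = radius * math.sin(theta)
        # Map each coordinate from [-1, 1] to a voxel index in [0, resolution).
        voxel = tuple(
            min(int((c + 1.0) * 0.5 * resolution), resolution - 1)
            for c in (x, y, z)
        )
        occupied.add(voxel)
    return occupied


def occupancy_report(resolutions=(16, 32, 64)):
    """Return (resolution, dense_count, active_count, active_fraction) rows."""
    rows = []
    for res in resolutions:
        active = len(occupied_voxels(res))
        dense = res ** 3  # a dense grid grows cubically with resolution
        rows.append((res, dense, active, active / dense))
    return rows


if __name__ == "__main__":
    for res, dense, active, frac in occupancy_report():
        print(f"res={res:3d} dense={dense:7d} active={active:6d} fraction={frac:.4f}")
```

In this toy setting the occupied fraction shrinks as resolution increases, because a surface covers far fewer voxels than the volume enclosing it. Real scans will differ in the exact ratios.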

research
08/18/2023

SpOctA: A 3D Sparse Convolution Accelerator with Octree-Encoding-Based Map Search and Inherent Sparsity-Aware Processing

Point-cloud-based 3D perception has attracted great attention in various...
research
04/11/2022

SATA: Sparsity-Aware Training Accelerator for Spiking Neural Networks

Spiking Neural Networks (SNNs) have gained huge attention as a potential...
research
05/06/2022

OMU: A Probabilistic 3D Occupancy Mapping Accelerator for Real-time OctoMap at the Edge

Autonomous machines (e.g., vehicles, mobile robots, drones) require soph...
research
09/01/2020

TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference

TensorDash is a hardware level technique for enabling data-parallel MAC ...
research
08/21/2017

GraphR: Accelerating Graph Processing Using ReRAM

This paper presents GRAPHR, the first ReRAM-based graph processing accel...
research
05/12/2020

Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations

Personalized recommendations are the backbone machine learning (ML) algo...
research
04/24/2023

Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design

Novel view synthesis is an essential functionality for enabling immersiv...