1. Introduction
Deep neural networks (DNNs) have become attractive solutions for many edge AI applications and have made remarkable progress in areas such as computer vision, natural language processing, health care, autonomous driving, and surveillance. Meanwhile, as the size and complexity of neural networks grow, training and deploying a DNN with a large number of parameters and complex data transmission on small, power-constrained edge devices, such as smartphones and wearable devices, becomes increasingly challenging (hao2021enabling; han2015deep; zhang2017machine). In this work, we focus on three primary challenges: ultra-low memory training, ultra-low bitwidth quantization, and ultra-low latency acceleration, and discuss our solutions for each of them.
First, there is an increasing demand for on-device machine learning model training, to preserve data privacy, enable model personalization and lifelong learning, and improve energy efficiency by avoiding massive data transmission to the cloud (7979979; wang2019e2). However, model training has a much larger memory requirement than inference, posing additional challenges for on-device training, since edge devices are usually equipped with limited memory capacity. Therefore, ultra-low memory training methods must be explored to enable on-device training. To this end, we present an end-to-end low-precision tensorized neural network training framework with orders-of-magnitude memory reduction (zhang2021fpga). The rank-adaptive tensorized training method employs a Bayesian approach for automatic tensor rank determination and model compression during training.
Second, to implement DNNs on memory-constrained edge devices, pruning and quantization are promising techniques to reduce the number of weights and the data bitwidth in DNN models, with the extreme case of quantizing the weights down to binary/ternary representations (han2015deep; li2016ternary; Courbariauxbinary). These methods can dramatically reduce the network size as well as the number of multiplications during model execution. Given the tight memory and computing resource budget on the edge, ultra-low bitwidth quantization methods are especially attractive. However, ultra-low bitwidth quantization can easily cause significant degradation of the model accuracy, making such aggressive quantization challenging. To address this challenge, we present a novel ternary weight quantization method built on a vectorized loss function, achieving state-of-the-art accuracy under the same compression ratio (gong2020vecq).
Third, for efficient DNN deployment on edge devices, FPGAs are attractive platforms compared with CPUs, GPUs, and digital signal processors (DSPs) (obtract; zhang2018dnnbuilder; qiu2016going). FPGAs provide the flexibility to be configured as domain-specific architectures that can meet various implementation requirements, such as ultra-low latency, on edge devices. In addition, modern SoC FPGAs integrate low-power processors and sufficient interfaces to support widely used sensors for Internet-of-Things (IoT) applications. We present the first instruction-based ternarized low-latency deep learning accelerator with high performance, low resource utilization, and high flexibility across DNN models (chen2019tdla).
The remainder of this paper is organized as follows. Section 2 introduces our low-memory rank-adaptive on-device training framework; Section 3 introduces our low-bitwidth DNN quantization solution; Section 4 introduces our low-latency DNN accelerator design. In Section 5 we demonstrate the effectiveness of our proposed methods, followed by conclusions and future work in Section 6.
2. Ultra-Low Memory Training
The large number of model parameters consumes massive computing and memory resources, which prevents direct training of neural networks on edge devices. A promising technique for reducing model parameters is low-rank tensor decomposition (kolda2009tensor; oseledets2011tensor). This method has achieved great success in post-training compression and fixed-rank training (zhou2019tensor; calvi2019tucker; yin2020compressing; tjandra2017compressing; lebedev2014speeding; novikov2015tensorizing; garipov2016ultimate). However, several fundamental issues need to be addressed for on-device one-shot training:

- Firstly, a rank-adaptive training framework is needed to avoid a combinatorial search over tensor ranks and multiple training runs.
- Secondly, hardware-friendly tensor algorithms should be developed to facilitate their implementation on edge devices.

In this section, we summarize our recent work at the algorithm (hawkins2019bayesian; hawkins2020towards) and hardware (zhang2021fpga) levels to address these challenges.
2.1. Bayesian Tensorized Training Models
2.1.1. Low-rank tensor representation.
In many cases we can describe a neural network with far fewer parameters via low-rank tensors. Consider a weight matrix W ∈ ℝ^{M×N} as an example (other parameters, such as convolutional filters and embedding tables, can be handled similarly). We first fold W into a high-dimensional tensor of size m_1 × n_1 × ⋯ × m_d × n_d, where M = ∏_{i=1}^{d} m_i and N = ∏_{i=1}^{d} n_i. Then, we describe the tensor with low-rank tensor factors. This can be done in various low-rank tensor decomposition formats, as shown in Fig. 1 (hawkins2020towards), where G^{(i)} denotes the associated tensor factors. For large fully connected layers and embedding tables, the tensor-train matrix (TTM) format turns out to be highly effective (hawkins2020towards). In the TTM format, each G^{(i)} ∈ ℝ^{r_i × m_i × n_i × r_{i+1}} is an order-4 TTM core. The vector (r_1, r_2, …, r_{d+1}) with r_1 = r_{d+1} = 1 holds the tensor ranks that determine the model complexity. With low-rank tensors, one may reduce the number of model parameters from an exponential function of d to a linear one.
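As a quick sanity check of these savings, the following sketch counts the parameters of a TTM factorization against the dense layer it replaces. The folding sizes and ranks below are illustrative choices, not values from the paper.

```python
from math import prod

def ttm_param_count(ms, ns, ranks):
    """Parameters in a TTM factorization whose cores have shape
    (r_i, m_i, n_i, r_{i+1}); ranks has length d+1 with ranks[0] = ranks[-1] = 1."""
    d = len(ms)
    assert len(ranks) == d + 1 and ranks[0] == ranks[-1] == 1
    return sum(ranks[i] * ms[i] * ns[i] * ranks[i + 1] for i in range(d))

# A 1024 x 1024 dense layer folded with m = n = (4, 8, 8, 4):
ms = ns = (4, 8, 8, 4)
dense = prod(ms) * prod(ns)                     # 1,048,576 parameters
ttm = ttm_param_count(ms, ns, (1, 8, 8, 8, 1))  # 8,448 parameters
print(dense, ttm, dense // ttm)                 # roughly a 124x reduction
```

Lowering the internal ranks shrinks the factor count further, which is exactly what the automatic rank determination below exploits during training.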
2.1.2. Bayesian Tensorized End-to-End Training.
Despite the high compression ratios of tensor methods, determining the tensor rank in advance is very hard (hillar2013most). This is further complicated by the nonlinear forward model of neural networks, which has prevented tensorized one-shot on-device training in previous works. We have developed two Bayesian models to address this issue:

- Stein Variational Inference for the TTM Format. In (hawkins2019bayesian), we considered the TTM format. We model each slice of a TTM core with a zero-mean Gaussian prior density. We further control the variance with two tunable Gamma hyperpriors to enforce low tensor ranks. The actual tensor rank is decided jointly by the training data and the rank-controlling hyperparameters. Starting from an initial rank parameter, we can learn the actual rank, leading to further model compression during training. This method uses Stein variational inference (liu2016stein) to compute the posterior density for small or medium-size neural networks.
- Scalable SVI for One-Shot Tensorized Training. In (hawkins2020towards), we developed a more generic and efficient Bayesian model for tensorized training. This work can handle the CP, Tucker, TT, and TTM formats. It uses Gaussian priors to model the low-rank tensor factors, and Half-Cauchy or Log-Uniform hyperpriors to control tensor ranks. We improved stochastic variational inference (SVI) (hoffman2013stochastic) in two ways. Firstly, we simplify the posterior density of the rank-controlling hyperparameters to a Delta function to avoid gradient explosion. Secondly, we use a hybrid numerical/analytical update rule inside SVI. This highly scalable method can perform one-shot training of very large-scale neural networks with billions of model parameters.
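The effect of the rank-controlling hyperpriors is to drive the "variance" of unneeded rank components toward zero so they can be pruned during training. The toy below is only a stand-in for that idea, not the SVI algorithm itself: it reads an analogous per-rank energy signal off the singular values of a noisy low-rank matrix and thresholds it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic weight matrix with true rank 3, observed with small noise.
U, V = rng.standard_normal((64, 3)), rng.standard_normal((3, 64))
W = U @ V + 0.01 * rng.standard_normal((64, 64))

# In the Bayesian model, Half-Cauchy / Log-Uniform hyperpriors shrink the
# rank-controlling variances of redundant components toward zero. As a
# stand-in, we use the normalized per-component energy of W and count the
# components that survive a shrinkage threshold.
s = np.linalg.svd(W, compute_uv=False)
lam = s**2 / s.max()**2          # normalized per-rank "energy"
rank = int(np.sum(lam > 1e-3))   # surviving components
print(rank)                      # -> 3: the redundant ranks are pruned
```

In the actual method this pruning happens jointly with training, so the final model is compressed without a separate search over candidate ranks.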
2.1.3. Performance Summary.


- Our first method (hawkins2019bayesian) has been tested on a two-layer fully connected neural network, a 6-layer CNN, and a 110-layer residual neural network. It produces substantially more compact neural networks directly from training with little or no accuracy loss.

- Our recent work (hawkins2020towards) has been tested on a practical CNN, a large-scale NLP model (khrulkov2019tensorized), and an extremely large deep learning recommendation model (DLRM) (naumov2019deep) from Facebook. Orders-of-magnitude parameter reduction has been achieved in the training process. As shown in Table 1, training the DLRM with a standard method involves 4.25B variables. Our proposed method trains only 2.36M variables thanks to low-rank tensorization, and it further reduces the model parameters to 164K during training thanks to automatic rank determination. The overall parameter reduction ratio in the training process is about 25,900×.
             | standard | tensorization | rank-adaptive training
# parameters | 4.25B    | 2.36M         | 164K
compression  | N/A      | ≈1,800×       | ≈25,900×
Table 1. Number of DLRM training variables.
2.2. One-Shot On-Device Tensorized Training
To demonstrate on-device training, we have developed a low-precision tensorized training algorithm and its FPGA prototype (zhang2021fpga).
2.2.1. Low-Precision Tensorized Training.
We consider the maximum a posteriori probability (MAP) estimate of the Bayesian model in (hawkins2020towards). In this case, the training loss function includes two parts: the cross-entropy loss of a neural network classifier that depends on the TTM factors, and a regularization term induced by the Gaussian priors of the TTM factors as well as the Log-Uniform hyperpriors of the rank-controlling parameters. During training, both the TTM factors and the rank-controlling parameters are computed. To reduce the training cost on hardware, a low-precision tensorized training algorithm is developed based on the following key ideas:

- We use BinaryConnect (courbariaux2015binaryconnect) to compute low-precision TTM factors. BinaryConnect keeps the real values of all low-precision parameters in a buffer. In each iteration, the gradients are accumulated in the buffer, and the low-precision parameters are updated by quantizing the buffer. To handle the non-differentiable quantization function during training, we use the straight-through estimator (STE) (bengio2013estimating) to approximate its gradient.

- We use different precisions for different variables in the training process. Specifically, we use 4 bits for the TT factors, 8 bits for the activations and biases, and 16 bits for the gradients.
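The BinaryConnect-plus-STE loop can be sketched in a few lines. This is an illustrative NumPy model, not the FPGA implementation: the uniform quantizer and the toy loss are placeholders, and the STE appears as applying the gradient of the quantized weights directly to the full-precision buffer.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric fixed-point quantizer (illustrative scheme)."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1) + 1e-12
    return np.round(x / scale) * scale

# BinaryConnect-style update: the real-valued buffer w_buf accumulates
# gradients; the low-precision weights are re-quantized from it each step.
rng = np.random.default_rng(0)
w_buf = rng.standard_normal(8) * 0.1   # full-precision shadow weights
lr = 0.1
for step in range(3):
    w_q = quantize(w_buf, bits=4)      # 4-bit weights used in fwd/bwd
    grad = 2 * w_q                     # toy loss L = sum(w_q**2)
    # STE: d(quantize)/dw is taken as 1, so the gradient computed with
    # the quantized weights updates the full-precision buffer directly.
    w_buf -= lr * grad
print(quantize(w_buf, bits=4))
```

Only the quantized weights ever participate in forward/backward compute, which is what lets the hardware datapath stay at 4/8/16-bit precision.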
2.2.2. On-FPGA Training.
To demonstrate our training algorithms on edge devices, we have implemented an FPGA accelerator, shown in Fig. 2, for the low-precision tensorized training framework.


- Since our low-rank tensorization greatly reduces the number of training variables, all model parameters can be stored in on-chip BRAM. The data samples, activations, and gradients are stored in off-chip DRAM during training.

- The forward and backward propagations run on the FPGA programmable logic. The TTM factors and rank-controlling parameters are updated on the embedded ARM core.

- Three processing elements (PEs) are designed for the forward and backward propagation. PE1 and PE2 are shared by the forward and backward propagations and handle tensor contractions. PE1 is used for a two-index tensor contraction that involves the last dimension of two tensor variables; PE2 performs a tensor contraction along a single dimension that is not the last. PE3 computes the outer products in the backward propagation.
3. Ultra-Low Bitwidth Quantization
Neural network quantization employs low-precision (low-bitwidth) data for efficient model execution. In particular, ultra-low bitwidth quantization leads to much lower memory usage, lower complexity of the multiply-accumulate operations, and higher efficiency of model execution, making it an appealing technology for enabling AI on edge devices. However, aggressively lowering the data bitwidth (e.g., below 4 bits) is very challenging:


- It can easily cause large accuracy degradation (cong2019dac; chen2019tdla; gysel2016hardware), requiring a careful balance between computing efficiency and final model accuracy.

- Minimizing the quantization loss, i.e., the L2 distance between the original and the quantized values, is an appealing approach (han2015deep; ENN2017; TSQ2018; cheng2019uL2Q; li2016ternary; leng2018extremely) but has major drawbacks, such as easily falling into local optima and neglecting the distribution and correlations of the weights (gong2020vecq).
To address these challenges and achieve high-accuracy ultra-low bitwidth quantization, we proposed a quantization method, namely VecQ (gong2020vecq), with a novel vectorized loss function and an open-sourced training flow. VecQ can quantize model weights to anywhere from 1-bit to 16-bit representations and shows exceptional performance especially at ultra-low bitwidths, e.g., ternary values.
Vectorized Loss Function. We organize the weights within one DNN layer into a vector, where the vector length is the number of weights; we denote the original floating-point weight vector by w_f and the quantized weight vector by w_q. Typically, there is a scaling factor α such that w_f ≈ α·w_q, where each element of w_q is a low-bitwidth representation. Based on the vector representations, we define the quantization angle θ between w_f and w_q; Figure 3 shows an example. The objective is to find the optimal α and w_q such that α·w_q is as close to w_f as possible.
We propose the vectorized loss J_v to describe the quantization loss, and we minimize J_v during quantization. J_v is defined as the summation of the orientation loss J_o and the modulus loss J_m as follows:
(1)  J_v = J_o + J_m
The orientation loss J_o describes the angle between the two vectors, while the modulus loss J_m describes the squared distance between w_f and α·w_q. Notably, by minimizing J_v, we usually achieve a lower quantization loss compared with directly minimizing the L2 distance. More details can be found in the VecQ paper (gong2020vecq).
Vectorized Loss Minimization. We minimize J_v in two steps, namely steering and driving, as shown in Figure 4. First, the steering step minimizes the orientation loss to find the best w_q, since the quantization angle is independent of α. Second, the driving step minimizes the modulus loss to find the best scaling factor α. For a convolution layer, all the weights within the same layer share one scaling factor; for a depthwise convolution layer, each kernel has its own scaling factor to better represent its smaller number of weights. We also quantize the activations to fixed-point values during training to further reduce memory utilization.
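For ternary weights, the two steps above admit a compact sketch. The code below is our own illustration rather than the exact VecQ procedure: steering is approximated by sweeping a magnitude threshold to maximize the cosine of the quantization angle, and driving uses the least-squares optimal scale.

```python
import numpy as np

def vecq_ternary(w):
    """Two-step ternarization in the spirit of VecQ (illustrative sketch).
    Steering: pick the ternary vector with the smallest angle to w by
    sweeping the magnitude threshold. Driving: closed-form scale alpha
    minimizing ||w - alpha * w_q||^2."""
    a = np.abs(w)
    best, best_cos = None, -1.0
    for t in np.unique(a):                 # candidate thresholds
        wq = np.sign(w) * (a >= t)         # entries in {-1, 0, +1}
        cos = w @ wq / (np.linalg.norm(w) * np.linalg.norm(wq))
        if cos > best_cos:
            best, best_cos = wq, cos
    alpha = (w @ best) / (best @ best)     # driving step
    return alpha, best

w = np.array([0.9, -0.85, 0.1, -0.05, 0.8])
alpha, wq = vecq_ternary(w)
print(alpha, wq)   # large-magnitude weights -> +/-1, small ones -> 0
```

Because the angle does not depend on α, the threshold search and the scale fit decouple cleanly, which is the essence of the steering/driving split.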
Training Flow Integration.
We integrate our VecQ solution into the TensorFlow and PyTorch DNN training frameworks. For each layer, in the forward propagation, we first quantize the weights from w_f to α·w_q, and then use the quantized weights to compute the output activations, which are also quantized to fixed point. In the backward propagation, the gradients are likewise calculated with the quantized weights and used to update w_f.
4. Ultra-Low Latency Acceleration
The effectiveness of dedicated FPGA accelerators for DNN models has been widely demonstrated (qiu2016going; chen2019clouddnn). However, ultra-low latency accelerators for edge devices with an extremely limited resource budget still require careful design considerations.
Benefiting from our ultra-low bitwidth quantization solution VecQ, we proposed T-DLA, a lightweight ternarized accelerator overlay under strict resource constraints that achieves ultra-low latency on edge devices (chen2019tdla). The key features of T-DLA include:

- An optimized and expressive single instruction multiple data (SIMD) instruction set.
- A novel memory subsystem supporting effective data access for the computation modules.
- An efficient execution pipeline with low-latency computation modules.
Byte Idx | 7  | 6  | 5   | 4   | 3   | 2   | 1  | 0
Load     | OP | FS | SAM | SAL | DAM | DAL | KS | CC
Table 2. Load instruction encoding.
Notes: OP: operation code; FS: input feature size; SAM/SAL: source address most/least significant byte; DAM/DAL: destination address most/least significant byte; KS: kernel size; CC: in/out/activation/pooling selection.
SIMD Instruction Set. To support task scheduling for various DNN models, the instruction set of T-DLA is designed to be simple yet expressive enough for a large variety of DNNs. Each instruction is a 64-bit word (8 bytes) with the format shown in Table 2. The payloads of the different bytes are generated according to the layer configurations.
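The byte layout of Table 2 can be modeled as straightforward bit packing. The field values below are made up for illustration; only the byte positions follow the table.

```python
def pack_load(op, fs, sam, sal, dam, dal, ks, cc):
    """Pack one 64-bit T-DLA-style Load instruction.
    Byte layout follows Table 2 (byte 7 = OP down to byte 0 = CC);
    the concrete field values used below are hypothetical."""
    fields = [op, fs, sam, sal, dam, dal, ks, cc]
    assert all(0 <= f <= 0xFF for f in fields)   # each field is one byte
    word = 0
    for f in fields:          # OP ends up in the most significant byte
        word = (word << 8) | f
    return word

inst = pack_load(op=0x01, fs=32, sam=0x00, sal=0x40,
                 dam=0x00, dal=0x80, ks=3, cc=0x05)
print(hex(inst))   # -> 0x120004000800305
```

Keeping every field byte-aligned is what lets the hardware decode an instruction with simple wire slicing instead of shifters.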
Memory Subsystem. The memory subsystem contains two levels of storage to provide low-latency data fetching to the computation units: a simple input buffer and a variable-length line buffer. The simple input buffer is a BRAM buffer for temporary input feature storage; the variable-length line buffer serves the efficient streaming of data into the ternary computation array, as shown in Figure 5. It is designed to support a variable kernel size and a variable buffer depth, both specified by the instruction, to reduce the data transmission latency caused by fixed hardware paths. Once configured by the instruction, it provides a new set of data to the computation array every clock cycle.
Execution Pipeline and Computation Modules. T-DLA has four major computation modules: 1) a ternary computation array, 2) a set of adder trees, 3) activation and scaling modules, and 4) pooling modules.
Ternary computation array. With our ternarized model training via VecQ, the weights are represented by 2 bits using two's complement encoding, so that the multiplications in the convolution layers are simplified to selection and inversion logic. Benefiting from this simplified logic, we can exploit parallelism along the input channels, the output channels, and the kernel dimensions. The computation array is constructed from parallel computation units that process the corresponding number of input data simultaneously; the maximum numbers of input and output channels and the maximum allowable kernel size that the array can process are predefined. These array dimensions and the length of the line buffer are all configurable and can be determined based on the on-chip resource availability.
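The select/invert simplification is easy to see in software. The sketch below is a behavioral model of one ternary multiply-accumulate, not the RTL: with weights in {-1, 0, +1}, every "multiplication" reduces to dropping, passing, or negating the input.

```python
def ternary_mac(x, w):
    """Ternary multiply-accumulate: with w_i in {-1, 0, +1} the product
    x_i * w_i reduces to select (w_i == 0 drops the term) and invert
    (w_i == -1 negates it) -- the logic T-DLA builds from LUTs instead
    of multipliers. Behavioral model for illustration."""
    acc = 0
    for xi, wi in zip(x, w):
        if wi == 1:
            acc += xi          # select
        elif wi == -1:
            acc -= xi          # select + invert
        # wi == 0: term dropped, no datapath activity needed
    return acc

print(ternary_mac([3, -2, 5, 7], [1, -1, 0, -1]))   # 3 + 2 + 0 - 7 = -2
```

Because no multiplier is needed, each unit costs only a few LUTs, which is what makes the wide three-way parallelism affordable on a small FPGA.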
Adder tree. Since the computation array is built only from LUTs and FFs, we use DSPs to construct the adder trees. We take advantage of the SIMD mode of the DSPs, in which the internal carry propagation between segments is blocked to ensure independent operation. We therefore split the 48-bit input of a DSP into four 12-bit independent accumulation channels, so that a single DSP can perform additions for 8 pieces of input data and provide 4 outputs. Benefiting from the SIMD mode, the DSPs provide outputs every clock cycle once the internal register pipeline is filled. Furthermore, the clock frequency of the DSPs is configured to be higher than that of the other logic with the help of input/output asynchronous FIFOs, which further reduces the processing latency.
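The four-lane split can be modeled as one wide addition whose carries never cross 12-bit lane boundaries. This is a behavioral sketch of the SIMD-mode arithmetic, not a timing-accurate DSP model.

```python
LANES, LANE_BITS = 4, 12
MASK = (1 << LANE_BITS) - 1

def simd_add48(acc, vals):
    """Model of a DSP in SIMD mode: one 48-bit addition acts as four
    independent 12-bit accumulations because inter-segment carry
    propagation is blocked. acc is the packed 48-bit state."""
    assert len(vals) == LANES
    out = 0
    for i, v in enumerate(vals):
        lane = (acc >> (i * LANE_BITS)) & MASK
        lane = (lane + v) & MASK       # carry never reaches lane i+1
        out |= lane << (i * LANE_BITS)
    return out

def unpack(acc):
    return [(acc >> (i * LANE_BITS)) & MASK for i in range(LANES)]

acc = 0
for vals in [[100, 200, 300, 400], [5, 6, 7, 8]]:
    acc = simd_add48(acc, vals)
print(unpack(acc))   # [105, 206, 307, 408]
```

One physical adder thus accumulates four partial sums per cycle, which is why the DSP budget rather than the LUT budget sets the adder-tree throughput.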
Other modules. The ReLU activation module, the linear scaling module, and the max pooling module are all designed to complete in a single clock cycle to reduce the depth of the execution pipeline.
5. Experimental Results
In this section, we demonstrate the effectiveness of our methods, including the ultra-low memory training framework, the ultra-low bitwidth VecQ quantization, and the ultra-low latency T-DLA design. Notably, all these works are open-sourced.
Method              | Training acc. | Testing acc. | Model parameters | Memory in bits | Memory reduction
Vanilla             | 95.75%        | 89.27%       | —                | —              | N/A
Floating, w/o prior | 92.54%        | 88.03%       | —                | —              | 31.4
Fixed, w/o prior    | 88.31%        | 86.67%       | —                | —              | 243
Floating, w/ prior  | 90.17%        | 87.88%       | —                | —              | —
Fixed, w/ prior     | 85.45%        | 84.86%       | —                | —              | 292
Table 3. The on-device ultra-low memory training method on the Fashion-MNIST dataset.
5.1. On-Device Training
We implement our low-precision rank-adaptive tensorized training on an Avnet Ultra96-V2 FPGA board and use it to train a two-layer neural network for a classification task on the Fashion-MNIST dataset. There are 512 neurons in the hidden layer, and the weight matrix of each layer is folded into a higher-order tensor for tensorized training. We use the PyTorch and TensorLy modules to implement our training algorithm on the embedded processor, and we set the FPGA clock rate to 100 MHz. We compare the training methods with and without the low-rank TT priors. As shown in Table 3, our method achieves a substantial memory reduction for the model parameters compared with standard non-tensorized training. This on-FPGA training achieves a significant speedup and energy reduction compared with training on an embedded CPU.
5.2. Ultra-Low Bitwidth Quantization
Dataset               | MNIST  | CIFAR10  | CIFAR10  | ImageNet
Model                 | Lenet5 | Cifarnet | VGG-like | Resnet18
Floating              | 99.41  | 80.54    | 93.49    | 69.60
IJCNN'17 (efficient)  | 98.33  | —        | 87.89    | —
NIPS'16 (li2016ternary) | 99.35 | —       | 92.56    | 61.8
Ours                  | 99.5   | 78.7     | 92.94    | 68.23
Table 4. Top-1 classification accuracy (%).
We use MNIST, CIFAR10, and ImageNet to evaluate the ultra-low bitwidth quantization, VecQ. The evaluated DNN models include Lenet5, Cifarnet, a VGG-like network (hperf), and Resnet18.

Model            | Lenet5 | Cifarnet | VGG-like | Resnet18
Param. Total (M) | 0.43   | 0.279    | 5.35     | 11.69
Param. Conv (M)  | 0.025  | 0.258    | 1.114    | 11.177
Floating (MB)    | 1.644  | 1.065    | 20.408   | 44.594
Ours (MB)        | 0.393  | 0.081    | 4.284    | 3.154
Mem. Reduc. (%)  | 76.09  | 92.39    | 79.01    | 92.93
Table 5. Model size reduction with VecQ.
Configuration Parameters                |
Resource Utilization LUT/FF/BRAM/DSP (%) | 79 / 47.47 / 68.93 / 91.82
Clock Frequency Logic / Adder (MHz)      | 125 / 250
Peak Performance (GOPS)                  | 400
DNN Model    | Lenet5 | Cifarnet | VGG-like | Resnet18
Latency (ms) | 0.016  | 0.063    | 2.12     | 48.8
Table 6. T-DLA configuration, resource utilization, and latency.
Dataset  | Design          | Model    | Acc. (%) | F., W. (bits) | fps     | Platform
MNIST    | (finn)          | MFC-max  | 97.69    | 1, 1          | 6238000 | ZC706
MNIST    | (impl16)        | Lenet5   | —        | 8, 3          | 70000   | ZC706
MNIST    | Ours            | Lenet5   | 99.5     | 8, 2          | 62051.1 | Zedboard
CIFAR10  | (finn)          | VGG-like | 80.1     | 24, 1         | 21900   | ZC706
CIFAR10  | (fcfree)        | VGG-like | 81.8     | 1, 1          | 420     | Zedboard
CIFAR10  | (hperf)         | VGG-like | 86.71    | 8, 2          | 27043   | VC709
CIFAR10  | (accbnn)        | VGG-like | 88.68    | 1, 1          | 168     | Zedboard
CIFAR10  | Ours            | VGG-like | 89.08    | 8, 2          | 457     | Zedboard
ImageNet | (li2016ternary) | Resnet18 | 65.44    | FP32, FP32    | 1.545   | Xeon¹
ImageNet | (li2016ternary) | Resnet18 | 65.44    | FP32, FP32    | 387.597 | 1080Ti²
ImageNet | Ours            | Resnet18 | 68.23    | 8, 2          | 20.48   | Zedboard
¹ Xeon: Xeon E5-2630 v3; ² 1080Ti: Nvidia 1080 Ti
Table 7. Comparison with existing designs.
5.2.1. Classification accuracy
The classification accuracies on the different datasets are shown in Table 4. For simplicity, we only show the top-1 accuracy. Compared with the floating-point models ("Floating" in the table), the classification accuracy using ternary weights and quantized scalars and activations shows negligible degradation. VecQ also achieves superior accuracy compared with recent works (efficient; li2016ternary), in which only the weights are ternarized but not the scalars and activations. Our proposed method shows better accuracy for Resnet18 on the ImageNet dataset. These results demonstrate the scalability and stability of VecQ, especially in aggressive low-bitwidth quantization scenarios.
5.2.2. Model size reduction
VecQ also greatly reduces the memory footprint (Mem. Reduc.), as shown in Table 5. A ternary weight occupies only 2 bits, whereas the original floating-point value requires 32 bits. As shown in Table 5, for convolution layers, VecQ compresses the parameters nearly to the theoretical limit (almost 16× reduction). We quantize the last FC layer to 12 bits to maintain accuracy, so networks with fewer or no FC layers, such as Cifarnet and Resnet18, have a higher compression ratio. Specifically, VecQ reduces the size of the floating-point Resnet18 by up to 92.93% (14.14×).
5.3. Ultra-Low Latency Acceleration
We use the models quantized by VecQ to evaluate our T-DLA accelerator design in terms of accuracy and frames per second (fps). The measurements of the original models are taken on a server with two Intel Xeon E5-2630 v3 CPUs and one Nvidia 1080 Ti GPU. T-DLA is implemented on a Xilinx Zedboard FPGA, which is suitable for edge applications with very limited logic resources. It has an on-chip dual-core ARM Cortex-A9, 53.2K LUTs, 106.4K FFs, 140 BRAM blocks of 36Kb each, and 220 DSPs. The Vivado Design Suite 2019.2 is used for system implementation.
5.3.1. Hardware Resource and Processing Latency Evaluation
We choose an accelerator configuration that fully utilizes the given resources, shown in Table 6, together with the execution latency of the different models under this configuration. We only show the most important configuration parameters, including the computation-array dimensions and the quantized bitwidth of the activations. As can be seen in Table 6, T-DLA with a customized configuration uses almost all the resources, especially the DSPs. The targeted FPGA can support up to 250 MHz for the DSPs, which is twice the frequency of the other logic, benefiting from the ternary computation array and the independent clock design of the adder trees.
5.3.2. Performance Comparison
We compare T-DLA in terms of accuracy and fps with existing designs that use either the same DNN model or the same dataset. The results are shown in Table 7. For the MNIST dataset, the design in (finn) shows a higher fps because the DNN model it uses is simpler and the ZC706 platform has far more resources than ours. However, our implementation on the Zedboard reaches a comparable fps (62051) to a design with 3-bit weights on the ZC706 platform (impl16) (70000). On the CIFAR10 dataset, our design shows a dominant accuracy advantage among all the VGG-like models. On the ImageNet dataset, we directly compare our results with the floating-point version. T-DLA shows a longer execution latency than the GPU but outperforms the CPU by about 13×.
6. Conclusions
In this paper, we summarized our recent efforts toward efficient on-device AI, covering both training and inference. We focused on three major challenges of edge AI development. First, we presented on-device training with ultra-low memory usage by proposing a novel rank-adaptive tensorized neural network model, which offers orders-of-magnitude memory reduction during training. Second, we introduced VecQ, a novel quantization method that supports ultra-low bitwidth quantization with negligible accuracy degradation. Third, we presented T-DLA, an ultra-low latency DNN accelerator design for ternarized DNNs that achieves state-of-the-art performance. Building on these results, we expect more research breakthroughs to boost the development and deployment of edge AI.