RCT: Resource Constrained Training for Edge AI

by Tian Huang, et al.

Training neural networks on edge terminals is essential for edge AI computing, which needs to adapt to evolving environments. Quantised models can run efficiently on edge devices, but existing training methods for these compact models are designed to run on powerful servers with abundant memory and energy budgets. For example, the quantisation-aware training (QAT) method involves two copies of the model parameters, which is usually beyond the capacity of on-chip memory in edge devices. Data movement between off-chip and on-chip memory is energy demanding as well. These resource requirements are trivial for powerful servers but critical for edge devices. To mitigate these issues, we propose Resource Constrained Training (RCT). RCT keeps only a quantised model throughout training, so that the memory required for model parameters during training is reduced. It adjusts per-layer bitwidth dynamically in order to save energy when a model can learn effectively with lower precision. We carry out experiments with representative models and tasks in image applications and natural language processing. Experiments show that RCT saves more than 86% of the energy for General Matrix Multiply (GEMM) and more than 46% of the memory for model parameters, with limited accuracy loss. Compared with a QAT-based method, RCT saves about half of the energy spent on moving model parameters.
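The parameter-memory argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the 32-bit full-precision copy, the symmetric uniform quantiser, and the example layer sizes and bitwidths are all assumptions made for illustration, and the resulting numbers are not the paper's reported savings.

```python
def quantise(values, bits):
    """Symmetric uniform quantisation (an assumed scheme, for illustration).

    Maps floats to signed integers in [-qmax, qmax] and returns the
    integers together with the scale needed to recover approximate floats.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / qmax
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, scale


def param_bytes_two_copies(layers, fp_bits=32):
    """Parameter memory when a full-precision copy AND a quantised copy are kept.

    `layers` is a list of (num_params, bits) pairs; fp_bits=32 is an assumption.
    """
    return sum(n * (fp_bits + bits) / 8 for n, bits in layers)


def param_bytes_quantised_only(layers):
    """Parameter memory when only the quantised copy is kept."""
    return sum(n * bits / 8 for n, bits in layers)


# Hypothetical model: two layers with different per-layer bitwidths.
layers = [(1000, 8), (2000, 4)]
two_copies = param_bytes_two_copies(layers)
quantised_only = param_bytes_quantised_only(layers)
print(f"two copies: {two_copies:.0f} B, quantised only: {quantised_only:.0f} B")
```

Lowering the bitwidth of a layer shrinks only the quantised term, which is why the sketch keeps the bitwidth per layer rather than per model.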



