Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices

02/12/2021
by Yuhong Song, et al.

This work proposes RT3, a pruning-based AutoML framework for run-time reconfigurability. RT3 enables large Transformer-based Natural Language Processing (NLP) models to be executed efficiently on resource-constrained mobile devices and reconfigured at run-time (i.e., switching models as hardware conditions change). Such reconfigurability is key to saving energy on battery-powered mobile devices, which widely use dynamic voltage and frequency scaling (DVFS) for hardware reconfiguration to prolong battery life. In this work, we explore a hybrid of block-structured pruning (BP) and pattern pruning (PP) for Transformer-based models and make a first attempt to combine hardware and software reconfiguration to maximize energy savings on battery-powered mobile devices. Specifically, RT3 integrates two levels of optimization. First, it uses an efficient BP as the first-step compression for resource-constrained mobile devices. Then, RT3 heuristically generates a shrunken search space based on the first-level optimization and uses reinforcement learning to search for multiple pattern sets with diverse sparsity for PP; these sets support lightweight software reconfiguration and correspond to the available frequency levels of DVFS (i.e., hardware reconfiguration). At run-time, RT3 can switch between the lightweight pattern sets within 45ms to meet the required real-time constraint at different frequency levels. Results further show that RT3 can prolong battery life by over 4x with less than 1 loss for Transformer and 1.5
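The two pruning levels and the run-time switch described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the column-norm criterion for BP, the example frequency levels, and the rule that a lower frequency maps to a sparser pattern set are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternSet:
    """Hypothetical container for one searched pattern set."""
    name: str
    sparsity: float  # fraction of weights removed by the patterns


def block_prune_columns(block, keep):
    """First level (BP), sketched: inside one weight block, keep the
    `keep` columns with the largest squared L2 norm and zero the rest.
    The norm criterion is an assumption, not taken from the paper."""
    n_cols = len(block[0])
    norms = [sum(row[c] ** 2 for row in block) for c in range(n_cols)]
    kept = set(sorted(range(n_cols), key=lambda c: norms[c], reverse=True)[:keep])
    return [[w if c in kept else 0.0 for c, w in enumerate(row)] for row in block]


def apply_pattern(block, mask):
    """Second level (PP), sketched: zero every weight whose mask entry is 0."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(block, mask)]


def select_pattern_set(freq_mhz, table):
    """Run-time switch, sketched: pick the pattern set registered for the
    highest frequency level not above `freq_mhz`; fall back to the lowest
    level if the current frequency is below every registered level."""
    levels = sorted(table)
    eligible = [f for f in levels if f <= freq_mhz]
    level = eligible[-1] if eligible else levels[0]
    return table[level]


# Hypothetical DVFS levels (MHz): lower frequency -> sparser pattern set (assumption).
TABLE = {
    600: PatternSet("sparse", 0.75),
    1200: PatternSet("medium", 0.50),
    1800: PatternSet("dense", 0.25),
}

weights = [[1.0, 0.1, 3.0],
           [2.0, 0.2, 0.5]]
after_bp = block_prune_columns(weights, keep=2)
after_pp = apply_pattern(after_bp, [[1, 0, 1], [0, 1, 1]])
chosen = select_pattern_set(1500, TABLE)
```

In this sketch the BP result is computed once, while only the mask used by `apply_pattern` changes when the frequency level changes, which is one way to read the abstract's description of the pattern sets as "lightweight" to switch.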


