Efficient Fine-Tuning of BERT Models on the Edge

05/03/2022
by Danilo Vucetic, et al.

Resource-constrained devices are increasingly the deployment targets of machine learning applications. Static models, however, do not always suffice for dynamic environments; on-device training allows models to adapt quickly to new scenarios. With the increasing size of deep neural networks, as exemplified by BERT and other natural language processing models, come increased resource requirements, namely memory, computation, energy, and time. Furthermore, training is far more resource-intensive than inference, so resource-constrained on-device learning is doubly difficult, especially with large BERT-like models. By reducing the memory usage of fine-tuning, pre-trained BERT models can become efficient enough to fine-tune on resource-constrained devices. We propose Freeze And Reconfigure (FAR), a memory-efficient training regime for BERT-like models that reduces the memory usage of activation maps during fine-tuning by avoiding unnecessary parameter updates. FAR reduces fine-tuning time on the DistilBERT model and CoLA dataset by 30%, while reductions in metric performance on the GLUE and SQuAD datasets are around 1%.


