AutoFedNLP: An efficient FedNLP framework

05/20/2022
by Dongqi Cai, et al.

Transformer-based pre-trained models have revolutionized NLP with superior performance and generality. Fine-tuning pre-trained models for downstream tasks often requires private data, for which federated learning is the de-facto approach (i.e., FedNLP). However, our measurements show that FedNLP is prohibitively slow due to the large model sizes and the resultant high network/computation cost. Towards practical FedNLP, we identify adapters, small bottleneck modules inserted at various model layers, as the key building blocks. A key challenge is to properly configure the depth and width of adapters, to which training speed and efficiency are highly sensitive. No silver-bullet configuration exists: the optimal choice varies across downstream NLP tasks, desired model accuracy, and client resources, and a non-optimal configuration can significantly slow down training. To automate adapter configuration, we propose AutoFedNLP, a framework that enhances existing FedNLP with two novel designs. First, AutoFedNLP progressively upgrades the adapter configuration throughout a training session. Second, AutoFedNLP continuously profiles future adapter configurations by allocating participant devices to trial groups. To minimize client-side computation, AutoFedNLP exploits the fact that a FedNLP client trains on the same samples repeatedly between consecutive changes of adapter configuration, and caches the computed activations on clients. Extensive experiments show that AutoFedNLP reduces FedNLP's model convergence delay to no more than several hours, which is up to 155.5× faster than vanilla FedNLP and 48× faster than strong baselines.
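To make the adapter idea concrete, the sketch below shows a minimal bottleneck adapter attached to a frozen transformer backbone in PyTorch. The class and function names (`Adapter`, `attach_adapters`), as well as the `bottleneck_width` and `layers` parameters, are hypothetical illustrations of the width/depth knobs discussed in the abstract, not the authors' implementation.

```python
# Minimal sketch of a bottleneck adapter (hypothetical names; not the
# authors' code). An adapter projects the hidden state down to a small
# bottleneck, applies a nonlinearity, projects back up, and adds a
# residual connection, so only the two small projections are trained.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    def __init__(self, hidden_size: int, bottleneck_width: int):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_width)  # down-projection
        self.up = nn.Linear(bottleneck_width, hidden_size)    # up-projection
        self.act = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen backbone's output intact.
        return hidden_states + self.up(self.act(self.down(hidden_states)))


def attach_adapters(model: nn.Module, layers: list[int], width: int, hidden_size: int) -> nn.ModuleDict:
    """Freeze the backbone and create one adapter per selected layer.

    `layers` (which layers receive adapters, i.e. the "depth" knob) and
    `width` (the bottleneck size) mirror the configuration knobs tuned
    in the paper.
    """
    for p in model.parameters():
        p.requires_grad = False  # only adapter parameters are trained
    return nn.ModuleDict({str(i): Adapter(hidden_size, width) for i in layers})
```

In a federated round, only these small adapter parameters would need to be trained and exchanged between clients and the server, which is why the choice of adapter depth and width dominates the network and computation cost that the abstract highlights.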


research
12/01/2022

AUG-FedPrompt: Practical Few-shot Federated NLP with Data-augmented Prompts

Transformer-based pre-trained models have become the de-facto solution f...
research
10/07/2022

Pre-trained Adversarial Perturbations

Self-supervised pre-training has drawn increasing attention in recent ye...
research
01/28/2023

AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning

Large pretrained language models have been widely used in downstream NLP...
research
07/15/2020

AdapterHub: A Framework for Adapting Transformers

The current modus operandi in NLP involves downloading and fine-tuning p...
research
09/21/2022

Federated Learning from Pre-Trained Models: A Contrastive Learning Approach

Federated Learning (FL) is a machine learning paradigm that allows decen...
research
12/13/2020

MiniVLM: A Smaller and Faster Vision-Language Model

Recent vision-language (VL) studies have shown remarkable progress by le...
research
12/16/2022

FewFedWeight: Few-shot Federated Learning Framework across Multiple NLP Tasks

Massively multi-task learning with large language models has recently ma...
