BERT WEAVER: Using WEight AVERaging to Enable Lifelong Learning for Transformer-based Models

02/21/2022
by Lisa Langnickel et al.

Recent developments in transfer learning have accelerated progress on natural language processing tasks. Performance, however, depends on high-quality, manually annotated training data. In the biomedical domain in particular, it has been shown that a single training corpus is not enough to learn generic models that predict well on new data. State-of-the-art models therefore need the capability of lifelong learning: improving as soon as new data become available, without retraining the whole model from scratch. We present WEAVER, a simple yet efficient post-processing method that infuses old knowledge into the new model, thereby reducing catastrophic forgetting. We show that applying WEAVER sequentially yields word embedding distributions similar to those obtained by combined training on all data at once, while being computationally more efficient. Because no data sharing is required, the presented method is also readily applicable to federated learning settings and can, for example, benefit the mining of electronic health records from different clinics.
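The core idea of a weight-averaging post-processing step can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual implementation: the function name `weave`, the interpolation factor `alpha`, and the plain-float stand-ins for model tensors are all assumptions made for the example.

```python
# Minimal sketch of averaging an old model's weights with a newly
# fine-tuned model's weights (hypothetical helper, not the paper's code).

def weave(old_weights, new_weights, alpha=0.5):
    """Elementwise interpolation of two weight dictionaries.

    alpha=0.5 gives a plain mean of old and new parameters; other
    values would weight old knowledge more or less strongly.
    """
    assert old_weights.keys() == new_weights.keys(), "models must share architecture"
    return {
        name: alpha * old_weights[name] + (1.0 - alpha) * new_weights[name]
        for name in old_weights
    }

# Toy usage with scalars standing in for parameter tensors:
old = {"layer.weight": 1.0, "layer.bias": 0.0}
new = {"layer.weight": 3.0, "layer.bias": 2.0}
merged = weave(old, new)
print(merged)  # {'layer.weight': 2.0, 'layer.bias': 1.0}
```

Applied after each round of fine-tuning on new data, such a merge folds the previous model's knowledge back in without requiring access to the earlier training data, which is what makes the approach compatible with federated settings.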

