Improved Pretraining for Domain-specific Contextual Embedding Models

04/05/2020
by Subendhu Rongali et al.

We investigate methods to mitigate catastrophic forgetting during domain-specific pretraining of contextual embedding models such as BERT, DistilBERT, and RoBERTa. Recently proposed domain-specific models such as BioBERT, SciBERT, and ClinicalBERT are constructed by continuing the pretraining phase on a domain-specific text corpus. Such pretraining is susceptible to catastrophic forgetting, where the model forgets some of the information learned during general-domain pretraining. We propose the use of two continual learning techniques, rehearsal and elastic weight consolidation, to improve domain-specific training. Our results show that models trained with our proposed approaches better maintain their performance on general-domain tasks while outperforming domain-specific baseline models on downstream domain tasks.
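Rehearsal and elastic weight consolidation (EWC) are standard continual-learning techniques, so a rough illustration of how they could slot into continued pretraining may help. The PyTorch sketch below is not the authors' implementation: the helper `loss_fn` (a masked-language-modeling loss over a batch), `anchor_params` (detached copies of the pretrained weights), the data loaders, and the weighting hyperparameters are all assumptions, and the exact mixing strategy used in the paper may differ.

```python
import torch


def estimate_fisher(model, general_loader, loss_fn, n_batches=100):
    """Diagonal Fisher estimate: average squared gradients of the loss
    on general-domain batches, taken at the original pretrained weights."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()
              if p.requires_grad}
    seen = 0
    for batch in general_loader:
        if seen >= n_batches:
            break
        model.zero_grad()
        loss_fn(model, batch).backward()  # loss_fn is a hypothetical MLM-loss helper
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        seen += 1
    return {n: f / max(seen, 1) for n, f in fisher.items()}


def ewc_penalty(model, fisher, anchor_params, lam=0.1):
    """EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2,
    where theta* are the pretrained (general-domain) weights."""
    penalty = 0.0
    for n, p in model.named_parameters():
        if n in fisher:
            penalty = penalty + (fisher[n] * (p - anchor_params[n]) ** 2).sum()
    return 0.5 * lam * penalty


def train_step(model, optimizer, domain_batch, general_batch, loss_fn,
               fisher, anchor_params, lam=0.1, rehearsal_weight=1.0):
    """One continued-pretraining step combining the domain objective,
    rehearsal on a replayed general-domain batch, and the EWC penalty."""
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model, domain_batch)                              # domain-specific MLM loss
    loss = loss + rehearsal_weight * loss_fn(model, general_batch)   # rehearsal
    loss = loss + ewc_penalty(model, fisher, anchor_params, lam)     # elastic weight consolidation
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, `anchor_params` would be captured once before continued pretraining begins, e.g. `{n: p.detach().clone() for n, p in model.named_parameters()}`, and `estimate_fisher` would be run at the same point so that the EWC penalty anchors the model to its general-domain solution.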
