Lessons Learned from Applying off-the-shelf BERT: There is no Silver Bullet

09/15/2020
by Victor Makarenkov, et al.

One of the challenges in the NLP field is training large classification models, a task that is both difficult and tedious. It is even harder when GPU hardware is unavailable. The increased availability of pre-trained and off-the-shelf word embeddings, models, and modules aims at easing the process of training large models and achieving competitive performance. We explore the use of off-the-shelf BERT models, share the results of our experiments, and compare their performance to that of LSTM networks and simpler baselines. We show that the complexity and computational cost of BERT do not guarantee enhanced predictive performance in the classification tasks at hand.
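As a minimal sketch of the workflow the abstract refers to, the snippet below fine-tunes an off-the-shelf pre-trained BERT checkpoint for text classification with the Hugging Face Transformers library. The checkpoint name (bert-base-uncased), the two-label setup, and the toy data are illustrative assumptions, not the authors' actual configuration or code.

# Minimal sketch (not the authors' code): fine-tuning an off-the-shelf BERT
# checkpoint for text classification. Model name, labels, and data are
# illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

texts = ["an example document", "another example document"]
labels = torch.tensor([0, 1])

# Tokenize the batch and run one training step on top of the pre-trained encoder.
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()

# At inference time, class probabilities come from the softmax over the logits.
model.eval()
with torch.no_grad():
    probs = torch.softmax(model(**batch).logits, dim=-1)
print(probs)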


