BERTino: an Italian DistilBERT model

03/31/2023
by Matteo Muffo, et al.

The recent introduction of Transformer language representation models has enabled great improvements in many natural language processing (NLP) tasks. However, while the performance achieved by this kind of architecture is impressive, its usability is limited by the large number of parameters in the network, which results in high computational and memory demands. In this work we present BERTino, a DistilBERT model that aims to be the first lightweight alternative to the BERT architecture specifically for the Italian language. We evaluated BERTino on the Italian ISDT, Italian ParTUT, and Italian WikiNER tasks, as well as a multiclass classification task, obtaining F1 scores comparable to those of a BERTBASE model with a remarkable improvement in training and inference speed.
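As a sketch of how such a lightweight model could be used in practice, the snippet below loads a DistilBERT checkpoint with the Hugging Face transformers library and extracts contextual embeddings for an Italian sentence. The model identifier "indigo-ai/BERTino" is an assumption about where the checkpoint is published on the Hugging Face Hub; substitute the actual name if it differs.

    # Minimal sketch: loading a DistilBERT checkpoint such as BERTino
    # with Hugging Face transformers. The identifier "indigo-ai/BERTino"
    # is an assumption; replace it with the actual Hub name if needed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("indigo-ai/BERTino")
    model = AutoModel.from_pretrained("indigo-ai/BERTino")

    # Encode an Italian sentence and extract contextual embeddings.
    inputs = tokenizer("Il gatto dorme sul divano.", return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)

For a downstream task like the ones evaluated in the paper (POS tagging, NER, or sentence classification), the same checkpoint would instead be loaded through a task-specific head, e.g. AutoModelForTokenClassification or AutoModelForSequenceClassification, and fine-tuned on the labeled data.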


