BERTino: an Italian DistilBERT model

03/31/2023
by Matteo Muffo, et al.

The recent introduction of Transformer language representation models has enabled great improvements in many natural language processing (NLP) tasks. However, while the performance achieved by this kind of architecture is impressive, its usability is limited by the large number of parameters in the network, which results in high computational and memory demands. In this work we present BERTino, a DistilBERT model that aims to be the first lightweight alternative to the BERT architecture specifically for the Italian language. We evaluated BERTino on the Italian ISDT, Italian ParTUT, and Italian WikiNER tasks, as well as a multiclass classification task, obtaining F1 scores comparable to those of a BERTBASE model with a remarkable improvement in training and inference speed.
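As a sketch of how such a lightweight model could be used in practice, the snippet below loads a DistilBERT checkpoint with the Hugging Face transformers library and extracts contextual embeddings for an Italian sentence. The model identifier "indigo-ai/BERTino" is an assumption about where the checkpoint is published on the Hugging Face Hub; substitute the actual name if it differs.

    # Minimal sketch: loading a DistilBERT checkpoint such as BERTino
    # with Hugging Face transformers. The identifier "indigo-ai/BERTino"
    # is an assumption; replace it with the actual Hub name if needed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("indigo-ai/BERTino")
    model = AutoModel.from_pretrained("indigo-ai/BERTino")

    # Encode an Italian sentence and extract contextual embeddings.
    inputs = tokenizer("Il gatto dorme sul divano.", return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)

For a downstream task like the ones evaluated in the paper (POS tagging, NER, or sentence classification), the same checkpoint would instead be loaded through a task-specific head, e.g. AutoModelForTokenClassification or AutoModelForSequenceClassification, and fine-tuned on the labeled data.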


