DistilCamemBERT: a distillation of the French model CamemBERT

05/23/2022
by Cyrile Delestre, et al.

Modern Natural Language Processing (NLP) models based on the Transformer architecture represent the state of the art in terms of performance on a wide variety of tasks. However, these models are complex, comprising several hundred million parameters even for the smallest of them. This can hinder their adoption in industry, making it difficult to scale them to reasonable infrastructure and/or to comply with societal and environmental responsibilities. To address this, we present in this paper a model that drastically reduces the computational cost of a well-known French model (CamemBERT) while preserving good performance.
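For context, here is a minimal usage sketch with the Hugging Face transformers library, assuming the distilled weights are published on the Hugging Face Hub under the name cmarkea/distilcamembert-base (an identifier not given in this abstract, so treat it as an assumption):

    # Minimal sketch: masked-token prediction with the distilled model.
    # The checkpoint name below is assumed, not stated in the abstract.
    from transformers import pipeline

    # The fill-mask pipeline loads the tokenizer and masked-LM head together.
    fill_mask = pipeline("fill-mask", model="cmarkea/distilcamembert-base")

    # CamemBERT-family tokenizers use "<mask>" as the mask token.
    for pred in fill_mask("Le camembert est un fromage <mask>."):
        print(f"{pred['token_str']!r}: {pred['score']:.3f}")

Each prediction carries a candidate token and its probability, so in principle the distilled model can stand in wherever the full CamemBERT fill-mask head was used, at a fraction of the inference cost.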


