pNLP-Mixer: an Efficient all-MLP Architecture for Language

02/09/2022
by Francesco Fusco, et al.

Large pre-trained language models have drastically changed the natural language processing (NLP) landscape. Nowadays, they are the go-to framework for tackling diverse NLP tasks, even with a limited number of annotations. However, using those models in production, either in the cloud or at the edge, remains a challenge due to their memory footprint and/or inference costs. As an alternative, recent work on efficient NLP has shown that small weight-efficient models can reach competitive performance at a fraction of the cost. Here, we introduce pNLP-Mixer, an embedding-free model based on the MLP-Mixer architecture that achieves high weight-efficiency thanks to a novel linguistically informed projection layer. We evaluate our model on two multilingual semantic parsing datasets, MTOP and multiATIS. On MTOP, our pNLP-Mixer almost matches the performance of mBERT, which has 38 times more parameters, and outperforms the state of the art among tiny models (pQRNN) with 3 times fewer parameters. On a long-sequence classification task (Hyperpartisan), our pNLP-Mixer without pretraining outperforms RoBERTa, which has 100 times more parameters, demonstrating the potential of this architecture.
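To make the "all-MLP" idea concrete, the following is a minimal NumPy sketch of a single generic MLP-Mixer block, the building block the abstract says pNLP-Mixer is based on. It is not the authors' implementation: the projection layer is omitted because the abstract does not describe it, the layer normalization has no learned scale or shift, and all names and sizes (`mixer_block`, `init_mlp`, `seq_len`, `channels`, the hidden widths) are hypothetical choices for illustration.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-6):
    # Normalize over the last axis (simplified: no learned scale/shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def init_mlp(rng, dim, hidden):
    # Two-layer MLP mapping dim -> hidden -> dim (hypothetical init scale).
    return (rng.normal(0.0, 0.02, (dim, hidden)), np.zeros(hidden),
            rng.normal(0.0, 0.02, (hidden, dim)), np.zeros(dim))

def mlp(x, w1, b1, w2, b2):
    return gelu(x @ w1 + b1) @ w2 + b2

def mixer_block(x, token_params, channel_params):
    # x has shape (seq_len, channels).
    # Token mixing: transpose so the MLP acts along the sequence axis.
    y = layer_norm(x).T                       # (channels, seq_len)
    x = x + mlp(y, *token_params).T           # residual connection
    # Channel mixing: the MLP acts along the feature axis of each token.
    x = x + mlp(layer_norm(x), *channel_params)
    return x

def count_params(*param_sets):
    # Total number of scalar weights in the given MLP parameter tuples.
    return sum(p.size for params in param_sets for p in params)

# Illustrative sizes only; these are assumptions, not values from the paper.
seq_len, channels = 8, 16
rng = np.random.default_rng(0)
token_params = init_mlp(rng, seq_len, 32)
channel_params = init_mlp(rng, channels, 32)
features = rng.normal(size=(seq_len, channels))
out = mixer_block(features, token_params, channel_params)
print(out.shape, count_params(token_params, channel_params))
```

Because a block of this kind contains only small dense matrices and no embedding table, its parameter count depends on the sequence length and feature width rather than on vocabulary size, which is consistent with the weight-efficiency argument made above.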


