Multilingual Sequence-to-Sequence Models for Hebrew NLP

12/19/2022
by Matan Eyal et al.

Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs has focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it does not lend itself well to sub-word prediction tasks, such as Named Entity Recognition, given the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable LMs for morphologically rich languages (MRLs) such as Hebrew. We demonstrate that by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a specialized, morpheme-based, separately fine-tuned decoder. Using this approach, our experiments show substantial improvements over previously published results on existing Hebrew NLP benchmarks. These results suggest that multilingual sequence-to-sequence models present a promising building block for NLP for MRLs.
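As a rough illustration of the text-to-text framing described in the abstract, the sketch below loads a publicly available mT5 checkpoint with Hugging Face Transformers and phrases a Hebrew morphological segmentation query as a prompted generation call. The checkpoint name, the "segment:" prefix, and the expected output format are assumptions for illustration, not the paper's exact setup; the model would need to be fine-tuned on labeled Hebrew data before such outputs are reliable.

```python
# Minimal sketch: casting a Hebrew NLP task (here, morphological segmentation)
# as text-to-text with a multilingual seq2seq model via Hugging Face Transformers.
# The checkpoint, task prefix, and output format are illustrative assumptions;
# the model must first be fine-tuned on labeled Hebrew data for useful outputs.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

model_name = "google/mt5-small"  # any mT5 size can be substituted here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)

# A Hebrew input sentence; the prefix tells the model which pipeline task to perform.
sentence = "הילד הלך לבית הספר"        # "The boy went to the school"
prompt = "segment: " + sentence          # hypothetical task prefix

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)

# After fine-tuning, the decoded string would contain the morpheme-segmented
# sentence, e.g. "ה ילד הלך ל בית ה ספר" for the example above.
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The same pattern extends to the other pipeline tasks discussed in the paper (e.g., NER) by changing the prefix and the target string the model is trained to emit, which is what removes the need for a specialized morpheme-based decoder.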

Related research

07/28/2022  Sequence to sequence pretraining for a less-resourced Slovenian language
Large pretrained language models have recently conquered the area of nat...

08/18/2021  De-identification of Unstructured Clinical Texts from Sequence to Sequence Perspective
In this work, we propose a novel problem formulation for de-identificati...

04/04/2020  Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models
This paper presents an empirical study of conversational question reform...

03/25/2023  Indian Language Summarization using Pretrained Sequence-to-Sequence Models
The ILSUM shared task focuses on text summarization for two major Indian...

10/06/2021  Sequence-to-Sequence Lexical Normalization with Multilingual Transformers
Current benchmark tasks for natural language processing contain text tha...

03/15/2022  Hyperdecoders: Instance-specific decoders for multi-task NLP
We investigate input-conditioned hypernetworks for multi-tasking in NLP,...

03/14/2020  Document Ranking with a Pretrained Sequence-to-Sequence Model
This work proposes a novel adaptation of a pretrained sequence-to-sequen...