Multilingual Controllable Transformer-Based Lexical Simplification

07/05/2023
by Kim Cheng Sheang et al.

Text is by far the most ubiquitous source of knowledge and information and should be made easily accessible to as many people as possible; however, texts often contain complex words that hinder reading comprehension and accessibility. Suggesting simpler alternatives for complex words without compromising meaning would therefore help convey the information to a broader audience. This paper proposes mTLS, a multilingual controllable Transformer-based Lexical Simplification (LS) system fine-tuned with the T5 model. The novelty of this work lies in the use of language-specific prefixes, control tokens, and candidates extracted from pre-trained masked language models to learn simpler alternatives for complex words. Evaluation results on three well-known LS datasets (LexMTurk, BenchLS, and NNSEval) show that our model outperforms previous state-of-the-art models such as LSBert and ConLS. Moreover, a further evaluation of our approach on the recent TSAR-2022 multilingual LS shared-task dataset shows that our model performs competitively against the participating systems for English LS and even outperforms the GPT-3 model on several metrics. Our model also obtains performance gains for Spanish and Portuguese.
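To make the described setup concrete, below is a minimal sketch in Python (using the Hugging Face transformers library) of how a language-prefixed, control-token input with masked-LM candidates could be assembled and passed to a T5 model for substitute generation. The prefix wording, control-token names, ratio values, and checkpoint name are illustrative assumptions, not the exact format used by mTLS.

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Checkpoint name is assumed; the paper fine-tunes a T5 model.
model_name = "t5-base"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

def build_input(language, sentence, complex_word, mlm_candidates,
                word_length_ratio=0.8, word_rank_ratio=0.6):
    """Compose the source sequence: language-specific prefix + control tokens
    + sentence with the complex word marked + candidates from a masked LM.
    Token names (<WL_*>, <WR_*>) and the marking scheme are hypothetical."""
    control = f"<WL_{word_length_ratio}> <WR_{word_rank_ratio}>"
    marked = sentence.replace(complex_word, f"[{complex_word}]")
    candidates = " ".join(mlm_candidates)
    return f"simplify {language}: {control} {marked} candidates: {candidates}"

source = build_input(
    language="en",
    sentence="The medication can alleviate the symptoms.",
    complex_word="alleviate",
    mlm_candidates=["ease", "reduce", "relieve"],  # e.g. from a BERT-style masked LM
)
inputs = tokenizer(source, return_tensors="pt")
outputs = model.generate(**inputs, num_beams=5, num_return_sequences=5, max_new_tokens=8)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))

In this reading, the language prefix selects the target language, the control tokens steer properties of the generated substitute (such as relative word length or frequency rank), and the appended masked-LM candidates give the model explicit simpler alternatives to rank and rewrite.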
