Romanian Multiword Expression Detection Using Multilingual Adversarial Training and Lateral Inhibition

04/22/2023
by Andrei-Marius Avram, et al.
Multiword expressions are a key ingredient for developing large-scale and linguistically sound natural language processing technology. This paper describes our improvements in automatically identifying Romanian multiword expressions on the corpus released for the PARSEME v1.2 shared task. Our approach takes a multilingual perspective, using the recently introduced lateral inhibition layer and adversarial training to boost the performance of the employed multilingual language models. With the help of these two methods, we improve the F1-score of XLM-RoBERTa by approximately 2.7% on unseen multiword expressions, the main task of the PARSEME 1.2 edition. In addition, our results can be considered state-of-the-art (SOTA), as they outperform the previous results on Romanian obtained by the participants in this competition.


