Cross-Lingual Transfer Learning for Complex Word Identification

10/02/2020
by George-Eduard Zaharia, et al.

Complex Word Identification (CWI) is the task of detecting hard-to-understand words, or groups of words, in texts from different areas of expertise. The purpose of CWI is to highlight problematic structures that non-native speakers would usually find difficult to understand. Our approach uses zero-shot, one-shot, and few-shot learning techniques, alongside state-of-the-art solutions for Natural Language Processing (NLP) tasks (i.e., Transformers). We aim to provide evidence that the proposed models can learn the characteristics of complex words in a multilingual environment by relying on the CWI shared task 2018 dataset, which is available for four languages (English, German, Spanish, and French). In the zero-shot learning scenario, our approach surpasses state-of-the-art cross-lingual results in terms of macro F1-score on English (0.774), German (0.782), and Spanish (0.734). Our model also outperforms the state-of-the-art monolingual result for German (0.795 macro F1-score).
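
The results above are reported as macro F1-scores. As a reminder of what that metric measures, here is a minimal sketch in plain Python, assuming a binary setup in which each word is labeled 1 (complex) or 0 (non-complex). The function names and the toy labels are illustrative assumptions and are not taken from the paper or from the shared task's evaluation scripts.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1-score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when the class never appears
    # in either the gold labels or the predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class F1-scores."""
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)


if __name__ == "__main__":
    # Toy labels (hypothetical): 1 = complex word, 0 = non-complex word.
    gold = [1, 0, 1, 1, 0, 0]
    pred = [1, 0, 0, 1, 0, 1]
    print(f"macro F1 = {macro_f1(gold, pred):.3f}")
```

Because the macro average weights both classes equally, a model that labels every word as non-complex is penalized even when complex words are rare, which is one plausible reason the metric suits CWI data.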

research
07/07/2020

Cross-lingual Inductive Transfer to Detect Offensive Language

With the growing use of social media and its availability, many instance...
research
10/30/2018

Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model

Neural language models have been shown to achieve an impressive level of...
research
08/03/2022

Cross-lingual Approaches for the Detection of Adverse Drug Reactions in German from a Patient's Perspective

In this work, we present the first corpus for German Adverse Drug Reacti...
research
01/24/2023

Cross-lingual German Biomedical Information Extraction: from Zero-shot to Human-in-the-Loop

This paper presents our project proposal for extracting biomedical infor...
research
11/22/2022

Coreference Resolution through a seq2seq Transition-Based System

Most recent coreference resolution systems use search algorithms over po...
research
06/11/2023

EaSyGuide : ESG Issue Identification Framework leveraging Abilities of Generative Large Language Models

This paper presents our participation in the FinNLP-2023 shared task on ...
research
04/26/2021

Non-Parametric Few-Shot Learning for Word Sense Disambiguation

Word sense disambiguation (WSD) is a long-standing problem in natural la...
