Impact of Subword Pooling Strategy on Cross-lingual Event Detection

02/22/2023
by Shantanu Agarwal et al.

Pre-trained multilingual language models (e.g., mBERT, XLM-RoBERTa) have significantly advanced the state of the art for zero-shot cross-lingual information extraction. These language models ubiquitously rely on word segmentation techniques that break a word into smaller constituent subwords. Therefore, all word labeling tasks (e.g., named entity recognition, event detection) necessitate a pooling strategy that takes the subword representations as input and outputs a representation for the entire word. Taking the task of cross-lingual event detection as a motivating example, we show that the choice of pooling strategy can have a significant impact on target-language performance. For example, performance varies by up to 16 absolute F1 points depending on the pooling strategy when training in English and testing in Arabic on the ACE task. We carry out our analysis with five different pooling strategies across nine languages in diverse multilingual datasets. Across configurations, we find that the canonical strategy of taking just the first subword to represent the entire word is usually sub-optimal. On the other hand, we show that attention pooling is robust to language and dataset variations, being either the best or close to the optimal strategy. For reproducibility, we make our code available at https://github.com/isi-boston/ed-pooling.
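To make the contrast concrete, here is a minimal sketch (not the authors' released implementation; the module names, shapes, and the single-word batching are illustrative assumptions) of the two pooling strategies highlighted in the abstract: taking the first subword versus a learned attention pooling over all subword representations of a word.

```python
# Sketch of two subword-pooling strategies for word-level labeling.
# Assumption: each word's subword vectors arrive as a (num_subwords, hidden_dim) tensor.
import torch
import torch.nn as nn


def first_subword_pool(subword_reprs: torch.Tensor) -> torch.Tensor:
    """Represent the word by the vector of its first subword only."""
    return subword_reprs[0]


class AttentionPool(nn.Module):
    """Learned attention pooling: score each subword, softmax the scores,
    and return the weighted sum of subword vectors."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, subword_reprs: torch.Tensor) -> torch.Tensor:
        # subword_reprs: (num_subwords, hidden_dim)
        weights = torch.softmax(self.score(subword_reprs).squeeze(-1), dim=0)
        return weights @ subword_reprs  # (hidden_dim,)


if __name__ == "__main__":
    hidden_dim = 8
    # e.g. a word split into three subwords by the tokenizer
    subwords = torch.randn(3, hidden_dim)
    print(first_subword_pool(subwords).shape)          # torch.Size([8])
    print(AttentionPool(hidden_dim)(subwords).shape)   # torch.Size([8])
```

The word-level vector produced by either strategy would then feed the token-classification head; the paper's finding is that the learned, attention-based aggregation transfers more robustly across languages than the first-subword shortcut.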


