Towards Multi-Sense Cross-Lingual Alignment of Contextual Embeddings

03/11/2021
by Linlin Liu, et al.

Cross-lingual word embeddings (CLWE) have proven useful in many cross-lingual tasks. However, most existing approaches to learning CLWE, including those based on contextual embeddings, are sense-agnostic. In this work, we propose a novel framework that aligns contextual embeddings at the sense level using only the cross-lingual signal from bilingual dictionaries. We operationalize the framework by first proposing a novel sense-aware cross-entropy loss that models word senses explicitly. Monolingual ELMo and BERT models pretrained with this loss show significant performance improvements on word sense disambiguation tasks. We then propose a sense alignment objective on top of the sense-aware cross-entropy loss for cross-lingual model pretraining, and pretrain cross-lingual models for several language pairs (English to German/Spanish/Japanese/Chinese). Compared with the best baseline results, our cross-lingual models achieve improvements on cross-lingual NER, sentiment classification and XNLI tasks (0.52 on NER).


