CLUSE: Cross-Lingual Unsupervised Sense Embeddings

09/15/2018
by Ta-Chung Chi, et al.
This paper proposes a modularized sense induction and representation learning model that jointly learns bilingual sense embeddings that align well in the vector space. The model exploits the cross-lingual signal in an English-Chinese parallel corpus to capture the collocation and distributed characteristics of the language pair. To verify the quality of the monolingual sense embeddings, the model is evaluated on the Stanford Contextual Word Similarity (SCWS) dataset. In addition, we introduce Bilingual Contextual Word Similarity (BCWS), a large, high-quality dataset for evaluating cross-lingual sense embeddings, which is the first attempt to measure whether the learned embeddings are indeed well aligned in the vector space. The proposed approach shows superior quality of sense embeddings when evaluated in both monolingual and bilingual spaces.

research
03/11/2021

Towards Multi-Sense Cross-Lingual Alignment of Contextual Embeddings

Cross-lingual word embeddings (CLWE) have been proven useful in many cro...
research
11/09/2016

A Comparison of Word Embeddings for English and Cross-Lingual Chinese Word Sense Disambiguation

Word embeddings are now ubiquitous forms of word representation in natur...
research
10/21/2018

BCWS: Bilingual Contextual Word Similarity

This paper introduces the first dataset for evaluating English-Chinese B...
research
03/20/2019

Distributed Vector Representations of Folksong Motifs

This article presents a distributed vector representation model for lear...
research
04/07/2020

Locality Preserving Loss to Align Vector Spaces

We present a locality preserving loss (LPL) that improves the alignment b...
research
10/24/2019

Wasserstein distances for evaluating cross-lingual embeddings

Word embeddings are high dimensional vector representations of words tha...
research
03/30/2016

Bilingual Learning of Multi-sense Embeddings with Discrete Autoencoders

We present an approach to learning multi-sense word embeddings relying b...
