Semantic Specialisation of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints

06/01/2017
by   Nikola Mrkšić, et al.
0

We present Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. Attract-Repel facilitates the use of constraints from mono- and cross-lingual resources, yielding semantically specialised cross-lingual vector spaces. Our evaluation shows that the method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones. The effectiveness of our approach is demonstrated with state-of-the-art results on semantic similarity datasets in six languages. We next show that Attract-Repel-specialised vectors boost performance in the downstream task of dialogue state tracking (DST) across multiple languages. Finally, we show that cross-lingual vector spaces produced by our algorithm facilitate the training of multilingual DST models, which brings further performance improvements.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/11/2021

Training Cross-Lingual embeddings for Setswana and Sepedi

African languages still lag in the advances of Natural Language Processi...
research
03/10/2020

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity

We introduce Multi-SimLex, a large-scale lexical resource and evaluation...
research
04/08/2020

Are All Good Word Vector Spaces Isomorphic?

Existing algorithms for aligning cross-lingual word vector spaces assume...
research
08/29/2019

Translate and Label! An Encoder-Decoder Approach for Cross-lingual Semantic Role Labeling

We propose a Cross-lingual Encoder-Decoder model that simultaneously tra...
research
09/11/2018

Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization

Semantic specialization is the process of fine-tuning pre-trained distri...
research
06/01/2017

Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules

Morphologically rich languages accentuate two properties of distribution...
research
11/16/2017

Addressing Cross-Lingual Word Sense Disambiguation on Low-Density Languages: Application to Persian

We explore the use of unsupervised methods in Cross-Lingual Word Sense D...

Please sign up or login with your details

Forgot password? Click here to reset