Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization

09/11/2018
by   Edoardo Maria Ponti, et al.

Semantic specialization is the process of fine-tuning pre-trained distributional word vectors using external lexical knowledge (e.g., WordNet) to accentuate a particular semantic relation in the specialized vector space. While post-processing specialization methods are applicable to arbitrary distributional vectors, they are limited to updating only the vectors of words occurring in external lexicons (i.e., seen words), leaving the vectors of all other words unchanged. We propose a novel approach to specializing the full distributional vocabulary. Our adversarial post-specialization method propagates the external lexical knowledge to the full distributional space. We exploit words seen in the resources as training examples for learning a global specialization function. This function is learned by combining a standard L2-distance loss with an adversarial loss: the adversarial component produces more realistic output vectors. We show the effectiveness and robustness of the proposed method across three languages and on three tasks: word similarity, dialog state tracking, and lexical simplification. We report consistent improvements over distributional word vectors and vectors specialized by other state-of-the-art specialization frameworks. Finally, we also propose a cross-lingual transfer method for zero-shot specialization which successfully specializes a full target distributional space without any lexical knowledge in the target language and without any bilingual data.
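The core idea of propagating lexical knowledge from seen to unseen words can be illustrated with a minimal numpy sketch. All names, dimensions, and values below are illustrative assumptions (the paper uses a neural generator trained with an L2 plus adversarial objective; here a least-squares linear map stands in for the generator, and the discriminator scores are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not from the paper): "seen" words have both
# a distributional vector x and a specialized vector y from a lexicon.
d = 5
n_seen, n_unseen = 200, 3
X_seen = rng.normal(size=(n_seen, d))
W_true = rng.normal(size=(d, d))
Y_seen = X_seen @ W_true + 0.01 * rng.normal(size=(n_seen, d))

# Learn a global specialization function from seen words (here a linear
# map fit by least squares, standing in for the paper's generator network).
W, *_ = np.linalg.lstsq(X_seen, Y_seen, rcond=None)

# Propagate: apply the learned function to unseen words' vectors,
# specializing the full distributional vocabulary.
X_unseen = rng.normal(size=(n_unseen, d))
Y_unseen_pred = X_unseen @ W

def combined_loss(pred, target, d_scores, lam=0.1):
    """Sketch of the combined objective: L2 distance between predicted
    and gold specialized vectors, plus a generator-side adversarial term.

    d_scores: discriminator probabilities that each predicted vector is
    a 'real' specialized vector (placeholder values in this sketch).
    """
    l2 = np.mean(np.sum((pred - target) ** 2, axis=1))
    adv = -np.mean(np.log(d_scores + 1e-9))  # encourage fooling the discriminator
    return l2 + lam * adv

d_scores = np.full(n_seen, 0.5)  # hypothetical discriminator outputs
loss = combined_loss(X_seen @ W, Y_seen, d_scores)
```

In the paper's full method the map and the discriminator are trained jointly, so the adversarial term pushes the generator's outputs toward the distribution of genuine specialized vectors rather than just minimizing pointwise distance.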


Related research

05/08/2018
Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources
Word vector specialisation (also known as retrofitting) is a portable, l...

06/01/2017
Semantic Specialisation of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints
We present Attract-Repel, an algorithm for improving the semantic qualit...

10/17/2017
Specialising Word Vectors for Lexical Entailment
We present LEAR (Lexical Entailment Attract-Repel), a novel post-process...

11/15/2022
SexWEs: Domain-Aware Word Embeddings via Cross-lingual Semantic Specialisation for Chinese Sexism Detection in Social Media
The goal of sexism detection is to mitigate negative online content targ...

12/20/2014
Improving zero-shot learning by mitigating the hubness problem
The zero-shot paradigm exploits vector-based word representations extrac...

11/12/2018
Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces
Word embedding techniques heavily rely on the abundance of training data...

06/01/2017
Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules
Morphologically rich languages accentuate two properties of distribution...
