Unsupervised Post-processing of Word Vectors via Conceptor Negation

11/17/2018
by Tianlin Liu, et al.

Word vectors are at the core of many natural language processing tasks. Recently, there has been interest in post-processing word vectors to enrich their semantic information. In this paper, we introduce a novel word vector post-processing technique based on matrix conceptors (Jaeger, 2014), a family of regularized identity maps. More concretely, we propose to use conceptors to suppress the latent features of word vectors that have high variance. The proposed method is purely unsupervised: it does not rely on any corpus or external linguistic database. We evaluate the post-processed word vectors on a battery of intrinsic lexical evaluation tasks, showing that the proposed method consistently outperforms existing state-of-the-art alternatives. We also show that post-processed word vectors can be used for the downstream natural language processing task of dialogue state tracking, yielding improved results in different dialogue domains.
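The abstract does not spell out the computation, so the following is only a minimal sketch of how conceptor negation might look, assuming the standard conceptor definition C = R (R + α⁻² I)⁻¹ from the conceptor literature, where R is the correlation matrix of the word vectors and ¬C = I − C. The function name `conceptor_negation`, the default `alpha=2.0`, and the toy data are illustrative assumptions, not values taken from this page.

```python
import numpy as np


def conceptor_negation(X, alpha=2.0):
    """Softly suppress high-variance directions of the rows of X.

    X     : (n_words, dim) array of word vectors.
    alpha : aperture hyperparameter (assumed default, not from the source).
    Returns an array of the same shape as X.
    """
    n, d = X.shape
    R = X.T @ X / n                      # (dim, dim) correlation matrix
    reg = R + alpha ** -2 * np.eye(d)    # regularized correlation matrix
    # C = R (R + alpha^-2 I)^-1. R and reg are symmetric and commute,
    # so solving reg @ C = R gives the same matrix without an explicit inverse.
    C = np.linalg.solve(reg, R)
    not_C = np.eye(d) - C                # negated conceptor
    return X @ not_C.T                   # apply not_C to every word vector


# Toy usage: the first coordinate has much larger variance than the rest.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4)) * np.array([10.0, 1.0, 1.0, 1.0])
Y = conceptor_negation(X)
```

Along a direction with variance s, the negated conceptor scales by α⁻² / (s + α⁻²), so directions with large variance are shrunk the most while low-variance directions are left nearly intact. No corpus or lexical resource is needed, only the vectors themselves.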

research
05/08/2018

Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources

Word vector specialisation (also known as retrofitting) is a portable, l...
research
11/24/2019

Causally Denoise Word Embeddings Using Half-Sibling Regression

Distributional representations of words, also known as word vectors, hav...
research
03/07/2017

Building a Syllable Database to Solve the Problem of Khmer Word Segmentation

Word segmentation is a basic problem in natural language processing. Wit...
research
08/20/2018

Post-Processing of Word Representations via Variance Normalization and Dynamic Embedding

Although embedded vector representations of words offer impressive perfo...
research
05/27/2019

An Empirical Study on Post-processing Methods for Word Embeddings

Word embeddings learnt from large corpora have been adopted in various a...
research
03/02/2016

Counter-fitting Word Vectors to Linguistic Constraints

In this work, we present a novel counter-fitting method which injects an...
research
11/16/2019

AttaCut: A Fast and Accurate Neural Thai Word Segmenter

Word segmentation is a fundamental pre-processing step for Thai Natural ...
