Improve Lexicon-based Word Embeddings By Word Sense Disambiguation

07/24/2017
by Yuanzhi Ke, et al.

Several prior works learn a lexicon jointly with a corpus to improve word embeddings. However, they either model the lexicon separately while updating the neural networks for both the corpus and the lexicon with the same likelihood, or minimize the distance between all synonym pairs in the lexicon. Such methods do not account for how the corpus and the lexicon are related and how they differ, and so may not be optimal. In this paper, we propose a novel method that takes both the relatedness and the differences into account. It trains word embeddings by learning from the corpus to predict a word and its corresponding synonym from the context at the same time. For polysemous words, we use a word sense disambiguation filter to eliminate synonyms whose meanings do not fit the context. To evaluate the proposed method, we compare word embeddings trained by our model against control groups without the filter or the lexicon, and against prior works, on word similarity tasks and a text classification task. The experimental results show that the proposed model provides better embeddings for polysemous words and improves text classification performance.
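The idea of predicting a word together with its context-appropriate synonyms can be illustrated with a small sketch. This is not the authors' model: the abstract does not specify how the filter works, so the code below substitutes a simple cue-word overlap rule as a stand-in for word sense disambiguation. The toy lexicon `LEXICON` and the function names `filter_synonyms` and `training_targets` are hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (an assumption, not from the paper):
# each word maps to a list of senses, and each sense has cue words and synonyms.
LEXICON: Dict[str, List[dict]] = {
    "bank": [
        {"cues": {"money", "loan", "deposit"}, "synonyms": ["depository"]},
        {"cues": {"river", "water", "shore"}, "synonyms": ["riverside"]},
    ],
    "loan": [{"cues": set(), "synonyms": ["credit"]}],
}


def filter_synonyms(word: str, context: List[str]) -> List[str]:
    """Return the synonyms of `word` that plausibly fit `context`.

    Stand-in disambiguation rule: for a polysemous word, keep only the
    synonyms of the sense(s) whose cue words overlap the context most.
    """
    senses = LEXICON.get(word, [])
    if not senses:
        return []  # word is not in the lexicon
    if len(senses) == 1:
        return list(senses[0]["synonyms"])  # monosemous: nothing to filter
    ctx = set(context)
    scores = [len(sense["cues"] & ctx) for sense in senses]
    best = max(scores)
    if best == 0:
        return []  # no evidence for any sense: use no lexicon targets
    return [
        syn
        for sense, score in zip(senses, scores)
        if score == best
        for syn in sense["synonyms"]
    ]


def training_targets(tokens: List[str], window: int = 2) -> List[Tuple[List[str], List[str]]]:
    """Pair each context window with the word and its filtered synonyms."""
    examples = []
    for i, word in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        targets = [word] + filter_synonyms(word, context)
        examples.append((context, targets))
    return examples


examples = training_targets(["deposit", "money", "bank", "loan"])
```

In a real system, each `(context, targets)` pair would feed a prediction objective over both the corpus word and the retained synonyms. How those two signals are weighted against each other is the part the paper addresses, and it is not modeled here.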


research
06/15/2016

Learning Word Sense Embeddings from Word Sense Definitions

Word embeddings play a significant role in many modern NLP systems. Sinc...
research
02/23/2023

Deep learning model for Mongolian Citizens Feedback Analysis using Word Vector Embeddings

A large amount of feedback was collected over the years. Many feedback a...
research
03/09/2017

What can you do with a rock? Affordance extraction via word embeddings

Autonomous agents must often detect affordances: the set of behaviors en...
research
07/06/2020

Reflection-based Word Attribute Transfer

Word embeddings, which often represent such analogic relations as king -...
research
04/03/2022

A Part-of-Speech Tagger for Yiddish: First Steps in Tagging the Yiddish Book Center Corpus

We describe the construction and evaluation of a part-of-speech tagger f...
research
12/02/2016

Alleviating Overfitting for Polysemous Words for Word Representation Estimation Using Lexicons

Though there are some works on improving distributed word representation...
research
02/23/2019

Fixed-Size Ordinally Forgetting Encoding Based Word Sense Disambiguation

In this paper, we present our method of using fixed-size ordinally forge...
