Sense Embedding Learning for Word Sense Induction

06/17/2016
by   Linfeng Song, et al.

Conventional word sense induction (WSI) methods usually represent each instance with discrete linguistic features or co-occurrence features, and train a separate model for each polysemous word. In this work, we propose to learn sense embeddings for the WSI task. In the training stage, our method induces several sense centroids (embeddings) for each polysemous word. In the testing stage, it represents each instance as a contextual vector and induces its sense by finding the nearest sense centroid in the embedding space. The advantages of our method are that (1) distributed sense vectors serve as the knowledge representations; they are trained discriminatively and typically perform better than traditional count-based distributional models, and (2) a general model for the whole vocabulary is jointly trained to induce sense centroids under the multitask learning framework. Evaluated on the SemEval-2010 WSI dataset, our method outperforms all participants and most of the recent state-of-the-art methods. We further verify the two advantages by comparing against carefully designed baselines.
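The testing stage described above reduces to a nearest-centroid lookup: average the embeddings of an instance's context words, then pick the sense whose centroid has the highest cosine similarity. The sketch below illustrates that step only; the toy vectors, the sense labels (`bank#1_finance`, `bank#2_river`), and the simple averaging are illustrative assumptions, not the paper's trained model.

```python
import numpy as np

def context_vector(context_words, word_vecs):
    """Represent an instance as the average embedding of its known context words."""
    vecs = [word_vecs[w] for w in context_words if w in word_vecs]
    return np.mean(vecs, axis=0)

def induce_sense(context_words, sense_centroids, word_vecs):
    """Induce the sense whose centroid is nearest to the contextual vector (cosine)."""
    v = context_vector(context_words, word_vecs)
    sims = {sense: np.dot(v, c) / (np.linalg.norm(v) * np.linalg.norm(c))
            for sense, c in sense_centroids.items()}
    return max(sims, key=sims.get)

# Toy 2-d embeddings for illustration only (not trained vectors).
word_vecs = {
    "river": np.array([1.0, 0.1]),
    "water": np.array([0.9, 0.2]),
    "money": np.array([0.1, 1.0]),
}
# Two hypothetical sense centroids for the polysemous word "bank".
sense_centroids = {
    "bank#1_finance": np.array([0.0, 1.0]),
    "bank#2_river":   np.array([1.0, 0.0]),
}
print(induce_sense(["river", "water"], sense_centroids, word_vecs))  # bank#2_river
```

In the paper's setting the centroids come from joint training over the whole vocabulary; only this final assignment step is the simple geometric search shown here.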


research · 04/20/2023 · Word Sense Induction with Knowledge Distillation from BERT
Pre-trained contextual language models are ubiquitously employed for lan...

research · 05/06/2018 · Russian word sense induction by clustering averaged word embeddings
The paper reports our participation in the shared task on word sense ind...

research · 04/09/2018 · Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings
Word sense induction (WSI), which addresses polysemy by unsupervised dis...

research · 03/15/2018 · RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language
The paper describes the results of the first shared task on word sense i...

research · 07/06/2017 · A Simple Approach to Learn Polysemous Word Embeddings
Many NLP applications require disambiguating polysemous words. Existing ...

research · 02/05/2017 · Prepositions in Context
Prepositions are highly polysemous, and their variegated senses encode s...

research · 10/11/2022 · Word Sense Induction with Hierarchical Clustering and Mutual Information Maximization
Word sense induction (WSI) is a difficult problem in natural language pr...
