How much does a word weigh? Weighting word embeddings for word sense induction

05/23/2018
by Nikolay Arefyev, et al.

This paper describes our participation in RUSSE'2018, the first shared task on word sense induction and disambiguation for the Russian language (Panchenko et al., 2018). For each of several dozen ambiguous words, participants were asked to group the text fragments containing that word according to its senses. The senses were not provided beforehand, hence the "induction" part of the task. For instance, given the word "bank" and a set of text fragments (also known as "contexts") in which it occurs, e.g. "bank is a financial institution that accepts deposits" and "river bank is a slope beside a body of water", a participant had to cluster the contexts into a number of clusters that was not known in advance, corresponding in this case to the "company" and "area" senses of "bank". The organizers proposed three evaluation datasets of varying complexity and text genre, based respectively on Wikipedia texts, Web pages, and a dictionary of the Russian language. We present two experiments, one positive and one negative: the first clusters contexts represented as weighted averages of word embeddings, and the second relies on machine translation with two state-of-the-art production neural machine translation systems. Among 18 participating teams, our team achieved the second-best result on two datasets and the third-best result on the remaining one. We substantially outperformed competitive state-of-the-art baselines from previous years that were based on sense embeddings.
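To make the idea of clustering contexts by a weighted average of word embeddings concrete, here is a minimal sketch in Python. It is not the system described in the paper: the abstract does not state the weighting scheme or the clustering algorithm, so the toy 2-dimensional `EMBEDDINGS`, the per-word `WEIGHTS`, the greedy threshold clustering, and the `threshold` value are all assumptions made for illustration. The greedy procedure is used only because it does not need the number of clusters in advance.

```python
import math

# Toy 2-dimensional embeddings, invented for illustration only.
EMBEDDINGS = {
    "financial": [1.0, 0.0],
    "deposits": [0.9, 0.1],
    "institution": [0.8, 0.2],
    "river": [0.0, 1.0],
    "water": [0.1, 0.9],
    "slope": [0.2, 0.8],
    "is": [0.5, 0.5],
    "a": [0.5, 0.5],
}

# Hypothetical weights: uninformative words count less; the default is 1.0.
WEIGHTS = {"is": 0.1, "a": 0.1}


def context_vector(tokens, target):
    """Weighted average of the embeddings of the context words.

    The target word itself and out-of-vocabulary words are skipped.
    Returns None if no context word has an embedding.
    """
    total = [0.0, 0.0]
    weight_sum = 0.0
    for tok in tokens:
        if tok == target or tok not in EMBEDDINGS:
            continue
        w = WEIGHTS.get(tok, 1.0)
        total = [t + w * x for t, x in zip(total, EMBEDDINGS[tok])]
        weight_sum += w
    if weight_sum == 0.0:
        return None
    return [t / weight_sum for t in total]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def greedy_cluster(vectors, threshold=0.7):
    """Assign each vector to the most similar existing centroid, or open
    a new cluster when no centroid reaches the similarity threshold."""
    centroids, counts, labels = [], [], []
    for vec in vectors:
        best, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            centroids.append(list(vec))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            n = counts[best]
            # Update the running mean of the chosen cluster.
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], vec)]
            counts[best] = n + 1
            labels.append(best)
    return labels


contexts = [
    "bank is a financial institution that accepts deposits",
    "river bank is a slope beside a body of water",
    "the bank accepts deposits",
]
vectors = [context_vector(c.split(), "bank") for c in contexts]
labels = greedy_cluster(vectors)
print(labels)
```

In this toy setup the first and third contexts share a label and the second gets its own, mirroring the "company" and "area" senses of "bank". With real embeddings, the choice of weights and of the clustering method would need to be tuned on held-out data.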


