RUSSE: The First Workshop on Russian Semantic Similarity

03/15/2018
by Alexander Panchenko, et al.

This paper gives an overview of the Russian Semantic Similarity Evaluation (RUSSE) shared task held in conjunction with the Dialogue 2015 conference. Many comparative studies of semantic similarity measures exist, yet no such analysis had been performed for the Russian language. The problem is all the more interesting for Russian because features such as rich morphology and free word order make it significantly different from English, German, and other well-studied languages. We attempt to bridge this gap by proposing a shared task on the semantic similarity of Russian nouns. Our key contribution is an evaluation methodology based on four novel benchmark datasets for the Russian language. Our analysis of the 105 submissions from 19 teams reveals that approaches that are successful for English, such as distributional and skip-gram models, are directly applicable to Russian as well. On the one hand, the best results in the contest were obtained by sophisticated supervised models that combine evidence from different sources. On the other hand, completely unsupervised approaches, such as a skip-gram model estimated on a large-scale corpus, were able to score among the top 5 systems.
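As a rough illustration of how a similarity measure can be scored against a benchmark of human judgements, the sketch below compares cosine similarities of word vectors with gold scores using Spearman rank correlation. The abstract does not name the metrics used in RUSSE, so the choice of Spearman correlation here is an assumption, and the vectors, word pairs, and gold scores are made-up toy values rather than data from the shared task.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranks(values):
    """1-based ranks of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy embeddings for four Russian nouns (not real model output).
VECTORS = {
    "кошка": [0.9, 0.1, 0.0],       # cat
    "собака": [0.8, 0.2, 0.1],      # dog
    "машина": [0.0, 0.9, 0.4],      # car
    "автомобиль": [0.1, 0.8, 0.5],  # automobile
}

# Hypothetical gold standard: (word1, word2, human similarity score).
GOLD = [
    ("кошка", "собака", 0.8),
    ("машина", "автомобиль", 1.0),
    ("кошка", "машина", 0.1),
    ("собака", "автомобиль", 0.2),
]

def evaluate(vectors, gold):
    """Correlate model similarities with human scores over the gold pairs."""
    predicted = [cosine(vectors[a], vectors[b]) for a, b, _ in gold]
    human = [score for _, _, score in gold]
    return spearman(predicted, human)

if __name__ == "__main__":
    print(f"Spearman correlation on toy data: {evaluate(VECTORS, GOLD):.3f}")
```

In a real setting the toy dictionary would be replaced by vectors from a trained model, for example a skip-gram model, and the gold list by one of the benchmark datasets.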

