SimRelUz: Similarity and Relatedness scores as a Semantic Evaluation dataset for Uzbek language

05/12/2022
by Ulugbek Salaev, et al.

Semantic relatedness between words is a core concept in natural language processing, which makes semantic evaluation an important task. In this paper, we present SimRelUz, a dataset for evaluating semantic models: a collection of similarity and relatedness scores for word pairs in the low-resource Uzbek language. The dataset consists of more than a thousand word pairs, carefully selected based on their morphological features, occurrence frequency, and semantic relation, and annotated by eleven native Uzbek speakers of different ages and genders. We also paid attention to rare words and out-of-vocabulary words, so that the dataset can be used to evaluate the robustness of semantic models thoroughly.
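
As an illustration of how a word-pair dataset of this kind is commonly used, the sketch below scores a model by the Spearman rank correlation between its cosine similarities and the human scores, and reports how many pairs the model could cover. This is a common evaluation protocol for such datasets, not necessarily the procedure used in the paper. The word labels, vectors, and scores are hypothetical placeholders, not data from SimRelUz, and skipping out-of-vocabulary pairs is an assumption made here for simplicity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises if either input has zero variance."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(pairs, vectors):
    """Return (Spearman rho, coverage) for a model on scored word pairs.

    Pairs with an out-of-vocabulary word are skipped (an assumption of this
    sketch); coverage is the fraction of pairs that were actually scored.
    """
    gold, predicted = [], []
    for word1, word2, score in pairs:
        if word1 in vectors and word2 in vectors:
            gold.append(score)
            predicted.append(cosine(vectors[word1], vectors[word2]))
    return spearman(gold, predicted), len(gold) / len(pairs)


# Hypothetical toy model and scored pairs (placeholders, not SimRelUz data).
toy_vectors = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.5, 0.5],
}
toy_pairs = [
    ("a", "b", 9.0),
    ("a", "d", 5.0),
    ("a", "c", 1.0),
    ("a", "zzz", 4.0),  # "zzz" is out of vocabulary for the toy model
]

rho, coverage = evaluate(toy_pairs, toy_vectors)
print(f"Spearman rho: {rho:.3f}, coverage: {coverage:.2f}")
```

Reporting coverage next to the correlation matters for a dataset that deliberately contains rare and out-of-vocabulary words: a model that silently skips many pairs can otherwise look better than it is.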


research
04/08/2019

Word Similarity Datasets for Thai: Construction and Evaluation

Distributional semantics in the form of word embeddings are an essential...
research
04/19/2023

Bridging Natural Language Processing and Psycholinguistics: computationally grounded semantic similarity datasets for Basque and Spanish

We present a computationally-grounded word similarity dataset based on t...
research
04/15/2018

Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness

We present two novel datasets for the low-resource language Vietnamese t...
research
05/15/2016

A Proposal for Linguistic Similarity Datasets Based on Commonality Lists

Similarity is a core notion that is used in psychology and two branches ...
research
06/02/2023

LyricSIM: A novel Dataset and Benchmark for Similarity Detection in Spanish Song LyricS

In this paper, we present a new dataset and benchmark tailored to the ta...
research
11/03/2016

CogALex-V Shared Task: ROOT18

In this paper, we describe ROOT 18, a classifier using the scores of sev...
research
08/01/2015

Separated by an Un-common Language: Towards Judgment Language Informed Vector Space Modeling

A common evaluation practice in the vector space models (VSMs) literatur...
