Why Not Simply Translate? A First Swedish Evaluation Benchmark for Semantic Similarity

09/07/2020
by Tim Isbister, et al.

This paper presents the first Swedish evaluation benchmark for textual semantic similarity. The benchmark is compiled by simply running the English STS-B dataset through the Google machine translation API. We discuss potential problems with using such a simple approach to compile a Swedish evaluation benchmark, including translation errors, vocabulary variation, and productive compounding. Despite some obvious problems with the resulting dataset, we use the benchmark to compare most of the currently available Swedish text representations, demonstrating that native models outperform multilingual ones and that a simple bag-of-words representation performs remarkably well.
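As a rough illustration of what evaluating a bag-of-words representation on such a benchmark can look like, the minimal sketch below scores sentence pairs by cosine similarity over word counts and correlates the scores with gold similarity ratings. Everything in it is an assumption made for illustration: the function names (`bow_vector`, `cosine`, `pearson`, `evaluate`), the whitespace tokenizer, the choice of Pearson correlation as the metric, and the three toy Swedish pairs with their ratings, none of which are taken from the paper or from the translated dataset.

```python
import math
from collections import Counter


def bow_vector(sentence):
    """Bag-of-words vector: lowercase, split on whitespace, count tokens."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def evaluate(pairs):
    """Correlate predicted similarities with gold ratings.

    `pairs` is a list of (sentence_1, sentence_2, gold_score) tuples.
    """
    predicted = [cosine(bow_vector(s1), bow_vector(s2)) for s1, s2, _ in pairs]
    gold = [score for _, _, score in pairs]
    return pearson(predicted, gold)


# Hypothetical toy pairs with invented ratings on a 0-5 scale.
toy_pairs = [
    ("en man spelar gitarr", "en man spelar gitarr", 5.0),
    ("en man spelar gitarr", "en kvinna spelar piano", 2.0),
    ("en hund springer i parken", "katten sover på soffan", 0.0),
]

if __name__ == "__main__":
    print(f"Pearson correlation on toy pairs: {evaluate(toy_pairs):.3f}")
```

Because the tokenizer splits only on whitespace, a Swedish compound is treated as a single token that shares nothing with its parts, which is one way the productive compounding mentioned above can affect a count-based representation.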


research
05/16/2018

Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness Using Machine Translation

This paper provides a comparative analysis of the performance of four st...
research
02/28/2023

An evaluation of Google Translate for Sanskrit to English translation via sentiment and semantic analysis

Google Translate has been prominent for language translation; however, l...
research
06/24/2011

Translation of Pronominal Anaphora between English and Spanish: Discrepancies and Evaluation

This paper evaluates the different tasks carried out in the translation ...
research
09/27/2022

mRobust04: A Multilingual Version of the TREC Robust 2004 Benchmark

Robust 2004 is an information retrieval benchmark whose large number of ...
research
07/31/2017

SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation

Semantic Textual Similarity (STS) measures the meaning similarity of sen...
research
05/27/2011

Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language

This article presents a measure of semantic similarity in an IS-A taxono...
research
05/04/2014

Analysis Tool for UNL-Based Knowledge Representation

The fundamental issue in knowledge representation is to provide a precis...
