QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers

01/31/2022
by   Aleksandr Perevalov, et al.

The ability to provide the same experience to different user groups (i.e., accessibility) is one of the most important characteristics of Web-based systems. The same is true for Knowledge Graph Question Answering (KGQA) systems, which provide access to Semantic Web data via a natural language interface. While following our research agenda on the multilingual aspect of accessibility of KGQA systems, we identified several ongoing challenges. One of them is the lack of multilingual KGQA benchmarks. In this work, we extend one of the most popular KGQA benchmarks, QALD-9, by introducing high-quality translations of its questions into 8 languages provided by native speakers, and by transferring the SPARQL queries of QALD-9 from DBpedia to Wikidata, so that the usability and relevance of the dataset are strongly increased. Five of the languages (Armenian, Ukrainian, Lithuanian, Bashkir, and Belarusian) have, to the best of our knowledge, never been considered in the KGQA research community before. The latter two of these languages are considered "endangered" by UNESCO. We call the extended dataset QALD-9-plus and have made it available online at https://github.com/Perevalov/qald_9_plus.

research
03/02/2018

Towards a Question Answering System over the Semantic Web

Thanks to the development of the Semantic Web, a lot of new structured d...
research
10/04/2022

Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering

We introduce Mintaka, a complex, natural, and multilingual dataset desig...
research
11/05/2020

EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering

We propose EXAMS – a new benchmark dataset for cross-lingual and multili...
research
07/02/2020

Project PIAF: Building a Native French Question-Answering Dataset

Motivated by the lack of data for non-English languages, in particular f...
research
05/13/2022

Knowledge Graph Question Answering Datasets and Their Generalizability: Are They Enough for Future Research?

Existing approaches on Question Answering over Knowledge Graphs (KGQA) h...
research
09/20/2021

Assessing the quality of sources in Wikidata across languages: a hybrid approach

Wikidata is one of the most important sources of structured data on the ...
research
01/20/2022

Knowledge Graph Question Answering Leaderboard: A Community Resource to Prevent a Replication Crisis

Data-driven systems need to be evaluated to establish trust in the scien...
