Knowledge Graph Question Answering Leaderboard: A Community Resource to Prevent a Replication Crisis

01/20/2022
by Aleksandr Perevalov et al.

Data-driven systems need to be evaluated to establish trust in the scientific approach and its applicability. This is particularly true for Knowledge Graph (KG) Question Answering (QA), where complex data structures are made accessible via natural-language interfaces. Evaluating the capabilities of these systems has driven the community for more than ten years and has led to several KGQA benchmark datasets. However, comparing different approaches remains cumbersome: the lack of existing, curated leaderboards prevents a global view of the research field and may inject mistrust into reported results. In particular, the latest and most widely used datasets in the KGQA community, LC-QuAD and QALD, lack central and up-to-date points of trust. In this paper, we survey and analyze a wide range of evaluation results, covering 100 publications and 98 systems from the last decade. We provide a new central and open leaderboard for any KGQA benchmark dataset as a focal point for the community: https://kgqa.github.io/leaderboard. Our analysis highlights existing problems in the evaluation of KGQA systems and points to possible improvements for future evaluations.
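Comparing KGQA systems across publications requires agreeing on the metric being reported. A minimal sketch of the macro-averaged precision/recall/F1 scheme commonly used on KGQA benchmarks such as QALD is shown below; the function names are illustrative, not taken from the paper or any specific benchmark toolkit.

```python
def prf1(gold, pred):
    """Precision, recall, F1 for a single question, where gold and pred
    are the sets of expected and predicted KG answers."""
    gold, pred = set(gold), set(pred)
    if not gold and not pred:
        return 1.0, 1.0, 1.0  # empty gold correctly answered with empty prediction
    if not gold or not pred:
        return 0.0, 0.0, 0.0
    tp = len(gold & pred)          # answers both returned and expected
    p = tp / len(pred)             # precision over returned answers
    r = tp / len(gold)             # recall over expected answers
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def macro_f1(pairs):
    """Average per-question F1 over a list of (gold, predicted) pairs."""
    scores = [prf1(g, p)[2] for g, p in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

Even this simple scheme leaves room for divergence (e.g., how unanswered questions or empty gold sets are scored), which is one reason reported numbers for the same dataset are often not directly comparable.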

