ml-mkqa
Progress in cross-lingual modeling depends on challenging, realistic, and diverse evaluation sets. We introduce Multilingual Knowledge Questions and Answers (MKQA), an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). The goal of this dataset is to provide a challenging benchmark for question answering quality across a wide set of languages. Answers are based on a language-independent data representation, making results comparable across languages and independent of language-specific passages. With 26 languages, this dataset supplies the widest range of languages to date for evaluating question answering. We benchmark state-of-the-art extractive question answering baselines trained on Natural Questions, including Multilingual BERT and XLM-RoBERTa, in zero-shot and translation settings. Results indicate this dataset is challenging, especially in low-resource languages.
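Comparing predicted answers against the aligned gold answers is typically done with exact-match and token-level F1 scores. Below is a minimal, simplified sketch of SQuAD-style scoring, assuming whitespace tokenization and basic punctuation stripping; the actual MKQA evaluation script may apply additional language-specific normalization.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text, flags=re.UNICODE)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall over bag-of-words overlap."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Because scoring operates on normalized answer strings rather than language-specific passages, the same metric can be applied uniformly across all 26 languages.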