Multilingual Compositional Wikidata Questions

08/07/2021
by Ruixiang Cui, et al.

Semantic parsing allows humans to leverage vast knowledge resources through natural interaction. However, parsers are mostly designed for and evaluated on English resources, such as CFQ (Keysers et al., 2020), the current standard benchmark based on English data generated from grammar rules and oriented towards Freebase, an outdated knowledge base. We propose a method for creating a multilingual, parallel dataset of question-query pairs, grounded in Wikidata, and introduce such a dataset called Compositional Wikidata Questions (CWQ). We utilize this data to train and evaluate semantic parsers for Hebrew, Kannada, Chinese and English, to better understand the current strengths and weaknesses of multilingual semantic parsing. Experiments on zero-shot cross-lingual transfer demonstrate that models fail to generate valid queries even with pretrained multilingual encoders. Our methodology, dataset and results will facilitate future research on semantic parsing in more realistic and diverse settings than has been possible with existing resources.


