VANiLLa : Verbalized Answers in Natural Language at Large Scale

05/24/2021
by Debanjali Biswas, et al.

In recent years, there have been significant developments in Question Answering over Knowledge Graphs (KGQA). Despite these advances, current KGQA datasets provide answers only as the direct output of the formal query, rather than as full sentences that incorporate the question context. To produce coherent answer sentences that reuse the question's vocabulary, template-based verbalizations are usually employed, which in turn require extensive expert intervention. This makes way for machine learning approaches; however, there is a scarcity of datasets that empower machine learning models in this area. Hence, we provide the VANiLLa dataset, which aims to reduce this gap by offering answers as natural language sentences. The answer sentences in this dataset are syntactically and semantically closer to the question than to the triple fact. Our dataset consists of over 100k simple questions adapted from the CSQA and SimpleQuestionsWikidata datasets and generated using a semi-automatic framework. We also present results of training multiple baseline models, adapted from current state-of-the-art Natural Language Generation (NLG) architectures, on our dataset. We believe that this dataset will allow researchers to focus on finding suitable methodologies and architectures for answer verbalization.


