E-KAR: A Benchmark for Rationalizing Natural Language Analogical Reasoning

03/16/2022
by Jiangjie Chen et al.

The ability to recognize analogies is fundamental to human cognition. Existing benchmarks that test word analogy do not reveal the underlying process of analogical reasoning in neural models. Holding the belief that models capable of reasoning should be right for the right reasons, we propose a first-of-its-kind Explainable Knowledge-intensive Analogical Reasoning benchmark (E-KAR). Our benchmark consists of 1,655 problems (in Chinese) and 1,251 problems (in English) sourced from the Civil Service Exams, which require intensive background knowledge to solve. More importantly, we design a free-text explanation scheme to explain whether an analogy should be drawn, and manually annotate an explanation for every question and candidate answer. Empirical results suggest that this benchmark is highly challenging for state-of-the-art models on both the explanation generation and analogical question answering tasks, inviting further research in this area.


