Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

09/20/2022
by Pan Lu, et al.

When answering a question, humans use the information available across different modalities to synthesize a consistent and complete chain of thought (CoT). This process is normally a black box for deep learning models such as large-scale language models. Recently, science question benchmarks have been used to diagnose the multi-hop reasoning ability and interpretability of AI systems. However, existing datasets either fail to provide annotations for the answers, or are restricted to a text-only modality, small scale, and limited domain diversity. To this end, we present Science Question Answering (SQA), a new benchmark of about 21k multimodal multiple-choice questions covering a diverse set of science topics, with answers annotated with corresponding lectures and explanations. We further design language models that learn to generate lectures and explanations as the chain of thought (CoT), mimicking the multi-hop reasoning process when answering SQA questions. SQA demonstrates the utility of CoT in language models: CoT improves question answering performance by 1.20% in few-shot GPT-3 and 3.99% in fine-tuned UnifiedQA. We also explore the upper bound for models to leverage explanations by feeding them into the input; we observe that this improves the few-shot performance of GPT-3 by 18.96%. Our analysis further shows that language models, similar to humans, benefit from explanations to learn from less data, achieving the same performance with just 40% of the training data.
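The in-context setup described above, where each demonstration's answer is followed by its lecture and explanation as the chain of thought, can be sketched as a simple prompt builder. This is a minimal illustration, not the paper's actual code: the record fields (question, options, answer, lecture, explanation) and the "Answer ... BECAUSE" layout are assumptions about how such a prompt might be assembled.

```python
def build_cot_prompt(examples, query):
    """Format few-shot demonstrations whose answers carry a lecture and
    explanation (the CoT), followed by the query question to be answered."""

    def option_str(options):
        # Label options (A), (B), ... in order.
        return " ".join(f"({chr(65 + i)}) {o}" for i, o in enumerate(options))

    blocks = []
    for ex in examples:
        blocks.append(
            f"Question: {ex['question']}\n"
            f"Options: {option_str(ex['options'])}\n"
            f"Answer: The answer is {ex['answer']}. "
            f"BECAUSE: {ex['lecture']} {ex['explanation']}"
        )
    # The query block ends at "Answer:" so the model completes it.
    blocks.append(
        f"Question: {query['question']}\n"
        f"Options: {option_str(query['options'])}\n"
        f"Answer:"
    )
    return "\n\n".join(blocks)


# Illustrative demonstration and query (invented content, not SQA data).
demo = [{
    "question": "Which property do these objects have in common?",
    "options": ["hard", "soft"],
    "answer": "(B) soft",
    "lecture": "Different objects can share properties.",
    "explanation": "All of the objects listed are soft.",
}]
query = {"question": "Which material is a conductor?",
         "options": ["rubber", "copper"]}

print(build_cot_prompt(demo, query))
```

The resulting string would be sent to a few-shot language model; the "upper bound" experiment mentioned above corresponds to additionally placing the gold explanation in the query block's input rather than leaving it for the model to generate.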

