Reward Engineering for Generating Semi-structured Explanation

09/15/2023
by Jiuzhou Han, et al.

A semi-structured explanation depicts the implicit process of a reasoner with an explicit representation. It highlights how the information available in a specific query is supplemented with information the reasoner produces from its internal weights to generate an answer. Despite recent improvements in the generative capabilities of language models, producing structured explanations to verify a model's true reasoning capabilities remains a challenge. The issue is particularly pronounced for not-so-large LMs, because the reasoner is expected to couple a sequential answer with a structured explanation that embodies both the correct presentation and the correct reasoning process. In this work, we first underscore the limitations of supervised fine-tuning (SFT) in tackling this challenge, and then introduce a carefully crafted reward engineering method in reinforcement learning (RL) to better address it. We investigate multiple reward aggregation methods and provide a detailed discussion that sheds light on the promising potential of RL for future research. On two semi-structured explanation generation benchmarks (ExplaGraph and COPA-SSE), our proposed reward achieves new state-of-the-art results.


