MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems

05/23/2023
by Jakub Macina, et al.

Although automatic dialogue tutors hold great potential for making education personalized and more accessible, research on such systems has been hampered by a lack of sufficiently large and high-quality datasets. Collecting such datasets remains challenging, as recording tutoring sessions raises privacy concerns and crowdsourcing leads to insufficient data quality. To address this problem, we propose a framework to semi-synthetically generate such dialogues by pairing real teachers with a large language model (LLM) scaffolded to represent common student errors. In this paper, we describe our ongoing efforts to use this framework to collect MathDial, a dataset of currently ca. 1.5k tutoring dialogues grounded in multi-step math word problems. We show that our dataset exhibits rich pedagogical properties, focusing on guiding students with sense-making questions that let them explore problems. Moreover, we outline how MathDial and its grounding annotations can be used to finetune language models to be more effective tutors (and not just solvers), and we highlight remaining challenges that need to be addressed by the research community. We will release our dataset publicly to foster research in this socially important area of NLP.


