The BEA 2023 Shared Task on Generating AI Teacher Responses in Educational Dialogues

06/12/2023
by Anaïs Tack, et al.

This paper describes the results of the first shared task on the generation of teacher responses in educational dialogues. The goal of the task was to benchmark the ability of generative language models to act as AI teachers, replying to a student in a teacher-student dialogue. Eight teams participated in the competition, which was hosted on CodaLab. They experimented with a wide variety of state-of-the-art models, including Alpaca, Bloom, DialoGPT, DistilGPT-2, Flan-T5, GPT-2, GPT-3, GPT-4, LLaMA, OPT-2.7B, and T5-base. Their submissions were scored automatically with BERTScore and DialogRPT metrics, and the top three were then manually evaluated for pedagogical ability following Tack and Piech (2022). The NAISTeacher system, which ranked first in both automated and human evaluation, generated responses with GPT-3.5 using an ensemble of prompts and a DialogRPT-based ranking of responses for given dialogue contexts. Despite the promising achievements of the participating teams, the results also highlight the need for evaluation metrics better suited to educational contexts.
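The generate-then-rank idea mentioned for the top system can be illustrated with a minimal sketch. This is not the NAISTeacher implementation: the candidate responses are hard-coded rather than produced by GPT-3.5 prompts, and `toy_score` is a hypothetical word-overlap stand-in for a learned ranker such as DialogRPT. The function names `rank_candidates` and `pick_best` are likewise assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def toy_score(context: str, response: str) -> float:
    """Hypothetical stand-in for a learned ranker (NOT DialogRPT).

    Returns the fraction of response words that also occur in the context.
    """
    context_words = set(context.lower().split())
    response_words = response.lower().split()
    if not response_words:
        return 0.0
    overlap = sum(1 for word in response_words if word in context_words)
    return overlap / len(response_words)


def rank_candidates(
    context: str,
    candidates: Iterable[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[float, str]]:
    """Score every candidate against the context, highest score first."""
    scored = [(score_fn(context, candidate), candidate) for candidate in candidates]
    # sorted() is stable, so tied candidates keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def pick_best(
    context: str,
    candidates: Iterable[str],
    score_fn: Callable[[str, str], float] = toy_score,
) -> str:
    """Return the top-ranked candidate response for a dialogue context."""
    ranked = rank_candidates(context, candidates, score_fn)
    if not ranked:
        raise ValueError("at least one candidate response is required")
    return ranked[0][1]


if __name__ == "__main__":
    # Candidates would normally come from several different prompts.
    context = "student: what does borrow mean"
    candidates = ["Good question!", "borrow means to take something"]
    print(pick_best(context, candidates))
```

Passing a different `score_fn` swaps in another ranker without changing the selection logic, which is the point of keeping scoring and selection separate in this sketch.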

research
06/08/2023

The ADAIO System at the BEA-2023 Shared Task on Generating AI Teacher Responses in Educational Dialogues

This paper presents the ADAIO team's system entry in the Building Educat...
research
05/16/2022

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

How can we test whether state-of-the-art generative models, such as Blen...
research
07/06/2023

Covering Uncommon Ground: Gap-Focused Question Generation for Answer Assessment

Human communication often involves information gaps between the interloc...
research
07/09/2023

Assessing the efficacy of large language models in generating accurate teacher responses

(Tack et al., 2023) organized the shared task hosted by the 18th Worksho...
research
05/09/2023

Exploring the Efficacy of ChatGPT in Analyzing Student Teamwork Feedback with an Existing Taxonomy

Teamwork is a critical component of many academic and professional setti...
research
04/14/2023

Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games

Language models pre-trained on large self-supervised corpora, followed b...
research
08/17/2022

Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides

Lecture slide presentations, a sequence of pages that contain text and f...
