Conversation Regression Testing: A Design Technique for Prototyping Generalizable Prompt Strategies for Pre-trained Language Models

02/06/2023
by J. D. Zamfirescu-Pereira, et al.

Pre-trained large language models (LLMs) such as GPT-3 can carry on fluent, multi-turn conversations out of the box, making them attractive materials for chatbot design. Further, designers can improve an LLM chatbot's utterances by prepending textual prompts (instructions and examples of desired interactions) to the model's inputs. However, prompt-based improvements can be brittle: designers struggle to understand systematically how a prompt strategy might affect the way subsequent conversations unfold across users. To address this challenge, we introduce the concept of Conversation Regression Testing. Starting from sample conversations with a baseline chatbot, Conversation Regression Testing tracks which conversational errors persist and which are resolved when different prompt strategies are applied. We embody this technique in an interactive design tool, BotDesigner, that lets designers identify archetypal errors across multiple conversations; shows common threads of conversation using a graph visualization; and highlights the effects of prompt changes across bot design iterations. A pilot evaluation demonstrates the usefulness of both the concept of regression testing and the functionalities of BotDesigner for chatbot designers.
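As a rough illustration of the idea described in the abstract, the sketch below replays recorded user turns against a baseline prompt and a candidate prompt and labels each conversation by whether an error persists, is resolved, or is newly introduced. It is not BotDesigner's implementation: `toy_bot`, `has_error`, `replay`, `regression_report`, and the prompt strings are all hypothetical stand-ins, and replaying fixed user turns is a simplifying assumption, since real users would react to the changed bot replies.

```python
from typing import Callable, Dict, List

# Hypothetical bot signature: (prompt, history, user_turn) -> reply.
Bot = Callable[[str, List[str], str], str]


def replay(bot: Bot, prompt: str, user_turns: List[str]) -> List[str]:
    """Feed recorded user turns to the bot and collect its replies."""
    history: List[str] = []
    replies: List[str] = []
    for turn in user_turns:
        reply = bot(prompt, history, turn)
        history += [turn, reply]
        replies.append(reply)
    return replies


def regression_report(
    bot: Bot,
    baseline_prompt: str,
    candidate_prompt: str,
    conversations: Dict[str, List[str]],
    has_error: Callable[[str], bool],
) -> Dict[str, str]:
    """Label each conversation by comparing errors before and after the prompt change."""
    report: Dict[str, str] = {}
    for name, user_turns in conversations.items():
        before = any(has_error(r) for r in replay(bot, baseline_prompt, user_turns))
        after = any(has_error(r) for r in replay(bot, candidate_prompt, user_turns))
        if before and after:
            report[name] = "persists"
        elif before:
            report[name] = "resolved"
        elif after:
            report[name] = "introduced"
        else:
            report[name] = "clean"
    return report


def toy_bot(prompt: str, history: List[str], user_turn: str) -> str:
    """Deterministic stand-in for an LLM call (assumption, for illustration only)."""
    wants_confirmation = "confirm" in user_turn.lower()
    told_to_confirm = "always confirm" in prompt.lower()
    if wants_confirmation and not told_to_confirm:
        return "Moving on to the next step."  # the error: request ignored
    return "Okay, " + user_turn


def has_error(reply: str) -> bool:
    """Hypothetical error detector: flags replies that skip the user's request."""
    return "Moving on" in reply


conversations = {
    "order": ["Please confirm my order"],
    "greeting": ["Hello"],
}
report = regression_report(
    toy_bot,
    "You are a helpful bot.",
    "You are a helpful bot. Always confirm requests.",
    conversations,
    has_error,
)
print(report)
```

In practice the error detector would more likely be a designer's manual label or a per-error check than a substring match, and the bot would be a real model call; the point of the sketch is only the before/after comparison across a fixed set of conversations.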

research
06/07/2021

Summary Grounded Conversation Generation

Many conversation datasets have been constructed in the recent years usi...
research
08/19/2022

Adapting Task-Oriented Dialogue Models for Email Conversations

Intent detection is a key part of any Natural Language Understanding (NL...
research
04/09/2022

TANet: Thread-Aware Pretraining for Abstractive Conversational Summarization

Although pre-trained language models (PLMs) have achieved great success ...
research
05/20/2021

Towards Detecting Need for Empathetic Response in Motivational Interviewing

Empathetic response from the therapist is key to the success of clinical...
research
08/10/2023

C5: Towards Better Conversation Comprehension and Contextual Continuity for ChatGPT

Large language models (LLMs), such as ChatGPT, have demonstrated outstan...
research
12/16/2021

Call for Customized Conversation: Customized Conversation Grounding Persona and Knowledge

Humans usually have conversations by making use of prior knowledge about...
research
08/16/2023

MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation

We propose MemoChat, a pipeline for refining instructions that enables l...
