Using Synthetic Data for Conversational Response Generation in Low-resource Settings

04/06/2022
by Gabriel Louis Tan, et al.

Response generation is a task in natural language processing (NLP) in which a model is trained to respond to human statements. Conversational response generators take this one step further by responding within the context of previous responses. Existing techniques for training such models all require an abundance of conversational data, which is not always available for low-resource languages. In this research, we make three contributions. First, we release the first Filipino conversational dataset, collected from a popular Philippine online forum, which we name the PEx Conversations Dataset. Second, we introduce a data augmentation (DA) methodology for Filipino data that employs a Tagalog RoBERTa model to increase the size of the existing corpora. Lastly, we publish the first Filipino conversational response generator capable of generating responses related to the previous 3 responses. With the supplementary synthetic data, we were able to improve the performance of the response generator by up to 12.2 BERTScore, 10.7 training with zero synthetic data.
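The abstract does not spell out how the augmentation or the 3-response context works, so the following is only a minimal sketch under stated assumptions: that synthetic sentences are produced by masking one token at a time and letting a masked language model propose a replacement, and that the generator's context is simply the last three turns joined by a separator. The names `augment`, `build_context`, the `fill_mask` callable, and the separator string are all hypothetical and are not taken from the paper.

```python
import random
from typing import Callable, List

MASK = "<mask>"  # RoBERTa-style mask token (assumed)


def augment(
    sentence: str,
    fill_mask: Callable[[str], List[str]],
    n_variants: int = 2,
    seed: int = 0,
) -> List[str]:
    """Create up to n_variants synthetic sentences by single-token replacement.

    fill_mask is a hypothetical callable: it takes text containing one MASK
    token and returns candidate replacement tokens, best first. In practice
    it could wrap a masked language model such as a Tagalog RoBERTa.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    positions = list(range(len(tokens)))
    rng.shuffle(positions)  # mask positions in random order

    variants: List[str] = []
    for pos in positions:
        if len(variants) >= n_variants:
            break
        masked = tokens[:pos] + [MASK] + tokens[pos + 1:]
        for candidate in fill_mask(" ".join(masked)):
            candidate = candidate.strip()
            # Skip empty predictions and predictions equal to the original token.
            if candidate and candidate != tokens[pos]:
                variants.append(" ".join(tokens[:pos] + [candidate] + tokens[pos + 1:]))
                break
    return variants


def build_context(turns: List[str], k: int = 3, sep: str = " </s> ") -> str:
    """Join the last k turns into one context string (separator is assumed)."""
    return sep.join(turns[-k:])
```

With a real model, `fill_mask` would be a thin wrapper that returns the predicted tokens in ranked order; the sketch keeps it as a plain callable so the augmentation logic can be checked without downloading any weights.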

research
08/12/2018

Addressee and Response Selection for Multilingual Conversation

Developing conversational systems that can converse in many languages is...
research
07/29/2022

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation

The availability of data in expressive styles across languages is limite...
research
09/09/2023

Data Augmentation for Conversational AI

Advancements in conversational systems have revolutionized information a...
research
09/10/2019

A Benchmark Dataset for Learning to Intervene in Online Hate Speech

Countering online hate speech is a critical yet challenging task, but on...
research
11/06/2019

Guiding Variational Response Generator to Exploit Persona

Leveraging persona information of users in Neural Response Generators (N...
research
08/17/2023

Towards Filling the Gap in Conversational Search: From Passage Retrieval to Conversational Response Generation

Research on conversational search has so far mostly focused on query rew...
research
08/02/2023

Leveraging Few-Shot Data Augmentation and Waterfall Prompting for Response Generation

This paper discusses our approaches for task-oriented conversational mod...
