Balancing Multi-Domain Corpora Learning for Open-Domain Response Generation

05/05/2022
by   Yujie Xing, et al.

Open-domain conversational systems are expected to generate equally good responses across multiple domains. Previous work achieved good performance on a single corpus, but training and evaluating on multiple corpora from different domains is less studied. This paper explores methods for generating relevant responses for each of several corpora from different domains. We first examine interleaved learning, which intermingles multiple corpora, as the baseline. We then investigate two multi-domain learning methods, labeled learning and multi-task labeled learning, which encode each corpus through a unique corpus embedding. Furthermore, we propose Domain-specific Frequency (DF), a novel word-level importance weight that measures the relative importance of a word for a specific corpus compared to other corpora. Based on DF, we propose weighted learning, a method that integrates DF into the loss function. We also adopt DF as a new evaluation metric. Extensive experiments show that our methods achieve significant improvements on both automatic and human evaluation. We share our code and data for reproducibility.
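To make the ideas in the abstract concrete, here is a minimal Python sketch of interleaving corpora and of weighting a token-level loss by a word-importance score. The abstract does not give the DF formula, so `relative_frequency_weights` is a hypothetical stand-in (each word's relative frequency in one corpus divided by the sum of its relative frequencies over all corpora), not the authors' definition. The function names, the toy corpora, and the round-robin interleaving order are likewise assumptions made for illustration.

```python
from collections import Counter
from itertools import chain, zip_longest


def interleave(corpora):
    """Round-robin merge of several corpora (lists of training examples)."""
    sentinel = object()
    merged = chain.from_iterable(zip_longest(*corpora, fillvalue=sentinel))
    return [example for example in merged if example is not sentinel]


def relative_frequency_weights(corpora_tokens):
    """Hypothetical stand-in for DF (NOT the paper's formula).

    For each corpus, a word's weight is its relative frequency in that
    corpus divided by the sum of its relative frequencies in all corpora.
    """
    rel = {}
    for name, tokens in corpora_tokens.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        rel[name] = {word: count / total for word, count in counts.items()}

    weights = {}
    for name, freqs in rel.items():
        weights[name] = {
            word: freq / sum(other.get(word, 0.0) for other in rel.values())
            for word, freq in freqs.items()
        }
    return weights


def weighted_nll(token_log_probs, target_tokens, weights):
    """Negative log-likelihood with each token's term scaled by its weight,
    averaged over the number of target tokens."""
    losses = [
        -weights.get(token, 0.0) * log_prob
        for token, log_prob in zip(target_tokens, token_log_probs)
    ]
    return sum(losses) / len(losses)


# Toy data, invented for illustration only.
toy_tokens = {
    "movies": ["film", "film", "the"],
    "tech": ["cpu", "the", "the"],
}
toy_weights = relative_frequency_weights(toy_tokens)
mixed = interleave([["m1", "m2", "m3"], ["t1", "t2"]])
loss = weighted_nll([-1.0, -2.0], ["film", "the"], toy_weights["movies"])
```

In this toy setup, a word that occurs in only one corpus ("film", "cpu") gets the maximum weight for that corpus, while a word shared across corpora ("the") is down-weighted, which is the kind of behavior the abstract attributes to a corpus-specific importance weight.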
