Heterogeneous-Branch Collaborative Learning for Dialogue Generation

03/21/2023
by Yiwei Li, et al.

With the development of deep learning, advanced dialogue generation methods usually require a large amount of computational resources. One promising approach to obtaining a high-performance yet lightweight model is knowledge distillation, which, however, relies heavily on a powerful pre-trained teacher. Collaborative learning, also known as online knowledge distillation, is an effective way to conduct one-stage group distillation when a well-trained large teacher model is unavailable. However, previous work suffers from a severe branch homogeneity problem caused by the shared training objective and identical training sets across branches. To alleviate this problem, we incorporate dialogue attributes into the training of the network branches: each branch learns attribute-related features from a correspondingly selected data subset. Furthermore, we propose a dual group-based knowledge distillation method, consisting of positive distillation and negative distillation, which further diversifies the features of different branches in a stable and interpretable way. The proposed approach significantly improves branch heterogeneity and outperforms state-of-the-art collaborative learning methods on two widely used open-domain dialogue datasets.
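
The abstract does not spell out the exact form of the two distillation terms, so the following is only a minimal PyTorch sketch of how a dual positive/negative group distillation loss could look. The function names, the group-averaged soft targets, the choice of a "positive" group (peer branches whose attribute subsets are related) versus a "negative" group (the remaining branches), and the loss weighting are all illustrative assumptions, not the authors' formulation.

    import torch
    import torch.nn.functional as F

    def kl_to_target(student_logits, target_probs, T=2.0):
        # KL(target || student) with temperature-smoothed distributions,
        # scaled by T^2 as is standard in distillation.
        log_p = F.log_softmax(student_logits / T, dim=-1)
        return F.kl_div(log_p, target_probs, reduction="batchmean") * (T * T)

    def dual_group_distillation(branch_logits, pos_idx, neg_idx, i,
                                alpha=1.0, beta=0.1, T=2.0):
        # Hypothetical dual group-based distillation loss for branch i.
        # Positive distillation: pull branch i toward the averaged soft
        # targets of its "positive" peer group; negative distillation:
        # push it away from a "negative" group's soft targets so its
        # attribute-related features stay distinct.
        with torch.no_grad():
            pos_target = torch.stack(
                [F.softmax(branch_logits[j] / T, dim=-1) for j in pos_idx]).mean(0)
            neg_target = torch.stack(
                [F.softmax(branch_logits[j] / T, dim=-1) for j in neg_idx]).mean(0)
        pos_loss = kl_to_target(branch_logits[i], pos_target, T)
        neg_loss = kl_to_target(branch_logits[i], neg_target, T)
        # Agreement with the positive group is rewarded; agreement with
        # the negative group is penalized via the subtracted term.
        return alpha * pos_loss - beta * neg_loss

    # Hypothetical usage inside a training step, one forward pass per branch:
    #   logits = [branch(batch) for branch in branches]
    #   loss_i = task_loss_i + dual_group_distillation(logits, pos_idx, neg_idx, i)

Note that the subtracted negative term is a repulsive objective; in practice such a term would likely need a small weight (beta here) or clamping to keep training stable, which may be part of what the paper means by diversifying branches "in a stable and interpretable way".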

