Continual Learning with Dirichlet Generative-based Rehearsal

09/13/2023
by Min Zeng, et al.

Recent data-driven task-oriented dialogue systems (ToDs) struggle with incremental learning due to computational constraints and time-consuming retraining. Continual Learning (CL) addresses this by avoiding intensive pre-training, but it faces the problem of catastrophic forgetting (CF). While generative-based rehearsal CL methods have made significant strides, generating pseudo samples that accurately reflect the underlying task-specific distribution remains a challenge. In this paper, we present Dirichlet Continual Learning (DCL), a novel generative-based rehearsal strategy for CL. Unlike the Gaussian latent variable traditionally used in the Conditional Variational Autoencoder (CVAE), DCL leverages the flexibility and versatility of the Dirichlet distribution to model the latent prior variable. This enables it to efficiently capture sentence-level features of previous tasks and effectively guide the generation of pseudo samples. In addition, we introduce Jensen-Shannon Knowledge Distillation (JSKD), a robust logit-based knowledge distillation method that enhances knowledge transfer during pseudo sample generation. Our experiments on intent detection and slot-filling tasks confirm the efficacy of our approach, which outperforms state-of-the-art methods.
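The abstract only names the core idea: a CVAE whose latent variable follows a Dirichlet rather than a Gaussian. The sketch below is a minimal PyTorch illustration of that idea; everything beyond "Dirichlet in place of Gaussian" (the class name DirichletCVAE, the MLP encoder/decoder, dimensions, the symmetric prior, and the loss weighting) is an assumption for illustration, not the authors' implementation.

```python
# Minimal sketch of a CVAE with a Dirichlet latent variable, in the spirit
# of DCL. Architecture, dimensions, and loss weighting are illustrative
# assumptions, not the paper's implementation.
import torch
import torch.nn as nn
from torch.distributions import Dirichlet, kl_divergence

class DirichletCVAE(nn.Module):
    def __init__(self, input_dim=768, cond_dim=32, latent_dim=16, hidden_dim=256):
        super().__init__()
        # Encoder maps (sentence encoding, task condition) to Dirichlet
        # concentration parameters alpha > 0.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim + cond_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
            nn.Softplus(),  # keeps concentrations positive
        )
        # Decoder reconstructs the sentence representation from (z, condition).
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
        )
        # Fixed symmetric Dirichlet prior (an assumption; a learned or
        # task-specific prior is equally plausible).
        self.register_buffer("prior_alpha", torch.ones(latent_dim))

    def forward(self, x, cond):
        alpha = self.encoder(torch.cat([x, cond], dim=-1)) + 1e-4
        posterior = Dirichlet(alpha)
        # rsample uses implicit reparameterization, so gradients flow
        # through the simplex-valued latent z.
        z = posterior.rsample()
        recon = self.decoder(torch.cat([z, cond], dim=-1))
        kl = kl_divergence(posterior, Dirichlet(self.prior_alpha)).mean()
        return recon, kl

# Usage: optimize reconstruction + KL; at rehearsal time, sample z from the
# prior conditioned on an old task's label and decode pseudo samples.
model = DirichletCVAE()
x = torch.randn(8, 768)    # sentence encodings (hypothetical)
cond = torch.randn(8, 32)  # task/intent condition embeddings (hypothetical)
recon, kl = model(x, cond)
loss = nn.functional.mse_loss(recon, x) + 0.1 * kl
loss.backward()
```

Because Dirichlet samples live on the probability simplex, the latent can be read as a mixture over sentence-level topics, which is the intuition behind using it to capture task-specific sentence distributions.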
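Similarly, JSKD is described only as a logit-based distillation method built on the Jensen-Shannon divergence. Below is a minimal sketch under that reading; the temperature, its squared scaling, and the function interface are assumptions carried over from standard distillation practice, not details from the paper.

```python
# Minimal sketch of logit-based distillation with a Jensen-Shannon
# objective. Temperature handling is an assumption borrowed from
# standard KD; see the paper for the exact JSKD formulation.
import torch
import torch.nn.functional as F

def js_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Symmetric, bounded alternative to the usual KL distillation loss."""
    p = F.softmax(student_logits / temperature, dim=-1)
    q = F.softmax(teacher_logits / temperature, dim=-1)
    m = 0.5 * (p + q)
    # JS(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M)
    # F.kl_div(input, target) computes KL(target || exp(input)).
    js = 0.5 * F.kl_div(m.log(), p, reduction="batchmean") \
       + 0.5 * F.kl_div(m.log(), q, reduction="batchmean")
    return js * (temperature ** 2)  # conventional temperature scaling

# Usage: distill the previous model's predictions on pseudo samples
# into the current model.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
loss = js_distillation_loss(student_logits, teacher_logits)
loss.backward()
```

Unlike forward KL, the Jensen-Shannon divergence is symmetric and bounded, which makes the distillation signal better behaved when teacher and student disagree sharply, a plausible reason for calling the method "robust".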


Related research

01/15/2021 · Learning Invariant Representation for Continual Learning
Continual learning aims to provide intelligent agents that are capable o...

10/14/2022 · Prompt Conditioned VAE: Enhancing Generative Replay for Lifelong Learning in Task-Oriented Dialogue
Lifelong learning (LL) is vital for advanced task-oriented dialogue (ToD...

03/31/2022 · A Closer Look at Rehearsal-Free Continual Learning
Continual learning describes a setting where machine learning models lea...

07/12/2021 · Kernel Continual Learning
This paper introduces kernel continual learning, a simple but effective ...

09/02/2023 · Big-model Driven Few-shot Continual Learning
Few-shot continual learning (FSCL) has attracted intensive attention and...

03/28/2023 · Projected Latent Distillation for Data-Agnostic Consolidation in Distributed Continual Learning
Distributed learning on the edge often comprises self-centered devices (...

08/11/2023 · Continual Face Forgery Detection via Historical Distribution Preserving
Face forgery techniques have advanced rapidly and pose serious security ...
