Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform

07/11/2023
by   Mateusz Wójcik, et al.

Production deployments in complex systems require ML architectures that are highly efficient and applicable across multiple tasks. Particularly demanding are classification problems in which data arrives in a streaming fashion and each class is presented separately. Recent methods based on stochastic gradient learning have been shown to struggle in such setups, or have limitations such as requiring a memory buffer or being restricted to specific domains, which prevents their use in real-world scenarios. For this reason, we present a fully differentiable architecture based on the Mixture of Experts model that enables training high-performance classifiers when examples from each class are presented separately. We conducted exhaustive experiments that demonstrate its applicability across various domains and its ability to learn online in production environments. The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods.
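The abstract describes a fully differentiable Mixture of Experts classifier. A minimal sketch of the general MoE idea is below, using NumPy; the layer shapes, the linear experts, and the softmax gating scheme are assumptions for illustration, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class MixtureOfExperts:
    """Toy MoE classifier: linear experts combined by a learned gate."""

    def __init__(self, in_dim, n_classes, n_experts):
        # Each expert is a linear classifier over the input features.
        self.experts = [rng.normal(0, 0.1, (in_dim, n_classes))
                        for _ in range(n_experts)]
        # The gating network assigns a soft weight to each expert.
        self.gate = rng.normal(0, 0.1, (in_dim, n_experts))

    def forward(self, x):
        gate_w = softmax(x @ self.gate)                    # (batch, n_experts)
        expert_out = np.stack([x @ W for W in self.experts], axis=1)  # (batch, E, C)
        # Gate-weighted sum of expert logits; every step here is
        # differentiable, so the whole model trains end to end.
        logits = (gate_w[:, :, None] * expert_out).sum(axis=1)
        return softmax(logits)

moe = MixtureOfExperts(in_dim=8, n_classes=3, n_experts=4)
probs = moe.forward(rng.normal(size=(5, 8)))
print(probs.shape)  # (5, 3)
```

Because the gate and the experts are combined with differentiable operations only, standard stochastic gradient descent can train both jointly, which is what makes such an architecture usable without a memory buffer.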


Related research

11/27/2022 | Neural Architecture for Online Ensemble Continual Learning
Continual learning with an increasing number of classes is a challenging...

03/12/2019 | Continual Learning in Practice
This paper describes a reference architecture for self-maintaining syste...

07/09/2023 | Class-Incremental Mixture of Gaussians for Deep Continual Learning
Continual learning models for stationary data focus on learning and reta...

04/08/2022 | General Incremental Learning with Domain-aware Categorical Representations
Continual learning is an important problem for achieving human-level int...

04/20/2023 | A baseline on continual learning methods for video action recognition
Continual learning has recently attracted attention from the research co...

06/21/2023 | TADIL: Task-Agnostic Domain-Incremental Learning through Task-ID Inference using Transformer Nearest-Centroid Embeddings
Machine Learning (ML) models struggle with data that changes over time o...

08/14/2023 | Ada-QPacknet – adaptive pruning with bit width reduction as an efficient continual learning method without forgetting
Continual Learning (CL) is a process in which there is still huge gap be...
