Learning without Forgetting for Vision-Language Models

05/30/2023
by Da-Wei Zhou, et al.

Class-Incremental Learning (CIL), or continual learning, is a desired capability in the real world: it requires a learning system to adapt to new tasks without forgetting former ones. While traditional CIL methods focus on visual information to grasp core features, recent advances in Vision-Language Models (VLMs) have shown promising capabilities in learning generalizable representations with the aid of textual information. However, when continually trained on new classes, VLMs often suffer from catastrophic forgetting of former knowledge. Applying VLMs to CIL poses two major challenges: 1) how to adapt the model without forgetting; and 2) how to make full use of the multi-modal information. To this end, we propose PROjectiOn Fusion (PROOF), which enables VLMs to learn without forgetting. To handle the first challenge, we propose training task-specific projections on top of the frozen image/text encoders. When a new task arrives, new projections are added and former projections are fixed, alleviating the forgetting of old concepts. For the second challenge, we propose a fusion module to better utilize the cross-modality information. By jointly adjusting visual and textual features, the model can capture semantic information with stronger representation ability. Extensive experiments on nine benchmark datasets validate that PROOF achieves state-of-the-art performance.
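The expand-and-freeze idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ExpandableProjection`, the `sgd_step` helper, the small random initialization, and the choice to sum the outputs of all projections are assumptions made for illustration, and the encoder features are stand-in vectors rather than outputs of a real image/text encoder.

```python
import random


class ExpandableProjection:
    """Hypothetical sketch: one linear projection per task over frozen-encoder features."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.weights = []    # one dim x dim matrix per task
        self.trainable = []  # parallel flags; False means the projection is frozen

    def add_task(self):
        # Freeze every existing projection, then append a fresh trainable one.
        self.trainable = [False] * len(self.weights)
        new = [[self.rng.gauss(0.0, 0.01) for _ in range(self.dim)]
               for _ in range(self.dim)]
        self.weights.append(new)
        self.trainable.append(True)

    def project(self, feature):
        # Assumption: the outputs of all task projections are summed.
        out = [0.0] * self.dim
        for w in self.weights:
            for i, row in enumerate(w):
                out[i] += sum(r * f for r, f in zip(row, feature))
        return out

    def sgd_step(self, grads, lr=0.1):
        # grads holds one dim x dim matrix per projection.
        # Frozen projections are skipped, so old-task parameters never change.
        for w, g, is_trainable in zip(self.weights, grads, self.trainable):
            if not is_trainable:
                continue
            for i in range(self.dim):
                for j in range(self.dim):
                    w[i][j] -= lr * g[i][j]
```

Calling `add_task()` once per incoming task grows the list of projections, and any later `sgd_step` only modifies the newest one, which is the mechanism the abstract credits with alleviating forgetting.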

research
06/18/2022

CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks

Current state-of-the-art vision-and-language models are evaluated on tas...
research
05/12/2023

RepCL: Exploring Effective Representation for Continual Text Classification

Continual learning (CL) aims to constantly learn new knowledge over time...
research
03/24/2023

Leveraging Old Knowledge to Continually Learn New Classes in Medical Images

Class-incremental continual learning is a core step towards developing a...
research
09/11/2023

Class-Incremental Grouping Network for Continual Audio-Visual Learning

Continual learning is a challenging problem in which models need to be t...
research
02/02/2023

Continual Learning with Scaled Gradient Projection

In neural networks, continual learning results in gradient interference ...
research
04/21/2022

Referring Expression Comprehension via Cross-Level Multi-Modal Fusion

As an important and challenging problem in vision-language tasks, referr...
research
10/07/2021

Towards Continual Knowledge Learning of Language Models

Large Language Models (LMs) are known to encode world knowledge in their...
