TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter

06/22/2023
by Binjie Zhang, et al.

Visual foundation models like CLIP excel at learning feature representations from extensive datasets through self-supervised methods, demonstrating remarkable transfer learning and generalization capabilities. A growing number of applications built on visual foundation models are emerging, including innovative solutions such as BLIP-2. These applications employ pre-trained CLIP models as upstream feature extractors and train various downstream modules to accomplish diverse tasks. When a system upgrade replaces the upstream foundation model, all downstream modules must be retrained to adapt to the new model, which is inflexible and inefficient. In this paper, we introduce a parameter-efficient and task-agnostic adapter, dubbed TaCA, that provides compatibility across distinct foundation models while preserving the improved performance of the new models. TaCA allows downstream applications to integrate better-performing foundation models seamlessly, without retraining. We validate TaCA extensively on models of different scales, with up to one billion parameters, across tasks such as video-text retrieval, video recognition, and visual question answering. The results consistently demonstrate that TaCA enables hot-plugging upgrades of visual foundation models. Code and models will be available at https://github.com/TencentARC/TaCA.
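
The hot-plugging idea can be illustrated with a toy sketch. This is not the TaCA implementation: the abstract does not describe the adapter's architecture or training, so the encoders, the downstream head, and the hand-set linear adapter weights below are all hypothetical stand-ins. The sketch only shows the interface: a frozen downstream module keeps consuming features in the old model's space while the backbone behind it is swapped.

```python
# Toy illustration only (assumed names and weights, not the TaCA method):
# a frozen downstream head trained on old-backbone features keeps working
# when a new backbone is wrapped in an adapter that maps into the old space.
from typing import Callable, List

Vector = List[float]


def old_encoder(x: Vector) -> Vector:
    """Hypothetical old backbone producing 2-D features."""
    return [x[0] + x[1], x[0] - x[1]]


def new_encoder(x: Vector) -> Vector:
    """Hypothetical new backbone producing 3-D features in a different space."""
    return [2.0 * x[0], 2.0 * x[1], x[0] * x[1]]


def downstream_head(feat: Vector) -> float:
    """Frozen downstream module that expects old-backbone features."""
    return 0.5 * feat[0] - 0.25 * feat[1]


def make_adapter(weights: List[Vector]) -> Callable[[Vector], Vector]:
    """Build a linear adapter computing W @ new_feat (one row per output dim)."""
    def adapter(feat: Vector) -> Vector:
        return [sum(w * f for w, f in zip(row, feat)) for row in weights]
    return adapter


# Assumption: weights are set by hand so the mapping is exact in this toy case.
# A real adapter would be learned; how TaCA does so is not stated in the abstract.
adapter = make_adapter([[0.5, 0.5, 0.0], [0.5, -0.5, 0.0]])


def upgraded_encoder(x: Vector) -> Vector:
    """New backbone plus adapter, a drop-in replacement for old_encoder."""
    return adapter(new_encoder(x))


x = [3.0, 1.0]
old_out = downstream_head(old_encoder(x))       # head on the old backbone
new_out = downstream_head(upgraded_encoder(x))  # same head, upgraded backbone
print(old_out, new_out)
```

In this toy case the two outputs coincide because the adapter reproduces the old features exactly. The abstract's stronger claim, that the upgraded model also performs better downstream, is not something this sketch demonstrates.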

