Pro-tuning: Unified Prompt Tuning for Vision Tasks

07/28/2022
by Xing Nie, et al.

In computer vision, fine-tuning is the de facto approach for adapting pre-trained vision models to downstream tasks. However, deploying it in practice is challenging because it updates all model parameters, which is parameter-inefficient, and relies heavily on high-quality downstream data. Recently, prompt-based learning, which adds a task-relevant prompt to adapt downstream tasks to pre-trained models, has drastically boosted performance on many natural language downstream tasks. In this work, we extend this prompt-driven transfer ability to vision models as an alternative to fine-tuning. To this end, we propose parameter-efficient Prompt tuning (Pro-tuning) to adapt frozen vision models to various downstream vision tasks. The key to Pro-tuning is prompt-based tuning, i.e., learning task-specific vision prompts for downstream input images while keeping the pre-trained model frozen. By training only a few additional parameters, it works with diverse CNN-based and Transformer-based architectures. Extensive experiments show that Pro-tuning outperforms fine-tuning across a broad range of vision tasks and scenarios, including image classification (generic objects, class imbalance, image corruption, adversarial robustness, and out-of-distribution generalization) and dense prediction tasks such as object detection and semantic segmentation.
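
The core idea of training only a small prompt while the pre-trained weights stay frozen can be illustrated with a toy sketch. This is not the paper's architecture: the "backbone" here is a fixed dot product, the prompt is assumed to be a vector added to the input, and every name (`FROZEN_W`, `predict`, `train_prompt`, `mse`) is hypothetical and chosen only for illustration.

```python
# Toy illustration of prompt-based tuning: the backbone weights are never
# updated; only a small additive prompt is learned. All names are hypothetical.

FROZEN_W = [0.5, -1.0, 2.0]  # stands in for pre-trained weights (kept frozen)


def backbone(x):
    """Frozen model: a fixed dot product with FROZEN_W."""
    return sum(w * xi for w, xi in zip(FROZEN_W, x))


def predict(x, prompt):
    """Add the learnable prompt to the input, then run the frozen backbone."""
    return backbone([xi + pi for xi, pi in zip(x, prompt)])


def mse(data, prompt):
    """Mean squared error of the prompted model on (input, target) pairs."""
    return sum((predict(x, prompt) - y) ** 2 for x, y in data) / len(data)


def train_prompt(data, steps=200, lr=0.05):
    """Fit only the prompt by gradient descent on the mean squared error."""
    prompt = [0.0] * len(FROZEN_W)
    for _ in range(steps):
        grad = [0.0] * len(prompt)
        for x, y in data:
            err = predict(x, prompt) - y
            # d/dp_j of (w . (x + p) - y)^2 = 2 * err * w_j
            for j, w in enumerate(FROZEN_W):
                grad[j] += 2.0 * err * w / len(data)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt


# Hypothetical downstream task: targets equal the frozen output shifted by 3.0.
data = [([1.0, 0.0, 0.0], 3.5), ([0.0, 1.0, 0.0], 2.0), ([0.0, 0.0, 1.0], 5.0)]
frozen_before = list(FROZEN_W)
prompt = train_prompt(data)
```

In this sketch the number of trainable values equals the prompt length, and the backbone weights are identical before and after training. In a real vision model the frozen part would be far larger than the prompt, which is what makes the approach parameter-efficient.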

