Parameter-Efficient Long-Tailed Recognition

09/18/2023
by Jiang-Xin Shi, et al.

The "pre-training and fine-tuning" paradigm in addressing long-tailed recognition tasks has sparked significant interest since the emergence of large vision-language models like the contrastive language-image pre-training (CLIP). While previous studies have shown promise in adapting pre-trained models for these tasks, they often undesirably require extensive training epochs or additional training data to maintain good performance. In this paper, we propose PEL, a fine-tuning method that can effectively adapt pre-trained models to long-tailed recognition tasks in fewer than 20 epochs without the need for extra data. We first empirically find that commonly used fine-tuning methods, such as full fine-tuning and classifier fine-tuning, suffer from overfitting, resulting in performance deterioration on tail classes. To mitigate this issue, PEL introduces a small number of task-specific parameters by adopting the design of any existing parameter-efficient fine-tuning method. Additionally, to expedite convergence, PEL presents a novel semantic-aware classifier initialization technique derived from the CLIP textual encoder without adding any computational overhead. Our experimental results on four long-tailed datasets demonstrate that PEL consistently outperforms previous state-of-the-art approaches. The source code is available at https://github.com/shijxcs/PEL.

research · 12/13/2021
VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks
Recently, fine-tuning language models pre-trained on large text corpora ...

research · 05/11/2022
Making Pre-trained Language Models Good Long-tailed Learners
Prompt-tuning has shown appealing performance in few-shot classification...

research · 08/05/2021
ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot
One-stage long-tailed recognition methods improve the overall performanc...

research · 10/05/2020
Improving AMR Parsing with Sequence-to-Sequence Pre-training
In the literature, the research on abstract meaning representation (AMR)...

research · 05/29/2023
Explicit Visual Prompting for Universal Foreground Segmentation
Foreground segmentation is a fundamental problem in computer vision, whi...

research · 10/03/2022
LPT: Long-tailed Prompt Tuning for Image Classification
For long-tailed classification, most works often pretrain a big model on...

research · 07/25/2023
Benchmarking and Analyzing Generative Data for Visual Recognition
Advancements in large pre-trained generative models have expanded their ...
