CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model

04/09/2023
by Dingkang Liang, et al.

Supervised crowd counting relies heavily on manual labeling, which is difficult and expensive, especially in dense scenes. To alleviate this problem, we propose a novel unsupervised framework for crowd counting, named CrowdCLIP. The core idea is built on two observations: 1) the recent contrastive pre-trained vision-language model (CLIP) has shown impressive performance on various downstream tasks; 2) there is a natural mapping between crowd patches and count text. To the best of our knowledge, CrowdCLIP is the first to use vision-language knowledge to solve the counting problem. Specifically, in the training stage, we construct ranking text prompts that are matched to size-sorted crowd patches, and use the resulting multi-modal ranking loss to guide the learning of the image encoder. In the testing stage, to deal with the diversity of image patches, we propose a simple yet effective progressive filtering strategy that first selects the patches most likely to contain crowds and then maps them into the language space with various counting intervals. Extensive experiments on five challenging datasets demonstrate that the proposed CrowdCLIP achieves superior performance compared to previous unsupervised state-of-the-art counting methods. Notably, CrowdCLIP even surpasses some popular fully-supervised methods under the cross-dataset setting. The source code will be available at https://github.com/dk-liang/CrowdCLIP.
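
The two stages described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the hinge formulation, the `margin` value, the two-stage filter, and every function and parameter name below are assumptions made for illustration, and plain Python lists stand in for CLIP image and text embeddings (assumed non-zero).

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def ranking_loss(patch_embs, prompt_embs, margin=0.1):
    """Hinge-style ranking loss (assumed form, not the paper's exact loss).

    patch_embs:  embeddings of patches sorted by size, smallest first.
    prompt_embs: embeddings of count prompts sorted by count, smallest first.
    Patch i is encouraged to be closer to prompt i than to any other prompt.
    """
    assert len(patch_embs) == len(prompt_embs)
    loss = 0.0
    for i, patch in enumerate(patch_embs):
        matched = cosine(patch, prompt_embs[i])
        for j, prompt in enumerate(prompt_embs):
            if j != i:
                # Penalise any other prompt that scores within `margin`
                # of the matched prompt, or above it.
                loss += max(0.0, margin + cosine(patch, prompt) - matched)
    return loss


def predict_count(patch_emb, crowd_prompt, other_prompts, count_prompts):
    """Simplified two-stage filter (assumed): drop non-crowd patches,
    then pick the count prompt closest to the remaining patch.

    count_prompts is a list of (count, embedding) pairs.
    """
    crowd_score = cosine(patch_emb, crowd_prompt)
    if any(cosine(patch_emb, p) > crowd_score for p in other_prompts):
        return 0  # patch looks more like a non-crowd category
    best_count, _ = max(count_prompts, key=lambda ce: cosine(patch_emb, ce[1]))
    return best_count


# Toy embeddings: three patches and three prompts in a 3-d space.
aligned = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss_aligned = ranking_loss(aligned, aligned)
loss_reversed = ranking_loss(aligned, list(reversed(aligned)))

count_prompts = [(20, [1.0, 0.2, 0.0]), (100, [0.2, 1.0, 0.0])]
crowd_patch_count = predict_count(
    [1.0, 0.2, 0.0], [1.0, 0.0, 0.0], [[0.0, 0.0, 1.0]], count_prompts
)
background_count = predict_count(
    [0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [[0.0, 0.0, 1.0]], count_prompts
)
```

In this toy setup the loss is zero when the size order of the patches agrees with the count order of the prompts and positive when the prompt order is reversed, which is the behaviour a ranking objective is meant to reward. The filter returns a count only for the patch that resembles the crowd prompt.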


research
04/19/2021

TransCrowd: Weakly-Supervised Crowd Counting with Transformer

The mainstream crowd counting methods usually utilize the convolution ne...
research
03/16/2023

Cross-head Supervision for Crowd Counting with Noisy Annotations

Noisy annotations such as missing annotations and location shifts often ...
research
05/12/2020

Adaptive Mixture Regression Network with Local Counting Map for Crowd Counting

The crowd counting task aims at estimating the number of people located ...
research
01/13/2022

S^2FPR: Crowd Counting via Self-Supervised Coarse to Fine Feature Pyramid Ranking

Most conventional crowd counting methods utilize a fully-supervised lear...
research
03/25/2019

CODA: Counting Objects via Scale-aware Adversarial Density Adaption

Recent advances in crowd counting have achieved promising results with i...
research
07/27/2021

Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting

Recently, the problem of inaccurate learning targets in crowd counting d...
research
08/26/2023

Point-Query Quadtree for Crowd Counting, Localization, and More

We show that crowd counting can be viewed as a decomposable point queryi...
