CounTR: Transformer-based Generalised Visual Counting

08/29/2022
by Chang Liu, et al.

In this paper, we consider the problem of generalised visual object counting, with the goal of developing a computational model that counts objects from arbitrary semantic categories using an arbitrary number of "exemplars", i.e. zero-shot or few-shot counting. To this end, we make the following four contributions: (1) We introduce a novel transformer-based architecture for generalised visual object counting, termed Counting Transformer (CounTR), which uses the attention mechanism to explicitly capture the similarity between image patches, or between image patches and the given "exemplars"; (2) We adopt a two-stage training regime that first pre-trains the model with self-supervised learning and then applies supervised fine-tuning; (3) We propose a simple, scalable pipeline for synthesizing training images that contain a large number of instances, or instances from different semantic categories, explicitly forcing the model to make use of the given "exemplars"; (4) We conduct thorough ablation studies on the large-scale counting benchmark FSC-147 and demonstrate state-of-the-art performance in both zero-shot and few-shot settings.
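
As a rough illustration of contribution (1), the sketch below shows how scaled dot-product attention can score the similarity between image-patch features and exemplar features. It is a minimal toy under stated assumptions, not the CounTR implementation: the function names `softmax` and `cross_attention`, the two-dimensional feature vectors, and the absence of learned projections are all hypothetical simplifications introduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(patches, exemplars):
    """Scaled dot-product attention with patches as queries and
    exemplars as keys and values (no learned projections; assumption).

    Returns (weights, attended), where weights[i][j] is how strongly
    patch i attends to exemplar j.
    """
    dim = len(patches[0])
    scale = 1.0 / math.sqrt(dim)
    weights, attended = [], []
    for query in patches:
        # Similarity of this patch to every exemplar.
        scores = [scale * sum(a * b for a, b in zip(query, key))
                  for key in exemplars]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of exemplar features for this patch.
        attended.append([
            sum(row[j] * exemplars[j][d] for j in range(len(exemplars)))
            for d in range(dim)
        ])
    return weights, attended


# Toy features (hypothetical values): three image patches, two exemplars.
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
exemplars = [[4.0, 0.0], [0.0, 4.0]]
weights, attended = cross_attention(patches, exemplars)
```

Each row of `weights` sums to one, and patches whose features resemble an exemplar place most of their weight on it. Calling `cross_attention(patches, patches)` instead gives patch-to-patch similarity, which loosely mirrors the case where no exemplars are provided.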
