FusionCount: Efficient Crowd Counting via Multiscale Feature Fusion

02/28/2022
by Yiming Ma, et al.

State-of-the-art crowd counting models follow an encoder-decoder approach. Images are first processed by the encoder to extract features. Then, to account for perspective distortion, the highest-level feature map is fed to extra components to extract multiscale features, which are the input to the decoder to generate crowd densities. However, in these methods, features extracted at earlier stages during encoding are underutilised, and the multiscale modules can only capture a limited range of receptive fields, albeit with considerable computational cost. This paper proposes a novel crowd counting architecture (FusionCount), which exploits the adaptive fusion of a large majority of encoded features instead of relying on additional extraction components to obtain multiscale features. Thus, it can cover a more extensive scope of receptive field sizes and lower the computational cost. We also introduce a new channel reduction block, which can extract saliency information during decoding and further enhance the model's performance. Experiments on two benchmark databases demonstrate that our model achieves state-of-the-art results with reduced computational complexity.
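
To make the idea of fusing features from several encoder stages more concrete, the following is a minimal sketch and not the authors' implementation. It assumes single-channel feature maps stored as nested lists, nearest-neighbour upsampling to bring a coarse map to the resolution of a fine one, and a per-pixel softmax over hand-set scores as the adaptive weighting. The abstract does not specify how FusionCount computes its fusion weights, so the functions `upsample_nearest`, `softmax`, and `fuse` are hypothetical names chosen for illustration; in a trained model the scores would presumably be learned rather than fixed.

```python
import math


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling: repeat each value `factor` times per axis."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(feature_maps, score_maps):
    """Per-pixel weighted sum of equally sized maps.

    The weights at each pixel are a softmax across scales of the
    corresponding scores (an assumed weighting scheme, for illustration).
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            weights = softmax([scores[i][j] for scores in score_maps])
            fused[i][j] = sum(
                w * fmap[i][j] for w, fmap in zip(weights, feature_maps)
            )
    return fused


# Toy stand-ins for an early (fine, 4x4) and a late (coarse, 2x2) encoder output.
fine = [[1.0, 2.0, 3.0, 4.0] for _ in range(4)]
coarse = [[10.0, 20.0], [30.0, 40.0]]
coarse_up = upsample_nearest(coarse, 2)

# Equal scores give equal weights, so the fusion reduces to a plain average.
equal_scores = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
fused = fuse([fine, coarse_up], equal_scores)
print(fused[0][0])  # 5.5
```

Raising the score of one scale at a given pixel shifts the fused value toward that scale's feature, which is the sense in which the weighting is adaptive. Because the fusion reuses maps the encoder already produces, a scheme like this adds no extra multiscale extraction module, which is in line with the abstract's efficiency argument.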

research
12/06/2018

Adaptive Scenario Discovery for Crowd Counting

Crowd counting, i.e., estimation number of pedestrian in crowd images, i...
research
08/18/2018

In Defense of Single-column Networks for Crowd Counting

Crowd counting usually addressed by density estimation becomes an increa...
research
10/31/2021

PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting

Crowd counting aims to learn the crowd density distributions and estimat...
research
09/27/2019

MRCNet: Crowd Counting and Density Map Estimation in Aerial and Ground Imagery

In spite of the many advantages of aerial imagery for crowd monitoring a...
research
06/09/2023

An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention

We present an efficient speech separation neural network, ARFDCN, which ...
research
02/21/2022

Multiscale Crowd Counting and Localization By Multitask Point Supervision

We propose a multitask approach for crowd counting and person localizati...
research
09/01/2023

ARFA: An Asymmetric Receptive Field Autoencoder Model for Spatiotemporal Prediction

Spatiotemporal prediction aims to generate future sequences by paradigms...
