Crowd Counting by Adaptively Fusing Predictions from an Image Pyramid

05/16/2018
by Di Kang, et al.

Thanks to the powerful learning capability of deep neural networks, counting performance via density map estimation has improved significantly over the past several years. However, crowd counting remains very challenging due to severe occlusion, large scale variations, and perspective distortion. Scale variations (from image to image) coupled with perspective distortion (within one image) result in huge changes in object size. Earlier methods based on convolutional neural networks (CNNs) typically did not handle this scale variation explicitly, until Hydra-CNN and MCNN. MCNN uses three columns, each with different filter sizes, to extract features at different scales. In this paper, rather than using filters of different sizes, we first utilize an image pyramid to deal with scale variations: resizing the input fed into the network is more effective and efficient than using larger filter sizes. Second, we adaptively fuse the predictions from different scales using adaptively changing per-pixel weights, which lets our method adapt to scale changes within an image. The adaptive fusion is achieved by generating an across-scale attention map, which softly selects a suitable scale for each pixel, followed by a 1x1 convolution. Extensive experiments on three popular datasets show very compelling results.
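
The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the per-scale density predictions have already been resized to a common resolution, that the across-scale attention map is a per-pixel softmax over the scale axis, and that the 1x1 convolution maps the S attention-weighted channels to a single output channel. The function names, array shapes, and the softmax choice are illustrative assumptions.

```python
import numpy as np


def softmax_across_scales(logits):
    """Per-pixel softmax over the scale axis (axis 0) of an (S, H, W) array."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)


def fuse_predictions(density_maps, attention_logits, conv_weights, conv_bias=0.0):
    """Fuse S per-scale density maps into a single (H, W) density map.

    density_maps:     (S, H, W) predictions, one per pyramid level (assumed resized).
    attention_logits: (S, H, W) unnormalized per-pixel scores, one per scale.
    conv_weights:     (S,) weights of a 1x1 convolution from S channels to 1.
    """
    attention = softmax_across_scales(attention_logits)  # soft scale selection
    weighted = attention * density_maps                  # per-pixel weighting
    # A 1x1 convolution from S channels to 1 channel is a per-pixel
    # linear combination of the channels.
    return np.tensordot(conv_weights, weighted, axes=([0], [0])) + conv_bias


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S, H, W = 3, 4, 5  # hypothetical sizes: 3 pyramid levels, 4x5 output map
    density = rng.random((S, H, W))
    logits = rng.normal(size=(S, H, W))
    fused = fuse_predictions(density, logits, conv_weights=np.ones(S))
    print(fused.shape)  # (4, 5)
```

With all 1x1 weights set to one and zero bias, the fused value at each pixel is a convex combination of the per-scale predictions, so a strongly peaked attention map effectively picks a single scale for that pixel.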

research
07/05/2018

Perspective-Aware CNN For Crowd Counting

Crowd counting is the task of estimating pedestrian numbers in crowd ima...
research
01/16/2020

PDANet: Pyramid Density-aware Attention Net for Accurate Crowd Counting

Crowd counting, i.e., estimating the number of people in a crowded area,...
research
11/24/2014

Scale-Invariant Convolutional Neural Networks

Even though convolutional neural networks (CNN) has achieved near-human ...
research
11/26/2018

Context-Aware Crowd Counting

State-of-the-art methods for counting people in crowded scenes rely on d...
research
10/04/2020

Multi-Resolution Fusion and Multi-scale Input Priors Based Crowd Counting

Crowd counting in still images is a challenging problem in practice due ...
research
08/08/2019

Dynamic Scale Inference by Entropy Minimization

Given the variety of the visual world there is not one true scale for re...
research
04/05/2023

Trap-Based Pest Counting: Multiscale and Deformable Attention CenterNet Integrating Internal LR and HR Joint Feature Learning

Pest counting, which predicts the number of pests in the early stage, is...
