DENet: A Universal Network for Counting Crowd with Varying Densities and Scales

04/17/2019
by Lei Liu, et al.
Counting people or objects whose scales and densities vary significantly has attracted much interest from the research community, yet it remains an open problem. In this paper, we propose a simple yet efficient and effective network, named DENet, which is composed of two components: a detection network (DNet) and an encoder-decoder estimation network (ENet). We first run DNet on an input image to detect and count individuals who can be segmented clearly. ENet is then used to estimate the density maps of the remaining areas, where individuals cannot be detected. We propose a modified Xception as the encoder for feature extraction and a combination of dilated convolution and transposed convolution as the decoder. On the ShanghaiTech Part A, UCF and WorldExpo'10 datasets, DENet achieves a lower Mean Absolute Error (MAE) than state-of-the-art methods.
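
The two-stage counting idea and the MAE metric can be illustrated with a small sketch. The abstract does not say how the two outputs are merged, so the rule used here (number of detected boxes plus the density mass outside those boxes) is only an assumption, and the box format and the names `combined_count` and `mean_absolute_error` are hypothetical. The MAE function follows the standard definition, the mean of the absolute differences between predicted and true counts.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (x0, y0, x1, y1), half-open, in pixel coordinates.
Box = Tuple[int, int, int, int]


def combined_count(density: List[List[float]], boxes: Sequence[Box]) -> float:
    """Assumed fusion rule: detections plus density mass outside the boxes."""
    height = len(density)
    width = len(density[0]) if height else 0
    # Mark the pixels already explained by a detection.
    covered = [[False] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                covered[y][x] = True
    # Integrate the density map over the remaining (undetected) area only.
    remaining = sum(
        density[y][x]
        for y in range(height)
        for x in range(width)
        if not covered[y][x]
    )
    return len(boxes) + remaining


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """MAE over per-image counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

With a 4x4 density map filled with 0.25 and one detected box covering the top-left 2x2 pixels, `combined_count` gives one detection plus the density summed over the other twelve pixels. Masking the detected regions is one way to avoid counting the same person in both stages; the paper itself may handle this differently.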

03/03/2019

Crowd Counting and Density Estimation by Trellis Encoder-Decoder Network

Crowd counting has recently attracted increasing interest in computer vi...
12/18/2017

DecideNet: Counting Varying Density Crowds Through Attention Guided Detection and Density Estimation

In real-world crowd counting applications, the crowd densities vary grea...
03/12/2020

Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting

In this paper, we proposed two modified neural network architectures bas...
01/16/2020

PDANet: Pyramid Density-aware Attention Net for Accurate Crowd Counting

Crowd counting, i.e., estimating the number of people in a crowded area,...
10/30/2018

Estimation of Static and Dynamic Urban Populations with Mobile Network Metadata

Communication-enabled devices routinely carried by individuals have beco...
02/28/2022

FusionCount: Efficient Crowd Counting via Multiscale Feature Fusion

State-of-the-art crowd counting models follow an encoder-decoder approac...