PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection

08/29/2016
by   Kye-Hyeon Kim, et al.

This paper presents how we can achieve state-of-the-art accuracy in multi-category object detection while minimizing computational cost by adapting and combining recent technical innovations. Following the common pipeline of "CNN feature extraction + region proposal + RoI classification", we mainly redesign the feature extraction part, since the region proposal part is not computationally expensive and the classification part can be efficiently compressed with common techniques such as truncated SVD. Our design principle is "less channels with more layers", together with the adoption of building blocks including concatenated ReLU, Inception, and HyperNet. The resulting network is deep and thin, and is trained with the help of batch normalization, residual connections, and learning rate scheduling based on plateau detection. We obtained solid results on well-known object detection benchmarks: 83.8% mAP (mean average precision) on VOC2007 and 82.5% mAP on VOC2012 (2nd place), while taking only 750 ms/image on an Intel i7-6700K CPU with a single core and 46 ms/image on an NVIDIA Titan X GPU. Theoretically, our network requires only 12.3% of the computational cost of ResNet-101, the winner on VOC2012.
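Two of the techniques named in the abstract are easy to illustrate in isolation. The following is a minimal NumPy sketch, not the authors' implementation: `crelu` shows the basic form of concatenated ReLU (the block used in the network itself may differ in detail), and `truncate_fc` shows how truncated SVD can replace one fully connected weight matrix with two thinner factors. The function names, the layer sizes, and the rank are assumptions chosen for illustration.

```python
import numpy as np


def crelu(x, axis=-1):
    """Basic concatenated ReLU: stack x and -x along the channel axis, then apply ReLU.

    The output has twice as many channels as the input.
    """
    return np.maximum(np.concatenate([x, -x], axis=axis), 0.0)


def truncate_fc(W, rank):
    """Approximate W (out x in) by A (out x rank) @ B (rank x in) via truncated SVD.

    Keeping only the largest `rank` singular values gives the best rank-`rank`
    approximation of W in the least-squares sense.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # fold the singular values into the left factor
    B = Vt[:rank, :]
    return A, B


# Concatenated ReLU on a toy activation with 3 channels (hypothetical values).
x = np.array([[1.0, -2.0, 0.5]])
y = crelu(x)  # 6 channels: the positive parts of x, then of -x

# Compress a hypothetical 64 x 128 fully connected layer to rank 16.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
A, B = truncate_fc(W, rank=16)
params_before = W.size          # 64 * 128
params_after = A.size + B.size  # 64 * 16 + 16 * 128
```

In this sketch the compressed layer computes `A @ (B @ v)` instead of `W @ v`, so the parameter count and the multiply-add count both drop from `out * in` to `rank * (out + in)`. How much accuracy is lost depends on the chosen rank and on the trained weights, which a random matrix does not represent.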


