Testing the Efficient Network TRaining (ENTR) Hypothesis: initially reducing training image size makes Convolutional Neural Network training for image recognition tasks more efficient

07/30/2018
by Thomas Cherico Wanger, et al.

Convolutional Neural Networks (CNNs) for image recognition tasks are seeing rapid advances in the available architectures and in how networks are trained on large computational infrastructure and standard datasets with millions of images. In contrast, performance and time constraints, for example on small devices and free cloud GPUs, necessitate efficient network training (i.e., highest accuracy in the shortest inference time possible), often on small datasets. Here, we hypothesize that initially decreasing image size during training makes the training process more efficient, because pre-shaping weights with small images and later utilizing these weights with larger images reduces initial network parameters and total inference time. We test this Efficient Network TRaining (ENTR) Hypothesis by training pre-trained Residual Network (ResNet) models (ResNet18, 34, & 50) on three small datasets (steel microstructures, bee images, and geographic aerial images) with a free cloud GPU. Based on three training regimes that i) do not increase, ii) gradually increase, or iii) increase image size in one step over the training process, we show that initially reducing image size increases training efficiency consistently across datasets and networks. We interpret these results mechanistically in the framework of regularization theory. Support for the ENTR hypothesis is an important contribution, because network efficiency improvements for image recognition tasks are needed for practical applications. In the future, it will be exciting to see how the ENTR hypothesis holds for large standard datasets like ImageNet or CIFAR, to better understand the underlying mechanisms, and how these results compare to other fields such as structural learning.
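To make the "gradually increasing image size" regime concrete, the sketch below fine-tunes a pre-trained ResNet18 in PyTorch while stepping up the input resolution between training stages. It is a minimal illustration only: the dataset path ("data/train"), the three-class head, the 64 → 128 → 224 size schedule, and all hyperparameters are assumptions for the example, not values taken from the paper.

```python
# Illustrative sketch of progressive image-size training with a pre-trained ResNet18.
# Assumes an ImageFolder-style dataset at "data/train" with 3 classes (hypothetical).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Pre-trained ResNet18 with a fresh classification head for the target classes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 3)
model = model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# (image size, epochs) schedule: start small, finish at full resolution.
schedule = [(64, 5), (128, 5), (224, 10)]

for size, epochs in schedule:
    # Rebuild the data pipeline at the current image size; the same weights are
    # reused across sizes because ResNet is fully convolutional up to the
    # global average pooling layer.
    tfm = transforms.Compose([
        transforms.Resize((size, size)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    loader = DataLoader(
        datasets.ImageFolder("data/train", transform=tfm),
        batch_size=32, shuffle=True, num_workers=2,
    )

    for epoch in range(epochs):
        model.train()
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        print(f"size={size} epoch={epoch} loss={loss.item():.4f}")
```

The "one-step" regime described in the abstract corresponds to a two-entry schedule (small size for the early epochs, full size for the rest), and the constant-size baseline to a single-entry schedule at 224.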

