Fast and Accurate Model Scaling

03/11/2021
by Piotr Dollár et al.

In this work we analyze strategies for convolutional neural network scaling; that is, the process of scaling a base convolutional network to endow it with greater computational complexity and, consequently, greater representational power. Example scaling strategies include increasing model width, depth, and resolution. While various scaling strategies exist, their tradeoffs are not fully understood: existing analysis typically focuses on the interplay of accuracy and flops (floating-point operations). Yet, as we demonstrate, different scaling strategies affect model parameters and activations, and consequently actual runtime, quite differently. In our experiments we show the surprising result that numerous scaling strategies yield networks with similar accuracy but widely varying properties. This leads us to propose a simple fast compound scaling strategy that primarily scales model width, while scaling depth and resolution to a lesser extent. Unlike currently popular scaling strategies, which result in roughly an O(s) increase in model activations when scaling flops by a factor of s, the proposed fast compound scaling yields close to an O(√s) increase in activations, while achieving excellent accuracy. Since activations, rather than flops, often dominate runtime on modern memory-bandwidth-limited hardware (e.g., GPU, TPU), this translates into comparable speedups. More generally, we hope this work provides a framework for analyzing and selecting scaling strategies under various computational constraints.
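To make the scaling arithmetic concrete, below is a minimal Python sketch of compound scaling under the standard cost model for a convolutional network of depth d, width w, and input resolution r: flops f ~ d·w²·r² and activations a ~ d·w·r². The exponent split alpha, the function name scale_model, and the base network values are illustrative assumptions, not the authors' code; alpha close to 1 corresponds to the width-heavy fast scaling the abstract describes.

    def scale_model(d, w, r, s, alpha=0.8):
        """Scale (depth, width, resolution) so flops grow by a factor of s.

        The exponents e_d + e_w + e_r = 1 split the flop increase among
        depth, width, and resolution; width and resolution enter the flop
        count quadratically, hence the halved exponents on w and r.
        """
        e_d = e_r = (1 - alpha) / 2
        e_w = alpha
        return d * s ** e_d, w * s ** (e_w / 2), r * s ** (e_r / 2)

    # Scale a hypothetical base network to 4x flops and compare how the
    # activations grow: depth/resolution-heavy scaling (alpha=0) gives
    # O(s) activation growth, width-only scaling (alpha=1) gives O(sqrt(s)).
    d, w, r = 50, 64, 224
    for alpha in (0.0, 0.8, 1.0):
        d2, w2, r2 = scale_model(d, w, r, s=4.0, alpha=alpha)
        flops_x = (d2 / d) * (w2 / w) ** 2 * (r2 / r) ** 2
        acts_x = (d2 / d) * (w2 / w) * (r2 / r) ** 2
        print(f"alpha={alpha:.1f}: flops x{flops_x:.2f}, activations x{acts_x:.2f}")

Running this prints activation growth of 4.00x at alpha=0.0, 2.30x at alpha=0.8, and 2.00x (= √4) at alpha=1.0, all at the same 4x flop increase, which is exactly the O(s) versus O(√s) gap the abstract refers to.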


Related Research

Analysis of Dimensional Influence of Convolutional Neural Networks for Histopathological Cancer Classification (11/08/2020)
Convolutional Neural Networks can be designed with different levels of c...

Width and Depth Limits Commute in Residual Networks (02/01/2023)
We show that taking the width and depth to infinity in a deep neural net...

SplitNet: Divide and Co-training (11/30/2020)
The width of a neural network matters since increasing the width will ne...

Investigating the Effects of Dynamic Precision Scaling on Neural Network Training (01/25/2018)
Training neural networks is a time- and compute-intensive operation. Thi...

Revisiting ResNets: Improved Training and Scaling Strategies (03/13/2021)
Novel computer vision architectures monopolize the spotlight, but the im...

MBS: Macroblock Scaling for CNN Model Reduction (09/18/2018)
We estimate the proper channel (width) scaling of Convolution Neural Net...

Explaining Constraint Interaction: How to Interpret Estimated Model Parameters under Alternative Scaling Methods (04/30/2018)
In this paper, we explain the reasons behind constraint interaction, whi...
