Large-batch Optimization for Dense Visual Predictions

10/20/2022
by   Zeyue Xue, et al.
0

Training a large-scale deep neural network in a large-scale dataset is challenging and time-consuming. The recent breakthrough of large-batch optimization is a promising way to tackle this challenge. However, although the current advanced algorithms such as LARS and LAMB succeed in classification models, the complicated pipelines of dense visual predictions such as object detection and segmentation still suffer from the heavy performance drop in the large-batch training regime. To address this challenge, we propose a simple yet effective algorithm, named Adaptive Gradient Variance Modulator (AGVM), which can train dense visual predictors with very large batch size, enabling several benefits more appealing than prior arts. Firstly, AGVM can align the gradient variances between different modules in the dense visual predictors, such as backbone, feature pyramid network (FPN), detection, and segmentation heads. We show that training with a large batch size can fail with the gradient variances misaligned among them, which is a phenomenon primarily overlooked in previous work. Secondly, AGVM is a plug-and-play module that generalizes well to many different architectures (e.g., CNNs and Transformers) and different tasks (e.g., object detection, instance segmentation, semantic segmentation, and panoptic segmentation). It is also compatible with different optimizers (e.g., SGD and AdamW). Thirdly, a theoretical analysis of AGVM is provided. Extensive experiments on the COCO and ADE20K datasets demonstrate the superiority of AGVM. For example, it can train Faster R-CNN+ResNet50 in 4 minutes without losing performance. AGVM enables training an object detector with one billion parameters in just 3.5 hours, reducing the training time by 20.9x, whilst achieving 62.2 mAP on COCO. The deliverables are released at https://github.com/Sense-X/AGVM.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/13/2023

CFNet: Cascade Fusion Network for Dense Prediction

Multi-scale features are essential for dense prediction tasks, including...
research
11/20/2017

MegDet: A Large Mini-Batch Object Detector

The improvements in recent CNN-based object detection works, from R-CNN ...
research
07/21/2021

CycleMLP: A MLP-like Architecture for Dense Prediction

This paper presents a simple MLP-like architecture, CycleMLP, which is a...
research
01/21/2022

A Comprehensive Study of Vision Transformers on Dense Prediction Tasks

Convolutional Neural Networks (CNNs), architectures consisting of convol...
research
03/30/2021

Exploiting Invariance in Training Deep Neural Networks

Inspired by two basic mechanisms in animal visual systems, we introduce ...
research
11/10/2022

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

Compared to the great progress of large-scale vision transformers (ViTs)...
research
05/23/2018

SNIPER: Efficient Multi-Scale Training

We present SNIPER, an algorithm for performing efficient multi-scale tra...

Please sign up or login with your details

Forgot password? Click here to reset