
Rethinking "Batch" in BatchNorm

by Yuxin Wu et al.

BatchNorm is a critical building block in modern convolutional neural networks. Its unique property of operating on "batches" instead of individual samples introduces behaviors significantly different from most other operations in deep learning. As a result, it leads to many hidden caveats that can negatively impact a model's performance in subtle ways. This paper thoroughly reviews such problems in visual recognition tasks, and shows that a key to addressing them is to rethink different choices in the concept of "batch" in BatchNorm. By presenting these caveats and their mitigations, we hope this review can help researchers use BatchNorm more effectively.
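To make the batch-dependence concrete, here is a minimal NumPy sketch (not the paper's code) of the core normalization step. It shows that the output for a given sample changes when the rest of the batch changes, which is exactly the property that distinguishes BatchNorm from per-sample operations; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature using mean/variance computed over the batch axis,
    # i.e. across samples -- the defining property of BatchNorm (before the
    # learned scale/shift, omitted here for brevity).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
sample = rng.normal(size=(1, 4))          # one fixed sample, 4 features
others_a = rng.normal(size=(7, 4))        # batch-mates drawn near 0
others_b = rng.normal(loc=5.0, size=(7, 4))  # batch-mates with a shifted mean

out_a = batch_norm(np.concatenate([sample, others_a]))[0]
out_b = batch_norm(np.concatenate([sample, others_b]))[0]

# The same sample is normalized differently depending on its batch-mates,
# unlike per-sample ops such as ReLU or LayerNorm.
print(np.allclose(out_a, out_b))  # False
```

This dependence on batch composition is the source of the train-time versus test-time discrepancy the paper discusses: at inference the batch statistics must be replaced by some other estimate, and how that "batch" is chosen matters.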
