Normalization Is All You Need: Understanding Layer-Normalized Federated Learning under Extreme Label Shift

08/18/2023
by Guojun Zhang et al.

Layer normalization (LN) is a widely adopted deep learning technique, especially in the era of foundation models. Recently, LN has been shown to be surprisingly effective in federated learning (FL) with non-i.i.d. data, but exactly why and how it works has remained mysterious. In this work, we reveal a profound connection between layer normalization and the label shift problem in federated learning. To better understand layer normalization in FL, we identify its key contributing mechanism, which we call feature normalization (FN): applying normalization to the latent feature representation before the classifier head. Although LN and FN do not improve expressive power, they control feature collapse and local overfitting to heavily skewed datasets, and thus accelerate global training. Empirically, we show that normalization leads to drastic improvements on standard benchmarks under extreme label shift. Moreover, we conduct extensive ablation studies to understand the critical factors of layer normalization in FL. Our results verify that FN is an essential ingredient of LN: it significantly improves the convergence of FL while remaining robust to learning rate choices, especially under extreme label shift where each client has access to only a few classes.
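The feature normalization (FN) mechanism described in the abstract can be sketched as follows. This is a minimal numpy illustration, assuming an LN-style per-sample centering and rescaling of the latent feature vector just before the classifier head; the function name and toy data are ours, not from the paper.

```python
import numpy as np

def feature_normalize(features, eps=1e-5):
    """LN-style feature normalization (FN) sketch: for each sample,
    center and rescale its latent feature vector so every sample's
    representation has zero mean and unit variance before the
    classifier head sees it."""
    mu = features.mean(axis=-1, keepdims=True)
    var = features.var(axis=-1, keepdims=True)
    return (features - mu) / np.sqrt(var + eps)

# Toy latent features: a batch of 4 samples with 8-dimensional
# representations, deliberately shifted and scaled.
feats = np.random.default_rng(0).normal(size=(4, 8)) * 3.0 + 1.0
fn = feature_normalize(feats)
```

Because each sample is rescaled to a common statistic, no single client's skewed label distribution can drive the shared representation toward a degenerate (collapsed) direction, which is the intuition the abstract gives for why FN helps under extreme label shift.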


Related research

- 06/09/2023: Is Normalization Indispensable for Multi-domain Federated Learning?
  Federated learning (FL) enhances data privacy with collaborative in-situ...
- 03/19/2023: Experimenting with Normalization Layers in Federated Learning on non-IID scenarios
  Training Deep Learning (DL) models require large, high-quality datasets,...
- 03/12/2023: Making Batch Normalization Great in Federated Deep Learning
  Batch Normalization (BN) is commonly used in modern deep neural networks...
- 01/08/2023: Why Batch Normalization Damage Federated Learning on Non-IID Data?
  As a promising distributed learning paradigm, federated learning (FL) in...
- 09/30/2022: Kernel Normalized Convolutional Networks for Privacy-Preserving Machine Learning
  Normalization is an important but understudied challenge in privacy-rela...
- 11/12/2020: Heterogeneous Data-Aware Federated Learning
  Federated learning (FL) is an appealing concept to perform distributed t...
