Understanding Hessian Alignment for Domain Generalization

08/22/2023
by Sobhan Hemati, et al.

Out-of-distribution (OOD) generalization is a critical capability for deep learning models in many real-world applications, including healthcare and autonomous vehicles. Recently, a variety of techniques have been proposed to improve OOD generalization, among which gradient-based regularizers have shown promising performance compared with other approaches. Despite this success, our understanding of the role of Hessian and gradient alignment in domain generalization is still limited. To address this shortcoming, we analyze the role of the classifier head's Hessian matrix and gradient in domain generalization using recent OOD theory of transferability. Theoretically, we show that the spectral norm of the difference between the classifier head's Hessian matrices across domains is an upper bound on the transfer measure, a notion of distance between the target and source domains. Furthermore, we analyze all the attributes that become aligned when we encourage similarity between Hessians and gradients. Our analysis explains the success of many regularizers, such as CORAL, IRM, V-REx, Fish, IGA, and Fishr, as they regularize part of the classifier head's Hessian and/or gradient. Finally, we propose two simple yet effective methods, based on the Hessian Gradient Product (HGP) and Hutchinson's method (Hutchinson), that match the classifier head's Hessians and gradients efficiently, without directly calculating Hessians. We validate the OOD generalization ability of the proposed methods in different scenarios, including transferability, severe correlation shift, label shift, and diversity shift. Our results show that Hessian alignment methods achieve promising performance on various OOD benchmarks. The code is available at <https://github.com/huawei-noah/Federated-Learning/tree/main/HessianAlignment>.
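To make the efficiency claim concrete, the sketch below shows, in PyTorch, how Hessian information can be matched without ever forming a Hessian: Hessian-vector products are obtained by double backpropagation (the idea underlying HGP), and the Hessian diagonal is estimated with Hutchinson's estimator E[v ⊙ Hv] using Rademacher vectors v. This is a minimal sketch under stated assumptions, not the authors' released implementation; the helper names (`flat_grad`, `hutchinson_hessian_diag`, `hessian_gradient_alignment_penalty`) and the variance-across-domains form of the penalty are illustrative choices.

```python
import torch

def flat_grad(loss, params, create_graph=False, retain_graph=None):
    # Flattened gradient of `loss` with respect to `params`.
    grads = torch.autograd.grad(loss, params,
                                create_graph=create_graph,
                                retain_graph=retain_graph)
    return torch.cat([g.reshape(-1) for g in grads])

def hutchinson_hessian_diag(loss, params, n_samples=10, create_graph=False):
    # Hutchinson estimate of the Hessian diagonal, E[v * (H v)], with
    # Rademacher vectors v, using only Hessian-vector products obtained by
    # double backprop; the Hessian itself is never materialized.
    g = flat_grad(loss, params, create_graph=True)
    diag = torch.zeros_like(g)
    for _ in range(n_samples):
        v = torch.randint_like(g, high=2) * 2.0 - 1.0   # entries in {-1, +1}
        hv = flat_grad((g * v).sum(), params,
                       create_graph=create_graph, retain_graph=True)
        diag = diag + v * hv
    return diag / n_samples

def hessian_gradient_alignment_penalty(domain_losses, head_params):
    # Hypothetical alignment penalty (illustrative, not the paper's exact
    # objective): penalize the variance across training domains of the
    # per-domain gradients and Hessian-diagonal estimates of the classifier
    # head, so that both quantities are encouraged to match across domains.
    grads = torch.stack([flat_grad(l, head_params, create_graph=True)
                         for l in domain_losses])
    diags = torch.stack([hutchinson_hessian_diag(l, head_params, create_graph=True)
                         for l in domain_losses])
    return grads.var(dim=0).sum() + diags.var(dim=0).sum()
```

Because the penalty only involves the classifier head's parameters, the extra backward passes remain cheap even with a large backbone; in practice one would add this term, scaled by a regularization coefficient, to the average empirical risk over the training domains.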


