Fisher SAM: Information Geometry and Sharpness Aware Minimisation

06/10/2022
by   Minyoung Kim, et al.

Sharpness-aware minimisation (SAM) is a recent method known to find flat minima, which improves generalisation and robustness. SAM modifies the loss function by reporting the maximum loss value within a small neighborhood of the current iterate. However, it defines this neighborhood as a Euclidean ball, which can be inaccurate: loss functions for neural networks are typically defined over probability distributions (e.g., class predictive probabilities), which renders the parameter space non-Euclidean. In this paper we take the information geometry of the model parameter space into account when defining the neighborhood, replacing SAM's Euclidean balls with ellipsoids induced by the Fisher information. Our approach, dubbed Fisher SAM, defines more accurate neighborhood structures that conform to the intrinsic metric of the underlying statistical manifold. For instance, because SAM ignores the geometry of the parameter space, it may probe the worst-case loss value at a point that is either too close or inappropriately distant, which Fisher SAM avoids. Another recent approach, Adaptive SAM, stretches or shrinks the Euclidean ball according to the scale of the parameter magnitudes. This can be dangerous, as it may destroy the neighborhood structure. We demonstrate improved performance of the proposed Fisher SAM on several benchmark datasets and tasks.
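
The contrast between the two neighborhoods can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a first-order approximation of the loss, a diagonal Fisher approximation supplied by the caller, and hypothetical names (`sam_perturbation`, `fisher_perturbation`, `rho`, `damping`) chosen for illustration. Under these assumptions, maximising the linearised loss over a Euclidean ball gives a step along the gradient, while maximising it over the ellipsoid eps^T F eps <= rho^2 gives a step along F^{-1} g.

```python
import numpy as np


def sam_perturbation(grad, rho):
    """First-order worst-case step inside a Euclidean ball of radius rho."""
    norm = np.linalg.norm(grad)
    return rho * grad / (norm + 1e-12)


def fisher_perturbation(grad, fisher_diag, rho, damping=1e-8):
    """First-order worst-case step inside the ellipsoid eps^T F eps <= rho^2.

    F is assumed diagonal (F = diag(fisher_diag)); `damping` is a
    hypothetical stabiliser that avoids division by zero.
    """
    f = fisher_diag + damping
    nat_grad = grad / f                       # F^{-1} g
    scale = np.sqrt(np.dot(grad, nat_grad))   # sqrt(g^T F^{-1} g)
    return rho * nat_grad / (scale + 1e-12)


# Toy values, made up purely for illustration.
grad = np.array([3.0, 4.0])
fisher_diag = np.array([4.0, 1.0])
rho = 0.1

eps_sam = sam_perturbation(grad, rho)
eps_fisher = fisher_perturbation(grad, fisher_diag, rho)

# In a SAM-style update, the gradient would then be re-evaluated at
# theta + eps and used for the actual parameter step.
```

With an identity Fisher matrix the two perturbations coincide, which is one way to see the Euclidean ball as a special case of the ellipsoid in this sketch.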


research
05/12/2020

Fisher-Rao geometry of Dirichlet distributions

In this paper, we study the geometry induced by the Fisher-Rao metric on...
research
02/24/2019

A Formalization of The Natural Gradient Method for General Similarity Measures

In optimization, the natural gradient method is well-known for likelihoo...
research
05/27/2019

Lightlike Neuromanifolds, Occam's Razor and Deep Learning

Why do deep neural networks generalize with a very high dimensional para...
research
02/14/2008

FINE: Fisher Information Non-parametric Embedding

We consider the problems of clustering, classification, and visualizatio...
research
09/24/2021

Non-Euclidean Self-Organizing Maps

Self-Organizing Maps (SOMs, Kohonen networks) belong to neural network m...
research
02/17/2020

Large-Scale Evaluation of Shape-Aware Neighborhood Weights and Neighborhood Sizes

Point sets arise naturally in many 3D acquisition processes and have div...
research
10/12/2021

Robustness of statistical models

A statistical structure (g, T) on a smooth manifold M induced by (M̃, ...
