Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

08/22/2023
by Adrián Csiszárik et al.

We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors Θ_A and Θ_B of size d. We conduct extensive experiments examining various distributions of such model combinations, parametrized by elements of the hypercube [0,1]^d and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss, indicating that the notion of linear mode connectivity extends to a more general phenomenon, which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property (two models re-based to a common third model are also linear mode connected) and a robustness property (even under significant perturbations of the neuron matchings, the resulting combinations continue to form working models). Moreover, we analyze the functional and weight similarity of model combinations and show that such combinations are non-vacuous: there are significant functional differences between the resulting models.
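The element-wise combination described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the vectors `theta_a` and `theta_b` stand in for already-aligned flattened parameter vectors, and the alignment step itself (neuron matching) is assumed to have happened beforehand.

```python
import random

random.seed(0)

# Hypothetical flattened parameter vectors of two permutation-aligned
# models Theta_A and Theta_B (in practice Theta_B's neurons would first
# be permuted to match Theta_A; these random vectors are illustrative).
d = 5
theta_a = [random.gauss(0.0, 1.0) for _ in range(d)]
theta_b = [random.gauss(0.0, 1.0) for _ in range(d)]

# An element-wise convex combination is parametrized by a point
# lam in the hypercube [0, 1]^d; setting every lam_i = 0.5 recovers the
# midpoint used in standard linear mode connectivity.
lam = [random.uniform(0.0, 1.0) for _ in range(d)]
theta_comb = [l * a + (1.0 - l) * b
              for l, a, b in zip(lam, theta_a, theta_b)]

# Each coordinate of the combination lies between the two endpoints.
for a, b, c in zip(theta_a, theta_b, theta_comb):
    assert min(a, b) - 1e-12 <= c <= max(a, b) + 1e-12
print(len(theta_comb))  # 5
```

Note that a scalar λ shared across all d coordinates gives ordinary linear interpolation between the two models; allowing λ to vary per coordinate is what generalizes linear mode connectivity to the full hypercube studied in the paper.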


Related research

09/05/2020
Optimizing Mode Connectivity via Neuron Alignment
The loss landscapes of deep neural networks are not well understood due ...

10/13/2022
Wasserstein Barycenter-based Model Fusion and Linear Mode Connectivity of Neural Networks
Based on the concepts of Wasserstein barycenter (WB) and Gromov-Wasserst...

09/11/2022
Git Re-Basin: Merging Models modulo Permutation Symmetries
The success of deep learning is thanks to our ability to solve certain m...

11/15/2022
Mechanistic Mode Connectivity
Neural networks are known to be biased towards learning mechanisms that ...

05/07/2019
A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks
Recent work on mode connectivity in the loss landscape of deep neural ne...

05/19/2022
Interpolating Compressed Parameter Subspaces
Inspired by recent work on neural subspaces and mode connectivity, we re...

10/25/2022
Exploring Mode Connectivity for Pre-trained Language Models
Recent years have witnessed the prevalent application of pre-trained lan...
