Wasserstein Barycenter-based Model Fusion and Linear Mode Connectivity of Neural Networks

10/13/2022
by Aditya Kumar Akash, et al.

Based on the concepts of the Wasserstein barycenter (WB) and the Gromov-Wasserstein barycenter (GWB), we propose a unified mathematical framework for neural network (NN) model fusion and use it to reveal new insights about the linear mode connectivity of SGD solutions. In our framework, fusion is performed layer by layer and builds on an interpretation of a node in a network as a function of the layer preceding it. The versatility of the framework lets us treat model fusion and linear mode connectivity for a broad class of NNs, including fully connected NNs, CNNs, ResNets, RNNs, and LSTMs, in each case exploiting the specific structure of the network architecture. We present extensive numerical experiments to: 1) illustrate the strengths of our approach relative to other model fusion methodologies and 2) from a certain perspective, provide new empirical evidence for recent conjectures stating that two local minima found by gradient-based methods lie in the same basin of the loss landscape once a proper permutation of weights is applied to one of the models.
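The role of permutations in the conjecture above can be illustrated with a toy example. The sketch below is not the WB/GWB algorithm of the paper: it assumes a one-hidden-layer ReLU network, and it swaps in a brute-force search over hidden-unit permutations (a hard-assignment stand-in for the transport-based alignment). All function names and weights are hypothetical and chosen only for illustration.

```python
from itertools import permutations

# A model is a tuple (W1, b1, W2): hidden weights, hidden biases, output weights.

def forward(model, x):
    """One-hidden-layer ReLU network, written with plain lists."""
    W1, b1, W2 = model
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

def permute_hidden(model, perm):
    """Reorder hidden units so that new unit i is old unit perm[i]."""
    W1, b1, W2 = model
    return ([W1[j] for j in perm],
            [b1[j] for j in perm],
            [[row[j] for j in perm] for row in W2])

def align(model_a, model_b):
    """Brute-force the hidden-unit permutation of B closest to A.

    Assumption: closeness is the squared distance between incoming
    weights and biases. Feasible only for tiny layers.
    """
    W1a, b1a, _ = model_a
    W1b, b1b, _ = model_b

    def cost(perm):
        return sum(
            sum((wa - wb) ** 2 for wa, wb in zip(W1a[i], W1b[j]))
            + (b1a[i] - b1b[j]) ** 2
            for i, j in enumerate(perm))

    best = min(permutations(range(len(W1a))), key=cost)
    return permute_hidden(model_b, best)

def interpolate(model_a, model_b, t):
    """Point (1 - t) * A + t * B on the straight line between two models."""
    W1a, b1a, W2a = model_a
    W1b, b1b, W2b = model_b
    mix = lambda a, b: (1 - t) * a + t * b
    return ([[mix(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(W1a, W1b)],
            [mix(a, b) for a, b in zip(b1a, b1b)],
            [[mix(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(W2a, W2b)])

# Hypothetical weights: B is A with its hidden units shuffled.
model_a = ([[1.0, -2.0], [0.5, 1.5], [-1.0, 0.25]], [0.1, -0.2, 0.3],
           [[2.0, -1.0, 0.5]])
model_b = permute_hidden(model_a, (2, 0, 1))
x = [1.0, 2.0]

naive_mid = interpolate(model_a, model_b, 0.5)
aligned_mid = interpolate(model_a, align(model_a, model_b), 0.5)
```

Here `model_b` computes the same function as `model_a`, yet the naive midpoint `naive_mid` computes a different one, while the midpoint taken after alignment, `aligned_mid`, coincides with `model_a`. For independently trained models the match is only approximate, which is where soft, transport-based alignments such as those in the abstract come in.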

research
08/22/2023

Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

We explore element-wise convex combinations of two permutation-aligned n...
research
06/18/2018

Using Mode Connectivity for Loss Landscape Analysis

Mode connectivity is a recently introduced framework that empirically ...
research
06/16/2021

Input Invex Neural Network

In this paper, we present a novel method to constrain invexity on Neural...
research
10/12/2021

The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks

In this paper, we conjecture that if the permutation invariance of neura...
research
09/05/2020

Optimizing Mode Connectivity via Neuron Alignment

The loss landscapes of deep neural networks are not well understood due ...
research
09/11/2022

Git Re-Basin: Merging Models modulo Permutation Symmetries

The success of deep learning is thanks to our ability to solve certain m...
research
06/14/2019

Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets

Mode connectivity is a surprising phenomenon in the loss landscape of de...
