Model Fusion via Optimal Transport

10/12/2019

∙

Combining different models is a widely used paradigm in machine learning applications. While the most common approach is to form an ensemble of models and average their individual predictions, this approach is often rendered infeasible by given resource constraints in terms of memory and computation, which grow linearly with the number of models. We present a layer-wise model fusion procedure for neural networks that utilizes optimal transport to (soft-) align neurons across the models before averaging their associated parameters. We discuss two main algorithms for fusing neural networks in this "one-shot" manner, without requiring any retraining. Finally, we illustrate on CIFAR10 and MNIST how this significantly outperforms vanilla averaging on convolutional networks, such as VGG11 and multi-layer perceptrons, and for transfer tasks even surpasses the performance of both original models.

READ FULL TEXT

Model Fusion via Optimal Transport

Sign in with Google

Consider DeepAI Pro