Towards a Universal Gating Network for Mixtures of Experts

11/03/2020
by Chen Wen Kang, et al.

Mixtures of experts are a common way to combine and aggregate knowledge from multiple neural networks. However, such combinations usually involve networks trained on the same task, and little attention has been given to combining heterogeneous pre-trained networks, especially in the data-free regime. This paper proposes several data-free methods for combining heterogeneous neural networks, ranging from the use of simple output logit statistics to the training of specialized gating networks. The gating networks decide whether a given input belongs to a given network, based on the nature of the expert activations generated. In the experiments, the gating networks, including the universal gating approach, were the most accurate of the methods tested, and they therefore represent a pragmatic step towards applications of heterogeneous mixtures of experts in a data-free regime. The code for this project is hosted on GitHub at https://github.com/cwkang1998/network-merging.
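To make the idea of routing by output logit statistics concrete, the following is a minimal sketch and not the authors' implementation. It assumes one plausible statistic, the maximum softmax probability of each expert, and sends the input to whichever expert is most confident. The function names (`softmax`, `route_by_confidence`) and the logit values are hypothetical and chosen only for illustration; the abstract does not specify which statistics the paper uses.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a single logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_by_confidence(expert_logits: Sequence[Sequence[float]]) -> Tuple[int, int]:
    """Pick the expert whose top softmax probability is highest.

    Assumption: peak softmax confidence is used as the logit statistic.
    Returns (expert_index, class_index_within_that_expert).
    """
    best_expert, best_class, best_conf = -1, -1, -1.0
    for i, logits in enumerate(expert_logits):
        probs = softmax(logits)
        conf = max(probs)
        if conf > best_conf:
            best_expert, best_class, best_conf = i, probs.index(conf), conf
    return best_expert, best_class


# Hypothetical logits from two pre-trained experts with different label
# spaces (3 classes and 4 classes) for the same input.
expert_a_logits = [0.1, 0.2, 5.0]         # sharply peaked: confident
expert_b_logits = [0.3, 0.4, 0.2, 0.35]   # nearly flat: unsure

expert, cls = route_by_confidence([expert_a_logits, expert_b_logits])
print(expert, cls)
```

No data is needed to apply such a rule, which is what makes it data-free. A trained gating network would replace the fixed confidence comparison with a learned decision over the expert activations.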

