Net2Net: Accelerating Learning via Knowledge Transfer

11/18/2015
by Tianqi Chen, et al.

We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. In real-world workflows, one often trains many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state-of-the-art accuracy on the ImageNet dataset.
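
To make the idea of a function-preserving transformation concrete, here is a minimal NumPy sketch of a Net2WiderNet-style widening of one fully connected layer; the function name and variable names are illustrative, not taken from the paper. Each new unit copies the incoming weights of a randomly chosen existing unit, and the outgoing weights of every replicated unit are divided by its replication count, so the widened network computes exactly the same function before any further training.

```python
import numpy as np

def net2wider(W1, b1, W2, new_width, rng=None):
    """Widen a fully connected layer from n_out to new_width units
    while preserving the overall function (Net2WiderNet-style sketch).

    W1: (n_in, n_out)   incoming weights of the layer being widened
    b1: (n_out,)        biases of that layer
    W2: (n_out, n_next) outgoing weights into the next layer
    """
    rng = rng or np.random.default_rng(0)
    n_out = W1.shape[1]
    assert new_width >= n_out
    # g maps each unit of the wider layer to a unit of the old layer:
    # the first n_out units map to themselves, the extra units replicate
    # randomly chosen old units.
    g = np.concatenate([np.arange(n_out),
                        rng.integers(0, n_out, size=new_width - n_out)])
    counts = np.bincount(g, minlength=n_out)   # replication count of each old unit
    U1 = W1[:, g]                              # copy incoming weights
    c1 = b1[g]                                 # copy biases
    U2 = W2[g, :] / counts[g][:, None]         # split outgoing weights among copies
    return U1, c1, U2

# Sanity check: the widened network gives the same outputs as the original.
rng = np.random.default_rng(1)
x = rng.normal(size=(5, 8))
W1, b1, W2 = rng.normal(size=(8, 16)), rng.normal(size=16), rng.normal(size=(16, 4))
relu = lambda z: np.maximum(z, 0.0)
y_old = relu(x @ W1 + b1) @ W2
U1, c1, U2 = net2wider(W1, b1, W2, new_width=24, rng=rng)
y_new = relu(x @ U1 + c1) @ U2
assert np.allclose(y_old, y_new)
```

The companion Net2DeeperNet transformation follows the same principle: a newly inserted layer is initialized to the identity map, so depth can be added without changing the function the network computes.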

Related research

12/28/2022
Breaking the Architecture Barrier: A Method for Efficient Knowledge Transfer Across Networks
Transfer learning is a popular technique for improving the performance o...

04/27/2018
CompNet: Neural networks growing via the compact network morphism
It is often the case that the performance of a neural network can be imp...

05/19/2021
Unsupervised Discriminative Learning of Sounds for Audio Event Classification
Recent progress in network-based audio event classification has shown th...

01/24/2019
Is Pretraining Necessary for Hyperspectral Image Classification?
We address two questions for training a convolutional neural network (CN...

06/07/2015
Knowledge Transfer Pre-training
Pre-training is crucial for learning deep neural networks. Most of exist...

06/10/2021
Supervising the Transfer of Reasoning Patterns in VQA
Methods for Visual Question Answering (VQA) are notorious for leveraging ...

03/17/2023
Towards a Foundation Model for Neural Network Wavefunctions
Deep neural networks have become a highly accurate and powerful wavefunc...
