Knowledge Transfer Pre-training

06/07/2015
by   Zhiyuan Tang, et al.

Pre-training is crucial for learning deep neural networks. Most existing pre-training methods train simple models (e.g., restricted Boltzmann machines) and then stack them layer by layer to form the deep structure. This layer-wise pre-training has a strong theoretical foundation and broad empirical support. However, it is not easy to apply such a method to models without a clear multi-layer structure, e.g., recurrent neural networks (RNNs). This paper presents a new pre-training approach based on knowledge transfer learning. In contrast to the layer-wise approach, which trains model components incrementally, the new approach trains the entire model as a whole but with an easier objective function. This is achieved by using soft targets produced by a previously trained model (the teacher model). Unlike conventional layer-wise methods, the new method does not depend on the model structure, so it can be used to pre-train very complex models. Experiments on a speech recognition task demonstrated that with this approach, complex RNNs can be trained well with a weaker deep neural network (DNN) as the teacher model. Furthermore, the new method can be combined with conventional layer-wise pre-training to deliver additional gains.
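
The soft-target objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: the logits, the three-class setup, the helper names (`softmax`, `cross_entropy`, `one_hot`), and the `temperature` parameter are all assumptions made for illustration, and the abstract does not say whether a temperature is used. The sketch contrasts the pre-training loss (cross-entropy against the teacher's soft targets) with the usual loss against a hard label, which is what a subsequent fine-tuning step would presumably use.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature > 1 flattens the output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, student_probs, eps=1e-12):
    """Cross-entropy between a target distribution and the student's output."""
    return -sum(t * math.log(s + eps) for t, s in zip(target_probs, student_probs))

def one_hot(index, size):
    """Hard target: all probability mass on the labeled class."""
    return [1.0 if i == index else 0.0 for i in range(size)]

# Hypothetical values for one training frame with three classes.
teacher_logits = [2.0, 1.0, 0.1]   # output of the previously trained teacher
student_logits = [0.5, 0.4, 0.3]   # output of the model being pre-trained
label = 0                          # ground-truth class index

# Soft targets from the teacher (temperature=2.0 is an assumed setting).
soft_targets = softmax(teacher_logits, temperature=2.0)
student_probs = softmax(student_logits)

# Pre-training objective: match the teacher's soft targets.
pretrain_loss = cross_entropy(soft_targets, student_probs)

# Conventional objective: match the hard label.
finetune_loss = cross_entropy(one_hot(label, 3), student_probs)

print("soft targets: ", [round(p, 3) for p in soft_targets])
print("pretrain loss:", round(pretrain_loss, 4))
print("finetune loss:", round(finetune_loss, 4))
```

Because the loss is defined only on the student's output distribution, nothing in it refers to the student's internal structure, which is why the same objective could in principle be applied to an RNN as easily as to a feed-forward network.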


research
05/18/2015

Recurrent Neural Network Training with Dark Knowledge Transfer

Recurrent neural networks (RNNs), particularly long short-term memory (L...
research
06/07/2023

Transfer Learning of Transformer-based Speech Recognition Models from Czech to Slovak

In this paper, we are comparing several methods of training the Slovak s...
research
07/09/2019

A Deep Neural Network for Finger Counting and Numerosity Estimation

In this paper, we present neuro-robotics models with a deep artificial n...
research
07/22/2022

Hyper-Representations for Pre-Training and Transfer Learning

Learning representations of neural network weights given a model zoo is ...
research
05/29/2020

Machine learning in spectral domain

Deep neural networks are usually trained in the space of the nodes, by a...
research
07/22/2018

Deep Discriminative Model for Video Classification

This paper presents a new deep learning approach for video-based scene c...
research
11/18/2015

Net2Net: Accelerating Learning via Knowledge Transfer

We introduce techniques for rapidly transferring the information stored ...
