Statistical-mechanical analysis of pre-training and fine tuning in deep learning

01/19/2015
by   Masayuki Ohzeki, et al.

In this paper, we present a statistical-mechanical analysis of deep learning. We elucidate some of its essential components: pre-training by unsupervised learning and fine tuning by supervised learning. We formulate the extraction of features from the training data as a margin criterion in a high-dimensional feature-vector space. The resulting self-organized classifier is then supplied with a small amount of labelled data, as in deep learning. Although we employ a simple single-layer perceptron model rather than directly analyzing a multi-layer neural network, we find a nontrivial phase transition in the generalization error of the resulting classifier that depends on the number of unlabelled data. In this sense, we evaluate the efficacy of the unsupervised learning component of deep learning. The analysis is performed using the replica method, a sophisticated tool from statistical mechanics. We validate our result in the manner of deep learning, using a simple iterative algorithm based on belief propagation to learn the weight vector.
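The two-stage procedure described in the abstract can be illustrated with a small numerical toy. The sketch below is not the paper's replica calculation or its belief-propagation algorithm. It assumes a hypothetical teacher vector `w0`, inputs whose projection onto `w0` has a gap of width `kappa` around zero, and a plain sign fixed-point iteration as a stand-in for the unsupervised margin-based step. The function names, the unit-norm scaling, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def make_data(rng, w0, n_samples, kappa):
    """Gaussian inputs whose projection u onto w0 satisfies |u| >= kappa."""
    x = rng.standard_normal((n_samples, w0.size))
    signs = rng.choice([-1.0, 1.0], size=n_samples)
    u = signs * (kappa + np.abs(rng.standard_normal(n_samples)))
    # Replace the component along w0 (a unit vector) by the gapped value u.
    x += np.outer(u - x @ w0, w0)
    return x, np.sign(u)  # labels are the teacher's outputs

def pretrain(x, rng, n_iter=50):
    """Unsupervised stage (assumed stand-in): seek a direction with large
    |w . x| on unlabelled data via the iteration w <- sum_i sign(w . x_i) x_i."""
    w = rng.standard_normal(x.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        w = np.sign(x @ w) @ x
        w /= np.linalg.norm(w)
    return w

def fine_tune(w, x_lab, y_lab):
    """Supervised stage: a few labels resolve the global sign of w,
    which the unlabelled data alone cannot determine."""
    agreement = np.sum(y_lab * np.sign(x_lab @ w))
    return w if agreement >= 0 else -w

rng = np.random.default_rng(0)
n, kappa = 50, 1.5                      # illustrative values, not from the paper
w0 = rng.standard_normal(n)
w0 /= np.linalg.norm(w0)

x_unlab, _ = make_data(rng, w0, 5000, kappa)   # labels discarded
x_lab, y_lab = make_data(rng, w0, 5, kappa)    # small labelled set
x_test, y_test = make_data(rng, w0, 2000, kappa)

w = fine_tune(pretrain(x_unlab, rng), x_lab, y_lab)
overlap = float(w @ w0)
test_error = float(np.mean(np.sign(x_test @ w) != y_test))
print(f"overlap with teacher: {overlap:.3f}, test error: {test_error:.3f}")
```

In this toy setting the unlabelled data fix the direction of the weight vector up to a sign, and the handful of labelled examples only has to choose that sign. Shrinking the unlabelled set makes the recovered direction noisier, which is the quantity the paper's analysis tracks through the generalization error.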


