A theoretical framework for deep locally connected ReLU network

09/28/2018
by Yuandong Tian, et al.

Understanding the theoretical properties of deep and locally connected nonlinear networks, such as deep convolutional neural networks (DCNNs), remains a hard problem despite their empirical success. In this paper, we propose a novel theoretical framework for such networks with ReLU nonlinearity. The framework explicitly formulates the data distribution, favors disentangled representations, and is compatible with common regularization techniques such as Batch Normalization. It is built upon the teacher-student setting, expanding the student's forward/backward propagation onto the teacher's computational graph. The resulting model does not impose unrealistic assumptions (e.g., Gaussian inputs or independence of activations). Our framework could help facilitate the theoretical analysis of many practical issues, e.g., overfitting, generalization, and disentangled representations in deep networks.
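The teacher-student setting the framework builds on is concrete enough to sketch in code. Below is a minimal, hypothetical illustration: a fixed "teacher" locally connected ReLU network generates labels, and a "student" of the same architecture is trained to match its outputs. The dimensions, the `forward` helper, and the plain squared-loss SGD loop are illustrative assumptions, not the paper's construction, which analyzes the student's propagation expanded onto the teacher's computational graph rather than running SGD.

```python
# Minimal teacher-student sketch with a two-layer locally connected ReLU net.
# All names, sizes, and the training loop are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

D, P, K = 16, 4, 8   # input dim, receptive-field (patch) size, hidden units
N = 256              # number of training samples

# Each hidden unit sees only a local patch of the input: local connectivity
# without weight sharing (unlike a convolution, filters differ per unit).
patch_starts = rng.integers(0, D - P + 1, size=K)

def forward(x, W, v):
    """Two-layer locally connected ReLU net: y = v . relu(h), h[j] = W[j] . x[patch_j]."""
    h = np.maximum(0.0, np.array([W[j] @ x[s:s + P]
                                  for j, s in enumerate(patch_starts)]))
    return v @ h, h

# Fixed teacher parameters; the student starts from an independent init.
W_t, v_t = rng.normal(size=(K, P)), rng.normal(size=K)
W_s, v_s = rng.normal(size=(K, P)), rng.normal(size=K)

# Gaussian inputs are used here purely for convenience; the paper's framework
# notably does not require that assumption.
X = rng.normal(size=(N, D))
y = np.array([forward(x, W_t, v_t)[0] for x in X])  # labels come from the teacher

lr = 0.01
for epoch in range(100):
    for x, y_star in zip(X, y):
        y_hat, h = forward(x, W_s, v_s)
        err = y_hat - y_star                 # gradient of 0.5 * (y_hat - y*)^2
        for j, s in enumerate(patch_starts):
            if h[j] > 0:                     # ReLU gate: gradient flows only
                W_s[j] -= lr * err * v_s[j] * x[s:s + P]  # through active units
        v_s -= lr * err * h                  # uses h from the pre-update W_s

mse = np.mean([(forward(x, W_s, v_s)[0] - t) ** 2 for x, t in zip(X, y)])
print(f"student-teacher fit (train MSE): {mse:.4f}")
```

The key point of the sketch is that the labels are produced by the teacher rather than an external dataset, so every student gradient can, in principle, be expressed in terms of the teacher's weights and activation patterns; that re-expression is what the paper's framework formalizes.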

Related research

- On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting (06/11/2021): Deep learning empirically achieves high performance in many applications...
- Luck Matters: Understanding Training Dynamics of Deep ReLU Networks (05/31/2019): We analyze the dynamics of training deep ReLU networks and their implica...
- Learning Student Networks via Feature Embedding (12/17/2018): Deep convolutional neural networks have been widely used in numerous app...
- Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods (05/30/2022): While deep learning has outperformed other methods for various tasks, th...
- Optimal Rate of Convergence for Deep Neural Network Classifiers under the Teacher-Student Setting (01/19/2020): Classifiers built with neural networks handle large-scale high-dimension...
- The Effects of Mild Over-parameterization on the Optimization Landscape of Shallow ReLU Neural Networks (06/01/2020): We study the effects of mild over-parameterization on the optimization l...
- Readouts for Echo-state Networks Built using Locally Regularized Orthogonal Forward Regression (10/19/2011): Echo state network (ESN) is viewed as a temporal non-orthogonal expansio...
