Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture

by Libin Zhu, et al.

In this paper we show that feedforward neural networks corresponding to arbitrary directed acyclic graphs undergo transition to linearity as their "width" approaches infinity. The width of these general networks is characterized by the minimum in-degree of their neurons, except for the input and first layers. Our results identify the mathematical structure underlying transition to linearity and generalize a number of recent works aimed at characterizing transition to linearity or constancy of the Neural Tangent Kernel for standard architectures.
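The phenomenon the abstract describes can be illustrated numerically: for a wide network, the output becomes nearly linear in its parameters, so the first-order Taylor expansion stays accurate even for O(1) parameter perturbations. The sketch below is only a minimal special case of the paper's DAG setting: a one-hidden-layer network with the 1/√m NTK-style scaling and a tanh activation (both illustrative choices, not taken from the paper), where the linearization error of a unit-norm parameter perturbation shrinks as the width m grows.

```python
import numpy as np

def net(x, W, v):
    """One-hidden-layer network with NTK-style scaling: f = v^T tanh(W x) / sqrt(m)."""
    return v @ np.tanh(W @ x) / np.sqrt(len(v))

def grad(x, W, v):
    """Analytic gradient of f with respect to all parameters (W, v), flattened."""
    m = len(v)
    h = np.tanh(W @ x)
    dW = np.outer(v * (1.0 - h**2), x) / np.sqrt(m)  # d f / d W
    dv = h / np.sqrt(m)                              # d f / d v
    return np.concatenate([dW.ravel(), dv])

rng = np.random.default_rng(0)
d = 10
x = rng.standard_normal(d) / np.sqrt(d)

errs = []
for m in [16, 256, 4096]:
    W = rng.standard_normal((m, d))
    v = rng.standard_normal(m)
    theta = np.concatenate([W.ravel(), v])
    g = grad(x, W, v)
    f0 = net(x, W, v)
    # Average |f(theta + delta) - (f(theta) + grad . delta)| over
    # several unit-norm parameter perturbations delta.
    err, trials = 0.0, 20
    for _ in range(trials):
        delta = rng.standard_normal(theta.size)
        delta /= np.linalg.norm(delta)
        th = theta + delta
        W2, v2 = th[:W.size].reshape(m, d), th[W.size:]
        err += abs(net(x, W2, v2) - (f0 + g @ delta))
    errs.append(err / trials)

print(errs)  # linearization error decreases as width m grows
```

In this special case the error is controlled by the Hessian norm, which scales as O(1/√m); the paper's result replaces the single width m with the minimum in-degree of neurons in a general DAG.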





Directed Graph Embeddings

Definitions of graph embeddings and graph minors for directed graphs are...

Structural Analysis of Sparse Neural Networks

Sparse Neural Networks regained attention due to their potential for mat...

On the linearity of large non-linear models: when and why the tangent kernel is constant

The goal of this work is to shed light on the remarkable phenomenon of t...

Acyclic coloring of special digraphs

An acyclic r-coloring of a directed graph G=(V,E) is a partition of the ...

A Capsule-unified Framework of Deep Neural Networks for Graphical Programming

Recently, the growth of deep learning has produced a large number of dee...

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models

Wide neural networks with linear output layer have been shown to be near...

Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? – A Neural Tangent Kernel Perspective

Deep residual networks (ResNets) have demonstrated better generalization...