Expressivity of Neural Networks via Chaotic Itineraries beyond Sharkovsky's Theorem

by Clayton Sanford, et al.

Given a target function f, how large must a neural network be in order to approximate f? Recent works examine this basic question of neural network expressivity through the lens of dynamical systems and provide novel “depth-vs-width” tradeoffs for a large family of functions f. They suggest that such tradeoffs are governed by the existence of periodic points or cycles in f. Our work, by further deploying concepts from dynamical systems, illuminates a more subtle connection between periodicity and expressivity: we prove that periodic points alone lead to suboptimal depth-width tradeoffs, and we improve upon them by demonstrating that certain “chaotic itineraries” give stronger exponential tradeoffs, even in regimes where previous analyses imply only polynomial gaps. In contrast to prior works, our bounds are nearly optimal, tighten as the period increases, and handle strong notions of inapproximability (e.g., constant L_1 error). More broadly, we identify a phase transition to the chaotic regime that exactly coincides with an abrupt shift in other notions of function complexity, including VC-dimension and topological entropy.
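
To make the link between periodic points and depth concrete, here is a minimal sketch using the tent map. The tent map is our choice of illustration and is not taken from the abstract; the helper names (`tent`, `iterate`, `crossings`) are hypothetical. The map has a point of period 3, and its k-fold composition, which a depth-k ReLU network of constant width can compute, crosses the level 1/2 exactly 2^k times. A shallow network would need a number of linear pieces that grows at that rate to follow those oscillations.

```python
from fractions import Fraction


def tent(x):
    """Tent map on [0, 1]. On this interval it equals
    2*relu(x) - 4*relu(x - 1/2), so two ReLU units suffice."""
    half = Fraction(1, 2)
    return 2 * x if x <= half else 2 * (1 - x)


def iterate(f, x, k):
    """Apply f to x a total of k times."""
    for _ in range(k):
        x = f(x)
    return x


def crossings(k, level=Fraction(1, 2)):
    """Count crossings of `level` by the k-fold composition of tent.

    The composition is linear between consecutive multiples of 1/2**k,
    so checking sign changes on that grid finds every crossing.
    """
    n = 2 ** k
    values = [iterate(tent, Fraction(i, n), k) for i in range(n + 1)]
    return sum((a - level) * (b - level) < 0 for a, b in zip(values, values[1:]))


# A period-3 orbit of the tent map: 2/7 -> 4/7 -> 6/7 -> 2/7.
orbit = [Fraction(2, 7)]
for _ in range(3):
    orbit.append(tent(orbit[-1]))

# Number of crossings doubles with each extra composition (layer).
crossing_counts = [crossings(k) for k in range(1, 7)]
print(orbit)
print(crossing_counts)
```

Exact rational arithmetic via `Fraction` is used so that the period-3 orbit closes exactly rather than drifting under floating-point rounding.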

