Topology Reduction in Deep Convolutional Feature Extraction Networks
Deep convolutional neural networks (CNNs) used in practice employ potentially hundreds of layers and tens of thousands of nodes. Such network sizes entail significant computational complexity due to the large number of convolutions that need to be carried out; in addition, a large number of parameters needs to be learned and stored. Very deep and wide CNNs may therefore not be well suited to applications operating under severe resource constraints, as is the case, e.g., in low-power embedded and mobile platforms. This paper aims at understanding the impact of CNN topology, specifically depth and width, on the network's feature extraction capabilities. We address this question for the class of scattering networks that employ either Weyl-Heisenberg filters or wavelets, the modulus non-linearity, and no pooling. The exponential feature map energy decay results in Wiatowski et al., 2017, are generalized to O(a^{-N}), where an arbitrary decay factor a > 1 can be realized through suitable choice of the Weyl-Heisenberg prototype function or the mother wavelet. We then show how networks of fixed (possibly small) depth N can be designed to guarantee that ((1-ε)·100)% of the input signal's energy is contained in the feature vector. Based on the notion of operationally significant nodes, we characterize, partly rigorously and partly heuristically, the topology-reducing effects of (effectively) band-limited input signals, band-limited filters, and feature map symmetries. Finally, for networks based on Weyl-Heisenberg filters, we determine the prototype function bandwidth that minimizes, for fixed network depth N, the average number of operationally significant nodes per layer.
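The depth-design guarantee above can be illustrated with a short back-of-the-envelope calculation: if the energy left beyond layer N scales as a^{-N}, then requiring a^{-N} ≤ ε gives N ≥ log(1/ε)/log(a). This is a sketch only; it treats the O(a^{-N}) bound as if it were exact (ignoring the implied constant), and the function name is ours, not the paper's.

```python
import math

def required_depth(a: float, eps: float) -> int:
    """Smallest depth N with a**(-N) <= eps, i.e. N >= log(1/eps)/log(a).
    Illustrative only: treats the O(a^{-N}) energy-decay bound as exact."""
    assert a > 1 and 0 < eps < 1
    return math.ceil(math.log(1 / eps) / math.log(a))

# To retain 95% of the input energy (eps = 0.05) with decay factor a = 2:
print(required_depth(2.0, 0.05))  # -> 5 layers
```

A larger decay factor a, realized through a suitable choice of prototype function or mother wavelet, thus directly reduces the depth needed for a given energy guarantee.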