Trainable Spectrally Initializable Matrix Transformations in Convolutional Neural Networks

11/12/2019
by Michele Alberti, et al.

In this work, we investigate the application of trainable, spectrally initializable matrix transformations to the feature maps produced by convolution operations. While previous literature has demonstrated the possibility of adding static spectral transformations as feature processors, our focus is on more general trainable transforms. We study the transforms in various architectural configurations on six datasets spanning four domains: medical images (ColorectalHist, HAM10000), natural images (Flowers, ImageNet), historical documents (CB55), and handwriting recognition (GPDS). With rigorous experiments that control for the number of parameters and for randomness, we show that networks using the introduced matrix transformations outperform vanilla neural networks, with an average accuracy increase of 2.2 across all datasets. In addition, we show that spectral initialization leads to significantly faster convergence than randomly initialized matrix transformations. The transformations are implemented as auto-differentiable PyTorch modules that can be incorporated into any neural network architecture, and the entire code base is open-source.
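To make the idea concrete, the sketch below shows one plausible form such a module could take: a trainable linear transform that mixes the channels of a feature map and is initialized with an orthonormal DCT-II basis instead of random weights. This is a minimal illustration under our own assumptions, not the authors' implementation; the module name `SpectralMatrixTransform`, the DCT-II basis, and the channel-mixing placement are all illustrative choices.

```python
import math

import torch
import torch.nn as nn


def dct_matrix(n: int) -> torch.Tensor:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = torch.arange(n).unsqueeze(1).float()  # row index (frequency)
    i = torch.arange(n).unsqueeze(0).float()  # column index (position)
    basis = torch.cos(math.pi / n * (i + 0.5) * k)
    basis[0] *= 1.0 / math.sqrt(2.0)          # normalize the DC row
    return basis * math.sqrt(2.0 / n)


class SpectralMatrixTransform(nn.Module):
    """Trainable linear transform over the channel dimension of a
    feature map. Starts from a spectral (DCT-II) basis when
    spectral_init=True, otherwise from random weights."""

    def __init__(self, channels: int, spectral_init: bool = True):
        super().__init__()
        if spectral_init:
            weight = dct_matrix(channels)
        else:
            weight = torch.empty(channels, channels).normal_(
                0.0, 1.0 / math.sqrt(channels))
        # nn.Parameter makes the matrix trainable by backpropagation.
        self.weight = nn.Parameter(weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width); apply the matrix to the
        # channel vector at every spatial location.
        return torch.einsum('oc,bchw->bohw', self.weight, x)


# Usage: drop the module in after any convolution.
if __name__ == "__main__":
    block = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                          SpectralMatrixTransform(16))
    out = block(torch.randn(2, 3, 32, 32))
    print(out.shape)  # torch.Size([2, 16, 32, 32])
```

Passing `spectral_init=False` yields the randomly initialized variant, mirroring the comparison the abstract describes between spectral and random initialization.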

