Optimization on Submanifolds of Convolution Kernels in CNNs

10/22/2016
by M. Ozay, et al.

Kernel normalization methods have been employed to improve the robustness of optimization methods to reparametrization of convolution kernels and to covariate shift, and to accelerate the training of Convolutional Neural Networks (CNNs). However, our theoretical understanding of these methods has lagged behind their success in applications. We develop a geometric framework that elucidates the underlying mechanisms of a diverse range of kernel normalization methods. The framework enables us to identify and characterize the geometry of the space of normalized kernels. We analyze how state-of-the-art kernel normalization methods affect the geometry of the search spaces explored by stochastic gradient descent (SGD) algorithms in CNNs. Building on these theoretical results, we propose an SGD algorithm with almost sure convergence to a solution at a single minimum of the classification loss of CNNs. Experimental results show that the proposed method achieves state-of-the-art performance on major image classification benchmarks with CNNs.
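The abstract gives no implementation details, but the core mechanism it describes, running SGD on a submanifold of normalized kernels, can be sketched briefly. The snippet below is a hypothetical illustration, not the paper's algorithm: it constrains a flattened kernel to the unit sphere (the simplest submanifold induced by norm-based kernel normalization), projects the Euclidean gradient onto the tangent space, and retracts back onto the sphere by renormalizing. The function name `sgd_step_on_sphere` and the toy objective are assumptions introduced here for illustration.

```python
import numpy as np

def sgd_step_on_sphere(w, grad, lr):
    """One projected-SGD step on the unit sphere (illustrative sketch,
    not the paper's method). `w` is a flattened kernel of unit norm."""
    # Remove the radial component of the gradient so the step lies in
    # the tangent space of the sphere at w.
    tangent_grad = grad - np.dot(grad, w) * w
    # Descend in the tangent space, then retract onto the sphere by
    # renormalizing, keeping the iterate on the constraint manifold.
    w_new = w - lr * tangent_grad
    return w_new / np.linalg.norm(w_new)

# Toy usage: minimize f(w) = -w^T A w over the unit sphere. The constrained
# minimizer is the top eigenvector of A, so the iterates should align with it.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
A = A @ A.T                      # symmetric PSD test matrix
w = rng.standard_normal(8)
w /= np.linalg.norm(w)
for _ in range(2000):
    grad = -2.0 * A @ w          # Euclidean gradient of f(w) = -w^T A w
    w = sgd_step_on_sphere(w, grad, lr=0.01)
```

The project-then-retract pattern shown here is a standard way to keep SGD iterates on a normalization-induced manifold; the paper's contribution concerns how such constraints reshape the geometry of the search space and what that implies for convergence.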


research · 01/22/2017
Optimization on Product Submanifolds of Convolution Kernels
Recent advances in optimization methods used for training convolutional ...

research · 11/30/2015
Design of Kernels in Convolutional Neural Networks for Image Classification
Despite the effectiveness of Convolutional Neural Networks (CNNs) for im...

research · 05/26/2015
Accelerating Very Deep Convolutional Networks for Classification and Detection
This paper aims to accelerate the test-time computation of convolutional...

research · 07/19/2022
Moment Centralization based Gradient Descent Optimizers for Convolutional Neural Networks
Convolutional neural networks (CNNs) have shown very appealing performan...

research · 11/16/2021
Learning with convolution and pooling operations in kernel methods
Recent empirical work has shown that hierarchical convolutional kernels ...

research · 11/18/2019
Grassmannian Packings in Neural Networks: Learning with Maximal Subspace Packings for Diversity and Anti-Sparsity
Kernel sparsity ("dying ReLUs") and lack of diversity are commonly obser...

research · 08/01/2016
Learning Semantically Coherent and Reusable Kernels in Convolution Neural Nets for Sentence Classification
The state-of-the-art CNN models give good performance on sentence classi...
