Exploiting Spline Models for the Training of Fully Connected Layers in Neural Network

02/12/2021
by   Kanya Mo, et al.

The fully connected (FC) layer, one of the most fundamental modules in artificial neural networks (ANNs), is often considered difficult and inefficient to train, partly because its large number of parameters creates a risk of overfitting. Building on previous work that studies ANNs from a linear spline perspective, we propose a spline-based approach that eases the difficulty of training FC layers. Given a dataset, we first obtain a continuous piecewise linear (CPWL) fit through a spline method such as multivariate adaptive regression splines (MARS). Next, we construct an ANN model from the linear spline model and continue to train the ANN model on the dataset using gradient descent optimization algorithms. Our experimental results and theoretical analysis show that, compared with standard ANN training (random parameter initialization followed by gradient descent optimization), our approach reduces the computational cost, accelerates the convergence of FC layers, and significantly increases the interpretability of the resulting model (FC layers).
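
The two-step idea in the abstract (fit a CPWL spline, then turn it into an ANN and keep training) can be illustrated with a minimal one-dimensional sketch. It assumes the spline is a sum of MARS-style hinge functions, `max(0, s * (x - t))`, and maps each hinge to one ReLU unit of a single hidden layer. The function names (`hinge_spline`, `spline_to_fc`, `fc_forward`, `gd_step`) and the knot and coefficient values in the usage note are hypothetical, chosen for illustration; they are not taken from the paper, and the paper's multivariate construction may differ.

```python
def relu(z):
    return z if z > 0.0 else 0.0

def hinge_spline(x, intercept, knots, signs, coefs):
    """Evaluate f(x) = intercept + sum_k coefs[k] * max(0, signs[k] * (x - knots[k]))."""
    return intercept + sum(c * max(0.0, s * (x - t))
                           for t, s, c in zip(knots, signs, coefs))

def spline_to_fc(intercept, knots, signs, coefs):
    """Build (w1, b1, w2, b2) so that b2 + sum_k w2[k] * relu(w1[k] * x + b1[k])
    reproduces the hinge spline exactly (assumed mapping: one hidden unit per hinge)."""
    w1 = [float(s) for s in signs]                # hidden weight = hinge direction (+1 or -1)
    b1 = [-s * t for s, t in zip(signs, knots)]   # hidden bias puts the kink at the knot
    w2 = [float(c) for c in coefs]                # output weight = spline coefficient
    return w1, b1, w2, float(intercept)

def fc_forward(x, w1, b1, w2, b2):
    """Forward pass of the one-hidden-layer ReLU network for a scalar input."""
    return b2 + sum(v * relu(w * x + b) for w, b, v in zip(w1, b1, w2))

def mse(xs, ys, params):
    """Mean squared error of the network on the dataset."""
    return sum((fc_forward(x, *params) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gd_step(xs, ys, params, lr=0.01):
    """One full-batch gradient descent step on the mean squared error."""
    w1, b1, w2, b2 = params
    n, k = len(xs), len(w1)
    g_w1, g_b1, g_w2, g_b2 = [0.0] * k, [0.0] * k, [0.0] * k, 0.0
    for x, y in zip(xs, ys):
        pre = [w * x + b for w, b in zip(w1, b1)]
        err = b2 + sum(v * relu(p) for v, p in zip(w2, pre)) - y
        g_out = 2.0 * err / n                     # d(MSE)/d(prediction) for this sample
        g_b2 += g_out
        for j in range(k):
            g_w2[j] += g_out * relu(pre[j])
            if pre[j] > 0.0:                      # ReLU passes gradient only when active
                g_w1[j] += g_out * w2[j] * x
                g_b1[j] += g_out * w2[j]
    return ([w - lr * g for w, g in zip(w1, g_w1)],
            [b - lr * g for b, g in zip(b1, g_b1)],
            [v - lr * g for v, g in zip(w2, g_w2)],
            b2 - lr * g_b2)
```

As a usage example under the same assumptions, a rough spline fit to `y = |x|` with `knots = [0.0, 0.0]`, `signs = [1.0, -1.0]`, `coefs = [0.8, 0.8]`, and `intercept = 0.1` can be passed to `spline_to_fc`, after which repeated calls to `gd_step` continue training from that initialization instead of from random weights. In this sketch each hidden unit corresponds to one knot of the spline, which is one way to read the interpretability claim.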

