Exploiting Spline Models for the Training of Fully Connected Layers in Neural Network

02/12/2021
by   Kanya Mo, et al.

The fully connected (FC) layer, one of the most fundamental modules in artificial neural networks (ANNs), is often considered difficult and inefficient to train, in part because its large number of parameters raises the risk of overfitting. Building on previous work that studies ANNs from a linear spline perspective, we propose a spline-based approach that eases the difficulty of training FC layers. Given a dataset, we first obtain a continuous piecewise linear (CPWL) fit using a spline method such as multivariate adaptive regression splines (MARS). Next, we construct an ANN model from the linear spline model and continue to train the ANN model on the dataset using gradient descent optimization algorithms. Our experimental results and theoretical analysis show that, compared with standard ANN training (random parameter initialization followed by gradient descent optimization), our approach reduces the computational cost, accelerates the convergence of FC layers, and significantly increases the interpretability of the resulting model (FC layers).
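The step from a linear spline model to an ANN can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: it assumes the spline fit is already available as additive hinge terms of the form c * max(0, s * (x - t)), as a degree-1 MARS model produces, and it only shows that such a model can be rewritten exactly as a one-hidden-layer ReLU network whose weights could then serve as a starting point for gradient descent. The function names and the example knots, signs, and coefficients are hypothetical.

```python
import numpy as np


def hinge_model_to_relu_layer(knots, signs, coefs, intercept):
    """Rewrite f(x) = intercept + sum_i coefs[i] * max(0, signs[i] * (x - knots[i]))
    as the parameters of a one-hidden-layer ReLU network (hypothetical helper)."""
    knots = np.asarray(knots, dtype=float)
    signs = np.asarray(signs, dtype=float)
    w1 = signs.reshape(-1, 1)                            # hidden weights, shape (n_hidden, 1)
    b1 = -signs * knots                                  # hidden biases, shape (n_hidden,)
    w2 = np.asarray(coefs, dtype=float).reshape(1, -1)   # output weights, shape (1, n_hidden)
    b2 = np.array([intercept], dtype=float)              # output bias
    return w1, b1, w2, b2


def relu_forward(x, w1, b1, w2, b2):
    """Forward pass of the one-hidden-layer ReLU network on 1-D inputs."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)        # (n_samples, 1)
    hidden = np.maximum(0.0, x @ w1.T + b1)              # (n_samples, n_hidden)
    return (hidden @ w2.T + b2).ravel()


def hinge_forward(x, knots, signs, coefs, intercept):
    """Direct evaluation of the additive hinge (linear spline) model."""
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, intercept)
    for t, s, c in zip(knots, signs, coefs):
        out += c * np.maximum(0.0, s * (x - t))
    return out


if __name__ == "__main__":
    # Made-up spline parameters, standing in for the output of a spline fit.
    knots, signs, coefs, intercept = [-1.0, 0.5, 0.5], [1, 1, -1], [2.0, -3.0, 0.5], 0.25
    params = hinge_model_to_relu_layer(knots, signs, coefs, intercept)
    xs = np.linspace(-3.0, 3.0, 61)
    # The network reproduces the spline model on every test point.
    print(np.allclose(relu_forward(xs, *params),
                      hinge_forward(xs, knots, signs, coefs, intercept)))
```

In this sketch each hinge term becomes one hidden unit, so every knot of the spline corresponds to a visible hidden-unit bias; this is one way to read the interpretability claim, though the full method may differ in how it handles multivariate inputs.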

research
02/19/2021

A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions

Gradient descent optimization algorithms are the standard ingredients th...
research
01/03/2022

A Mixed Integer Programming Approach to Training Dense Neural Networks

Artificial Neural Networks (ANNs) are prevalent machine learning models ...
research
07/15/2017

Evolutionary Training of Sparse Artificial Neural Networks: A Network Science Perspective

Through the success of deep learning, Artificial Neural Networks (ANNs) ...
research
12/05/2022

Improved Convergence Guarantees for Shallow Neural Networks

We continue a long line of research aimed at proving convergence of dept...
research
03/22/2023

𝒞^k-continuous Spline Approximation with TensorFlow Gradient Descent Optimizers

In this work we present an "out-of-the-box" application of Machine Learn...
research
02/10/2020

Pairwise Neural Networks (PairNets) with Low Memory for Fast On-Device Applications

A traditional artificial neural network (ANN) is normally trained slowly...
research
01/24/2020

PairNets: Novel Fast Shallow Artificial Neural Networks on Partitioned Subspaces

Traditionally, an artificial neural network (ANN) is trained slowly by a...
