Exploiting Spline Models for the Training of Fully Connected Layers in Neural Network

by Kanya Mo, et al.

The fully connected (FC) layer, one of the most fundamental modules in artificial neural networks (ANNs), is often considered difficult and inefficient to train, in part because its large number of parameters raises the risk of overfitting. Building on previous work that studies ANNs from a linear spline perspective, we propose a spline-based approach that eases the difficulty of training FC layers. Given a dataset, we first obtain a continuous piecewise linear (CPWL) fit through spline methods such as multivariate adaptive regression splines (MARS). Next, we construct an ANN model from the linear spline model and continue to train the ANN model on the dataset using gradient descent optimization algorithms. Our experimental results and theoretical analysis show that, compared with standard ANN training (random parameter initialization followed by gradient descent optimization), our approach reduces the computational cost, accelerates the convergence of FC layers, and significantly increases the interpretability of the resulting model (FC layers).
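The two-step idea in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: a least-squares fit on fixed, evenly spaced knots stands in for MARS (which selects knots adaptively), the data are synthetic, and every name (`fit_linear_spline`, `spline_to_network`, `gradient_step`) is hypothetical. The sketch relies on the identity max(0, x - t) = ReLU(1·x - t), so each hinge term of the spline maps to one hidden ReLU unit; placing a knot at 0 makes the first hinge act as the linear term on the assumed input range [0, 1).

```python
import numpy as np

def hinge_features(x, knots):
    # Column k holds max(0, x - knots[k]), a MARS-style hinge basis function.
    return np.maximum(0.0, x[:, None] - knots[None, :])

def fit_linear_spline(x, y, knots):
    # Least-squares CPWL fit: y ~ c0 + sum_k c[k] * max(0, x - knots[k]).
    # Assumption: fixed knots replace the adaptive knot search of MARS.
    design = np.column_stack([np.ones_like(x), hinge_features(x, knots)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

def spline_to_network(intercept, coefs, knots):
    # One hidden ReLU unit per hinge: relu(1 * x - knot) == max(0, x - knot).
    return {
        "w1": np.ones_like(knots),   # input-to-hidden weights
        "b1": -knots.copy(),         # hidden biases encode the knot locations
        "w2": coefs.copy(),          # hidden-to-output weights are the spline coefficients
        "b2": float(intercept),      # output bias is the spline intercept
    }

def forward(params, x):
    hidden = np.maximum(0.0, x[:, None] * params["w1"] + params["b1"])
    return hidden @ params["w2"] + params["b2"], hidden

def mse(params, x, y):
    pred, _ = forward(params, x)
    return float(np.mean((pred - y) ** 2))

def gradient_step(params, x, y, lr):
    # One full-batch gradient descent step on the mean squared error.
    pred, hidden = forward(params, x)
    err = 2.0 * (pred - y) / len(x)                 # dLoss/dpred
    active = (hidden > 0).astype(float)             # ReLU derivative
    back = err[:, None] * params["w2"] * active     # dLoss/d(pre-activation)
    return {
        "w1": params["w1"] - lr * (back * x[:, None]).sum(axis=0),
        "b1": params["b1"] - lr * back.sum(axis=0),
        "w2": params["w2"] - lr * (hidden.T @ err),
        "b2": params["b2"] - lr * err.sum(),
    }

# Synthetic data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=x.size)
knots = np.linspace(0.0, 0.9, 10)

# Step 1: CPWL fit. Step 2: build the network from it and keep training.
intercept, coefs = fit_linear_spline(x, y, knots)
params = spline_to_network(intercept, coefs, knots)
spline_pred = intercept + hinge_features(x, knots) @ coefs
net_pred, _ = forward(params, x)
loss_init = mse(params, x, y)
for _ in range(100):
    params = gradient_step(params, x, y, lr=1e-3)
loss_after = mse(params, x, y)
```

At initialization the network reproduces the spline fit exactly, so gradient descent starts from the spline's training error rather than from a random point. In this sketch the hidden biases remain readable as knot locations, which is one way to see the interpretability the abstract refers to.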




