A stochastic optimization approach to train non-linear neural networks with a higher-order variation regularization

08/04/2023
by Akifumi Okuno, et al.

While highly expressive parametric models, including deep neural networks, have an advantage in modeling complicated concepts, training such highly non-linear models is known to carry a high risk of overfitting. To address this issue, this study considers a (k,q)th-order variation regularization ((k,q)-VR), defined as the qth-powered integral of the absolute kth-order derivative of the parametric model to be trained; penalizing the (k,q)-VR is expected to yield a smoother function and thereby mitigate overfitting. In particular, the (k,q)-VR encompasses the conventional (general-order) total variation with q=1. While (k,q)-VR terms applied to general parametric models are computationally intractable due to the integration, this study provides a stochastic optimization algorithm that can efficiently train general models with the (k,q)-VR without explicit numerical integration. The proposed approach can be applied to the training of even deep neural networks with arbitrary structure, as it can be implemented with only a simple stochastic gradient descent algorithm and automatic differentiation. Our numerical experiments demonstrate that neural networks trained with the (k,q)-VR terms are more “resilient” than those trained with conventional parameter regularization. The proposed algorithm can also be extended to the physics-informed training of neural networks (PINNs).
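In the one-dimensional case the abstract describes, the penalty can be written as R_{k,q}(f) = ∫ |f^(k)(x)|^q dx, and sampling the integration variable at random turns it into an expectation that plain stochastic gradient descent can minimize alongside the data loss. The sketch below illustrates this idea in JAX; it is not the authors' reference implementation, and the domain [0, 1], the choices k = 2 and q = 1, the MLP architecture, and helper names such as vr_penalty are assumptions made for illustration.

    # Minimal sketch of stochastic (k,q)-VR training for a scalar 1-D network.
    # Assumptions (not from the paper): domain [0, 1], k = 2, q = 1, small tanh MLP.
    import jax
    import jax.numpy as jnp

    def init_mlp(key, sizes=(1, 32, 32, 1)):
        params = []
        for m, n in zip(sizes[:-1], sizes[1:]):
            key, sub = jax.random.split(key)
            params.append((jax.random.normal(sub, (m, n)) / jnp.sqrt(m), jnp.zeros(n)))
        return params

    def mlp(params, x):
        h = jnp.atleast_1d(x)
        for W, b in params[:-1]:
            h = jnp.tanh(h @ W + b)
        W, b = params[-1]
        return (h @ W + b)[0]            # scalar output f(x)

    def kth_derivative(params, x, k):
        f = lambda t: mlp(params, t)
        for _ in range(k):
            f = jax.grad(f)              # nested automatic differentiation gives f^(k)
        return f(x)

    def vr_penalty(params, key, k=2, q=1.0, n_mc=64):
        # Monte Carlo estimate of \int_0^1 |f^(k)(x)|^q dx; no explicit quadrature needed.
        xs = jax.random.uniform(key, (n_mc,))
        d = jax.vmap(lambda x: kth_derivative(params, x, k))(xs)
        return jnp.mean(jnp.abs(d) ** q)

    def loss(params, key, x_obs, y_obs, lam=1e-2):
        pred = jax.vmap(lambda x: mlp(params, x))(x_obs)
        return jnp.mean((pred - y_obs) ** 2) + lam * vr_penalty(params, key)

    @jax.jit
    def sgd_step(params, key, x_obs, y_obs, lr=1e-2):
        grads = jax.grad(loss)(params, key, x_obs, y_obs)
        return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

Drawing a fresh key (and hence fresh integration points) at every step is what makes the penalty a stochastic, rather than numerically integrated, objective; the same pattern extends to multivariate inputs or to residual terms of the physics-informed kind by changing what is evaluated at the sampled points.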
