Spectral Regularization Allows Data-frugal Learning over Combinatorial Spaces

10/05/2022
by Amirali Aghazadeh, et al.

Data-driven machine learning models are increasingly employed in several important inference problems in biology, chemistry, and physics that require learning over combinatorial spaces. Recent empirical evidence (see, e.g., [1], [2], [3]) suggests that regularizing the spectral representation of such models improves their generalization power when labeled data is scarce. However, despite these empirical studies, the theoretical underpinnings of when and how spectral regularization enables improved generalization are poorly understood. In this paper, we focus on learning pseudo-Boolean functions and demonstrate that regularizing the empirical mean squared error by the L_1 norm of the spectral transform of the learned function reshapes the loss landscape and allows for data-frugal learning, under a restricted secant condition on the learner's empirical error measured against the ground-truth function. Under a weaker quadratic growth condition, we show that stationary points that also approximately interpolate the training data points achieve statistically optimal generalization performance. Complementing our theory, we empirically demonstrate that running gradient descent on the regularized loss results in better generalization performance compared to baseline algorithms in several data-scarce real-world problems.
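To make the setup concrete, below is a minimal sketch (not the authors' implementation) of the regularized objective on a small pseudo-Boolean problem: a function f: {-1,+1}^n -> R is parameterized by its Walsh-Hadamard (spectral) coefficients, and the empirical mean squared error plus an L_1 penalty on those coefficients is minimized from a handful of labeled points. The toy target, dimensions, regularization weight, and the use of proximal gradient (soft-thresholding) steps in place of plain gradient descent on the non-smooth penalty are all illustrative assumptions.

```python
# Minimal sketch: L_1-regularized spectral learning of a pseudo-Boolean function
# f: {-1,+1}^n -> R from a few labeled points. All numeric choices are illustrative.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 6  # number of Boolean variables; 2^n = 64 spectral coefficients
subsets = [s for k in range(n + 1) for s in itertools.combinations(range(n), k)]

def parity_features(X):
    """Map points in {-1,+1}^n to their 2^n Walsh-Hadamard (parity) features."""
    return np.column_stack([
        np.prod(X[:, list(s)], axis=1) if s else np.ones(len(X))
        for s in subsets
    ])

# Sparse ground-truth spectrum (toy assumption): a few subsets carry all the weight.
alpha_true = np.zeros(len(subsets))
alpha_true[[1, 5, 20]] = [1.0, -2.0, 0.5]

m = 25  # far fewer labeled points than coefficients (data-scarce regime)
X = rng.choice([-1.0, 1.0], size=(m, n))
Phi = parity_features(X)
y = Phi @ alpha_true + 0.05 * rng.standard_normal(m)

# Proximal gradient (ISTA) on (1/m)*||Phi a - y||^2 + lam*||a||_1;
# the soft-threshold step handles the non-smooth L_1 term.
lam = 0.05                                      # illustrative regularization weight
step = m / (2.0 * np.linalg.norm(Phi, 2) ** 2)  # 1 / Lipschitz constant of the MSE gradient
alpha = np.zeros(len(subsets))
for _ in range(2000):
    grad = (2.0 / m) * Phi.T @ (Phi @ alpha - y)  # gradient of the empirical MSE term
    z = alpha - step * grad
    alpha = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold

print("recovered support:", np.flatnonzero(np.abs(alpha) > 0.1))
```

With far fewer samples than spectral coefficients, the L_1 penalty pushes the learned spectrum toward the sparse toy ground truth, which is the data-frugal regime the paper analyzes.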
