Feature learning in neural networks and kernel machines that recursively learn features

Neural networks have achieved impressive results on many technological and scientific tasks. Yet, their empirical successes have outpaced our fundamental understanding of their structure and function. By identifying mechanisms driving the successes of neural networks, we can provide principled approaches for improving neural network performance and develop simple and effective alternatives. In this work, we isolate the key mechanism driving feature learning in fully connected neural networks by connecting neural feature learning to the average gradient outer product. We subsequently leverage this mechanism to design Recursive Feature Machines (RFMs), which are kernel machines that learn features. We show that RFMs (1) accurately capture features learned by deep fully connected neural networks, (2) close the gap between kernel machines and fully connected networks, and (3) surpass a broad spectrum of models including neural networks on tabular data. Furthermore, we demonstrate that RFMs shed light on recently observed deep learning phenomena such as grokking, lottery tickets, simplicity biases, and spurious features. We provide a Python implementation to make our method broadly accessible (GitHub: https://github.com/aradha/recursive_feature_machines).
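
To make the central quantity concrete, the following is a minimal sketch of the average gradient outer product of a scalar predictor f over n inputs, i.e. the d x d matrix (1/n) Σ ∇f(x_i) ∇f(x_i)^T. It is illustrative only and is not taken from the authors' repository: the helper name `agop`, the finite-difference gradient, and the toy predictor are all assumptions made for this example.

```python
import numpy as np

def agop(f, X, eps=1e-5):
    """Average gradient outer product of a scalar function f over the rows of X.

    f maps an (n, d) array to an (n,) array. Gradients are approximated with
    central finite differences (an assumption of this sketch; an autodiff
    library would normally be used instead).
    """
    n, d = X.shape
    G = np.zeros((n, d))              # G[i, j] approximates df/dx_j at X[i]
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        G[:, j] = (f(X + step) - f(X - step)) / (2 * eps)
    return G.T @ G / n                # (d, d) symmetric, positive semidefinite

# Toy predictor (hypothetical): depends only on the first input coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
f = lambda Z: np.sin(Z[:, 0])

M = agop(f, X)
print(np.round(M, 3))
```

In this toy case all of the mass of `M` sits in the entry for the first coordinate, which is the sense in which the matrix highlights the input directions the predictor actually depends on.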


