A Modern Self-Referential Weight Matrix That Learns to Modify Itself

02/11/2022
by Kazuki Irie, et al.

The weight matrix (WM) of a neural network (NN) is its program. The programs of many traditional NNs are learned through gradient descent in some error function, then remain fixed. The WM of a self-referential NN, however, can keep rapidly modifying all of itself during runtime. In principle, such NNs can meta-learn to learn, and meta-meta-learn to meta-learn to learn, and so on, in the sense of recursive self-improvement. While NN architectures potentially capable of implementing such behavior have been proposed since the '90s, there have been few if any practical studies. Here we revisit such NNs, building upon recent successes of fast weight programmers and closely related linear Transformers. We propose a scalable self-referential WM (SRWM) that uses outer products and the delta update rule to modify itself. We evaluate our SRWM in supervised few-shot learning and in multi-task reinforcement learning with procedurally generated game environments. Our experiments demonstrate both practical applicability and competitive performance of the proposed SRWM. Our code is public.
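To make the abstract's one-line description of the mechanism concrete, below is a minimal PyTorch sketch of a single self-referential step: the weight matrix W maps each input to a layer output plus a self-generated query, key, and learning rate, then applies a rank-one (outer-product) delta-rule update to itself. The function name srwm_step, the dimension layout, and the softmax choice for the projection are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def srwm_step(W, x):
    """One self-referential weight-matrix step (illustrative sketch).

    W: (d_out + 2*d_in + 1, d_in) matrix that is both the layer's
       program and the object being modified.
    x: (d_in,) current input.
    Returns the layer output y and the self-updated matrix W.
    """
    d_in = x.shape[0]
    d_out = W.shape[0] - 2 * d_in - 1

    # Split W's raw response into: output y, self-invented query q,
    # key k, and (scalar) learning rate beta.
    out = W @ x
    y, q, k, beta = torch.split(out, [d_out, d_in, d_in, 1])

    # Projection onto the simplex (softmax is an illustrative choice).
    phi_q = F.softmax(q, dim=0)
    phi_k = F.softmax(k, dim=0)

    v_target = W @ phi_q    # value the matrix proposes to store
    v_current = W @ phi_k   # value currently associated with the key
    lr = torch.sigmoid(beta)

    # Delta rule: rank-one outer-product update applied by W to itself.
    W = W + lr * torch.outer(v_target - v_current, phi_k)
    return y, W

# Usage: W rewrites itself at every step of the input stream.
d_in, d_out = 8, 4
W = 0.1 * torch.randn(d_out + 2 * d_in + 1, d_in)
for x in torch.randn(6, d_in):
    y, W = srwm_step(W, x)
```

Because the same W both produces the update's ingredients (query, key, learning rate) and receives the resulting update, training this step end to end is what allows the model to meta-learn its own weight-modification behavior.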


