Learning to Learn with Generative Models of Neural Network Checkpoints

09/26/2022
by William Peebles, et al.

We explore a data-driven approach for learning to optimize neural networks. We construct a dataset of neural network checkpoints and train a generative model on the parameters. In particular, our model is a conditional diffusion transformer that, given an initial input parameter vector and a prompted loss, error, or return, predicts the distribution over parameter updates that achieve the desired metric. At test time, it can optimize neural networks with unseen parameters for downstream tasks in just one update. We find that our approach successfully generates parameters for a wide range of loss prompts. Moreover, it can sample multimodal parameter solutions and has favorable scaling properties. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
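The prompting interface described above can be sketched in a few lines. This is a hypothetical stand-in, not the paper's actual diffusion transformer: `toy_g`, `one_step_optimize`, and the conditioning rule are all invented for illustration, assuming only the abstract's interface of (initial parameter vector, prompted metric) → sampled parameter update.

```python
import numpy as np

def toy_g(theta, prompted_loss, rng):
    """Stand-in for the checkpoint generative model: sample an updated
    parameter vector conditioned on the prompted loss. The shrinkage rule
    here is a toy assumption; sampling noise stands in for the model's
    ability to produce multimodal parameter solutions."""
    scale = prompted_loss / (1.0 + prompted_loss)      # toy conditioning on the prompt
    noise = 0.01 * rng.standard_normal(theta.shape)    # stochastic sampling
    return scale * theta + noise

def one_step_optimize(theta_init, prompted_loss, seed=0):
    """One-update optimization: a single pass through the generative model."""
    rng = np.random.default_rng(seed)
    return toy_g(theta_init, prompted_loss, rng)

theta0 = np.ones(8)                               # an unseen "checkpoint"
theta1 = one_step_optimize(theta0, prompted_loss=0.1)
print(theta1.shape)                               # same shape as the input vector
```

Different seeds yield different sampled solutions for the same prompt, mirroring the multimodality the abstract claims for the real model.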


Related research:

- Weight Agnostic Neural Networks (06/11/2019): Not all neural network architectures are created equal, some perform muc...
- Autonomously and Simultaneously Refining Deep Neural Network Parameters by Generative Adversarial Networks (05/24/2018): The choice of parameters, and the design of the network architecture are...
- Unsupervised Learning of Initialization in Deep Neural Networks via Maximum Mean Discrepancy (02/08/2023): Despite the recent success of stochastic gradient descent in deep learni...
- LowDINO – A Low Parameter Self Supervised Learning Model (05/28/2023): This research aims to explore the possibility of designing a neural netw...
- HyperGAN: A Generative Model for Diverse, Performant Neural Networks (01/30/2019): We introduce HyperGAN, a generative network that learns to generate all ...
- Tackling Multimodal Device Distributions in Inverse Photonic Design using Invertible Neural Networks (08/29/2022): Inverse design, the process of matching a device or process parameters t...
- A Generalizable Approach to Learning Optimizers (06/02/2021): A core issue with learning to optimize neural networks has been the lack...
