Path Independent Equilibrium Models Can Better Exploit Test-Time Computation

11/18/2022
by Cem Anil, et al.

Designing networks capable of attaining better performance with an increased inference budget is important for facilitating generalization to harder problem instances. Recent efforts have shown promising results in this direction by making use of depth-wise recurrent networks. We show that a broad class of architectures named equilibrium models displays strong upwards generalization, and find that stronger performance on harder examples (which require more iterations of inference to solve correctly) strongly correlates with the path independence of the system: its tendency to converge to the same steady-state behaviour regardless of initialization, given enough computation. Experimental interventions that promote path independence improve generalization on harder problem instances, while those that penalize it degrade this ability. Path independence analyses are also useful on a per-example basis: for equilibrium models with good in-distribution performance, path independence on out-of-distribution samples strongly correlates with accuracy. Our results help explain why equilibrium models are capable of strong upwards generalization and motivate future work that harnesses path independence as a general modelling principle to facilitate scalable test-time usage.
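To make the path-independence idea concrete, here is a minimal toy sketch (not the authors' implementation): a contractive "equilibrium layer" is solved by fixed-point iteration from two different initializations, and the converged states are compared. The update function, weight matrices, and iteration count below are illustrative assumptions.

```python
import numpy as np

# Toy equilibrium layer: z* = f(z*, x) with f(z, x) = tanh(W z + U x).
# The small scaling on W keeps f contractive so iteration converges.
rng = np.random.default_rng(0)
d = 16
W = 0.5 * rng.standard_normal((d, d)) / np.sqrt(d)
U = rng.standard_normal((d, d)) / np.sqrt(d)
x = rng.standard_normal(d)

def f(z, x):
    return np.tanh(W @ z + U @ x)

def solve_fixed_point(z0, x, n_iters=500):
    # Naive fixed-point iteration; real equilibrium models typically use
    # faster root-finding solvers, but the principle is the same.
    z = z0
    for _ in range(n_iters):
        z = f(z, x)
    return z

# Path-independence check: solve from two different initializations
# for the same input and measure whether the steady states agree.
z_from_zeros = solve_fixed_point(np.zeros(d), x)
z_from_noise = solve_fixed_point(rng.standard_normal(d), x)

gap = np.linalg.norm(z_from_zeros - z_from_noise)
print(f"distance between fixed points from different inits: {gap:.2e}")
# A small gap indicates path-independent behaviour on this input;
# a large gap indicates sensitivity to the initialization.
```

In this sketch, path independence is quantified per example as the distance between steady states reached from different starting points, mirroring the per-example analyses described in the abstract.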


