Revisiting Implicit Models: Sparsity Trade-offs Capability in Weight-tied Model for Vision Tasks

07/16/2023
by   Haobo Song, et al.

Implicit models such as Deep Equilibrium Models (DEQs) have garnered significant attention in the community for their ability to train infinite-layer models with elegant solution-finding procedures and a constant memory footprint. However, despite several attempts, these methods remain heavily constrained by model inefficiency and optimization instability. Furthermore, fair benchmarking across relevant methods for vision tasks is missing. In this work, we revisit the line of implicit models and trace them back to the original weight-tied models. Surprisingly, we observe that weight-tied models are more effective, more stable, and more efficient on vision tasks than the DEQ variants. Through the lens of these simple-yet-clean weight-tied models, we further study the fundamental limits of their model capacity and propose the use of distinct sparse masks to improve it. Finally, for practitioners, we offer design guidelines regarding depth, width, and sparsity selection for weight-tied models, and demonstrate the generalizability of our insights to other learning paradigms.
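To make the idea of "distinct sparse masks" on a weight-tied model concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the update rule `z = relu((W * M_k) z + U x)`, the random Bernoulli masks, and the names `make_masks` and `weight_tied_forward` are all hypothetical choices made for this example. The abstract only states that a single set of weights is reused across depth and that each layer gets its own sparse mask.

```python
import numpy as np

def make_masks(depth, shape, sparsity, seed=0):
    """Draw one fixed random binary mask per unrolled layer (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Each entry is kept with probability (1 - sparsity).
    return [(rng.random(shape) >= sparsity).astype(np.float64) for _ in range(depth)]

def weight_tied_forward(x, W, U, masks):
    """Unroll a weight-tied block: the same W is reused at every step,
    but step k only uses the entries of W selected by masks[k]."""
    z = np.zeros(W.shape[0])
    for M in masks:
        # Assumed update form: masked shared weights plus input injection, then ReLU.
        z = np.maximum((W * M) @ z + U @ x, 0.0)
    return z

rng = np.random.default_rng(1)
d_hidden, d_in, depth = 8, 4, 5
W = rng.normal(scale=0.3, size=(d_hidden, d_hidden))  # shared across all layers
U = rng.normal(scale=0.3, size=(d_hidden, d_in))      # input injection, also shared
x = rng.normal(size=d_in)

masks = make_masks(depth, W.shape, sparsity=0.5)
z_sparse = weight_tied_forward(x, W, U, masks)

# All-ones masks recover the plain weight-tied model (every layer identical).
z_dense = weight_tied_forward(x, W, U, [np.ones_like(W)] * depth)
```

In this sketch the parameter count stays at that of a single layer (`W` and `U`), while the per-layer masks let each unrolled step compute a different function of the shared weights. How the masks are actually chosen, and how depth, width, and sparsity should be balanced, is what the paper's guidelines address.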


