Parameter-Efficient Masking Networks

10/13/2022
by Yue Bai, et al.

A deeper network structure generally handles more complicated non-linearity and performs more competitively. Nowadays, advanced network designs often contain a large number of repetitive structures (e.g., the Transformer). These designs raise network capacity to a new level but also inevitably increase model size, which is unfriendly to model storage and transfer. In this study, we are the first to investigate the representative potential of fixed random weights with limited unique values by learning diverse masks, and we introduce Parameter-Efficient Masking Networks (PEMN). This naturally leads to a new paradigm for model compression that diminishes model size. Concretely, motivated by the repetitive structures in modern neural networks, we utilize one randomly initialized layer, accompanied by different masks, to convey different feature mappings and represent repetitive network modules. The model can therefore be expressed as one layer plus a set of masks, which significantly reduces the storage cost. Furthermore, we enhance our strategy by learning masks for a model whose weights are filled by padding a given random weight vector. In this way, our method can further lower the space complexity, especially for models without many repetitive architectures. We validate the potential of PEMN by learning masks on random weights with limited unique values and test its effectiveness as a new compression paradigm across different network architectures. Code is available at https://github.com/yueb17/PEMN
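To make the core idea concrete, here is a minimal PyTorch sketch of one frozen random prototype layer shared by several "repetitive modules", each of which learns only its own binary mask. The names MaskedSharedLinear and fill_from_vector are ours, and the straight-through score binarization is just one common way to learn binary masks; the abstract does not specify the mask-learning procedure, so treat this as an illustration rather than the authors' implementation (see the linked repository for that).

```python
import math

import torch
import torch.nn as nn


class MaskedSharedLinear(nn.Module):
    """A linear layer that owns no weights of its own: it reuses one frozen,
    randomly initialized prototype tensor and learns only a mask over it.
    A minimal sketch of the PEMN idea, not the paper's implementation."""

    def __init__(self, shared_weight: torch.Tensor):
        super().__init__()
        self.weight = shared_weight  # frozen; the SAME tensor in every layer
        # Real-valued scores, binarized into a 0/1 mask in forward().
        self.scores = nn.Parameter(torch.randn_like(shared_weight) * 0.01)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        hard = (self.scores > 0).float()
        # Straight-through estimator: hard mask on the forward pass,
        # identity gradient to the scores on the backward pass.
        mask = hard + self.scores - self.scores.detach()
        return x @ (self.weight * mask).t()


def fill_from_vector(vec: torch.Tensor, shape: tuple) -> torch.Tensor:
    """Fill an arbitrary weight shape by repeating (padding) one small
    random vector, as in the enhanced strategy the abstract sketches."""
    n = math.prod(shape)
    reps = -(-n // vec.numel())  # ceiling division
    return vec.repeat(reps)[:n].reshape(shape)


# Four repetitive modules sharing one random prototype, each with its own mask;
# the prototype itself is padded out from a 256-element random vector.
proto = fill_from_vector(torch.randn(256), (64, 64))
blocks = nn.Sequential(*[MaskedSharedLinear(proto) for _ in range(4)])
out = blocks(torch.randn(8, 64))  # only the masks' scores are trainable
print(out.shape)  # torch.Size([8, 64])
```

Storing such a model requires only the small random vector (or its seed) plus one binary mask per module, which is where the compression comes from.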


research · 06/16/2022
PRANC: Pseudo RAndom Networks for Compacting deep models
Communication becomes a bottleneck in various distributed Machine Learni...

research · 01/16/2020
MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy?
Binary Neural Networks (BNNs) are neural networks which use binary weigh...

research · 05/24/2022
History Compression via Language Models in Reinforcement Learning
In a partially observable Markov decision process (POMDP), an agent typi...

research · 09/08/2021
What's Hidden in a One-layer Randomly Weighted Transformer?
We demonstrate that, hidden within one-layer randomly weighted neural ne...

research · 04/06/2022
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
We introduce LilNetX, an end-to-end trainable technique for neural netwo...

research · 02/27/2023
Permutation Equivariant Neural Functionals
This work studies the design of neural networks that can process the wei...

research · 03/28/2023
Randomly Initialized Subnetworks with Iterative Weight Recycling
The Multi-Prize Lottery Ticket Hypothesis posits that randomly initializ...
