Structured Multi-Hashing for Model Compression

11/25/2019
by Elad Eban, et al.

Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or in common server configurations in which multiple models are held in memory. Model compression methods address this limitation by reducing the memory footprint, latency, or energy consumption of a model with minimal impact on accuracy. We focus on the task of reducing the number of learnable variables in the model. In this work we combine ideas from weight hashing and dimensionality reduction, resulting in a simple and powerful structured multi-hashing method based on matrix products that allows direct control of the model size of any deep network and is trained end-to-end. We demonstrate the strength of our approach by compressing models from the ResNet, EfficientNet, and MobileNet architecture families. Our method allows us to drastically decrease the number of variables while maintaining high accuracy. For instance, by applying our approach to EfficientNet-B4 (16M parameters) we reduce it to the size of B0 (5M parameters) while gaining over 3% in accuracy; similarly, we reduce the ResNet32 model by 75%, and achieve 10x compression while still attaining above 90% accuracy.
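To make the core idea concrete, here is a minimal, hypothetical sketch of weight generation via a matrix product, in the spirit the abstract describes: all layer weights are viewed as blocks of one virtual weight matrix that is never stored, only materialized from two small trainable factors. The names, shapes, and block-slicing scheme below are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

# Hypothetical sketch: the network's weights live in a "virtual" matrix W
# of shape (m, n) that is never stored. Instead, W = U @ V for two small
# trainable factors, so the real parameter count is m*k + k*n rather than
# m*n, and k directly controls the compressed model size. Slicing blocks
# of W out per layer stands in for the structured mapping from shared
# parameters to individual layer weights.

rng = np.random.default_rng(0)

m, n = 1024, 1024        # virtual matrix covers ~1M weights
k = 32                   # budget knob: sets the compression ratio

U = rng.standard_normal((m, k)) * 0.01   # trainable factor 1
V = rng.standard_normal((k, n)) * 0.01   # trainable factor 2

def layer_weights(row0, col0, shape):
    """Materialize one layer's weight block from the factored virtual matrix."""
    r, c = shape
    return U[row0:row0 + r] @ V[:, col0:col0 + c]

w1 = layer_weights(0, 0, (256, 512))      # e.g. a 256x512 dense layer
w2 = layer_weights(256, 512, (128, 128))  # a second, disjoint block

full = m * n                # weights the network "sees"
actual = U.size + V.size    # parameters actually stored and trained
print(f"compression ratio: {full / actual:.1f}x")
```

Because every layer's weights are a differentiable function of `U` and `V`, the compressed parameterization can be trained end-to-end with ordinary backpropagation, which matches the abstract's claim of direct, trainable control over model size.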

Related research

- 05/20/2016: Functional Hashing for Compressing Neural Networks
  As the complexity of deep neural networks (DNNs) trend to grow to absorb...
- 10/21/2017: An efficient deep learning hashing neural network for mobile visual search
  Mobile visual search applications are emerging that enable users to sens...
- 07/28/2020: Model Size Reduction Using Frequency Based Double Hashing for Recommender Systems
  Deep Neural Networks (DNNs) with sparse input features have been widely ...
- 03/03/2022: Weightless Neural Networks for Efficient Edge Inference
  Weightless Neural Networks (WNNs) are a class of machine learning model ...
- 11/04/2019: Deep Compressed Pneumonia Detection for Low-Power Embedded Devices
  Deep neural networks (DNNs) have been expanded into medical fields and t...
- 11/10/2021: Self-Compression in Bayesian Neural Networks
  Machine learning models have achieved human-level performance on various...
- 07/21/2022: Efficient model compression with Random Operation Access Specific Tile (ROAST) hashing
  Advancements in deep learning are often associated with increasing model...
