Model Compression

05/20/2021 ∙ by Arhum Ishtiaq, et al.

Over time, machine learning models have grown in scope, functionality and size. Consequently, high-end hardware is now required both to train such models and to run inference with them. This paper explores approaches within the domain of model compression, discusses the efficiency of each approach, and compares model size and performance before and after compression.



Code Repositories

micronet

micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2-bit) methods (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2-bit) ternary and binary methods (TWN/BNN/XNOR-Net), plus post-training quantization (PTQ) to 8-bit (TensorRT); (2) pruning: normal, regular and group-convolution channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT.

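As a rough illustration of the post-training quantization idea this repository covers, the sketch below applies PyTorch's built-in dynamic quantization to a small network. The model definition is a hypothetical stand-in, not code taken from micronet.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model; micronet's own networks are not shown here.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()

# Post-training dynamic quantization: weights of the listed module types
# are stored as 8-bit integers; activations are quantized on the fly.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The quantized model is a drop-in replacement at inference time.
x = torch.randn(1, 784)
print(quantized(x).shape)  # torch.Size([1, 10])
```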

model_compression

Implementation of model compression with the knowledge distillation method.

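For context, a minimal sketch of the knowledge-distillation loss (Hinton et al.) is shown below, assuming PyTorch and randomly generated teacher/student logits; it is an illustration of the general technique, not code from this repository.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target KL term (scaled by T^2) blended with the usual cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example with random logits for a 10-class problem.
student = torch.randn(8, 10)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student, teacher, labels).item())
```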

Keras_model_compression

Model compression based on Geoffrey Hinton's logit regression (distillation) method in Keras, applied to MNIST: 16x compression with accuracy above 0.95. An implementation of "Distilling the Knowledge in a Neural Network" by Geoffrey Hinton et al.


model_compression

Deep learning model compression based on Keras.


ModelCompression

Weight- and filter-pruning-based model compression of the YOLOv2 object detection network.

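To illustrate the difference between weight (unstructured) and filter (structured) pruning in general, the sketch below uses PyTorch's torch.nn.utils.prune on a hypothetical convolutional layer; it is not the YOLOv2-specific code from the repository.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical convolutional layer standing in for a detector backbone layer.
conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)

# Weight pruning: zero out the 30% of individual weights with smallest magnitude.
prune.l1_unstructured(conv, name="weight", amount=0.3)

# Filter pruning: zero out 25% of whole output filters (dim=0) by L2 norm.
prune.ln_structured(conv, name="weight", amount=0.25, n=2, dim=0)

# Fold the pruning masks back into the weight tensor permanently.
prune.remove(conv, "weight")

# Fraction of weights that are now exactly zero.
sparsity = float((conv.weight == 0).sum()) / conv.weight.numel()
print(f"sparsity: {sparsity:.2f}")
```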