Compressing Models with Few Samples: Mimicking then Replacing

01/07/2022
by Huanyu Wang, et al.

Few-sample compression aims to compress a large, redundant model into a small, compact one using only a few samples. If we fine-tune models on these limited samples directly, they are prone to overfitting and learn almost nothing. Hence, previous methods optimize the compressed model layer by layer, trying to make every layer produce the same outputs as the corresponding layer in the teacher model, which is cumbersome. In this paper, we propose a new framework named Mimicking then Replacing (MiR) for few-sample compression. MiR first urges the pruned model to output the same features as the teacher's in the penultimate layer, and then replaces the teacher's layers before the penultimate one with a well-tuned compact counterpart. Unlike previous layer-wise reconstruction methods, MiR optimizes the entire network holistically, which is not only simple and effective, but also unsupervised and general. MiR outperforms previous methods by large margins. Code will be available soon.
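In essence, MiR is a two-step recipe: first tune the pruned student so that its penultimate-layer features match the teacher's on the few available (unlabeled) samples, then graft the teacher's classifier head onto the tuned compact backbone. The sketch below illustrates this procedure in PyTorch; the `forward_features` helper and the `.fc` head attribute are illustrative assumptions, not the authors' released code.

```python
# Minimal MiR-style sketch, assuming `teacher` and the pruned `student` expose
# forward_features() (features before the final fully connected layer) with
# matching output dimensions, and their classifier head as `.fc`. All names
# here are hypothetical placeholders for illustration.
import torch
import torch.nn as nn


def mimic_then_replace(teacher, student, loader, epochs=10, lr=0.01):
    """Mimicking: match penultimate features. Replacing: reuse teacher's head."""
    teacher.eval()
    optimizer = torch.optim.SGD(student.parameters(), lr=lr, momentum=0.9)
    mse = nn.MSELoss()

    for _ in range(epochs):
        for images, _ in loader:  # labels unused: the mimicking step is unsupervised
            with torch.no_grad():
                t_feat = teacher.forward_features(images)
            s_feat = student.forward_features(images)
            loss = mse(s_feat, t_feat)  # holistic, whole-network objective
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # Replacing: the teacher's layers before the penultimate one are swapped
    # for the tuned compact backbone; the teacher's classifier head is kept.
    student.fc = teacher.fc
    return student
```

Because the objective is a feature-matching loss on unlabeled inputs, no ground-truth labels are needed, which is what makes the method unsupervised.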



Related research

research · 12/05/2018
Knowledge Distillation from Few Samples
Current knowledge distillation methods require full training data to dis...

research · 10/04/2022
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Layer-wise distillation is a powerful tool to compress large models (i.e...

research · 08/19/2020
Data-Independent Structured Pruning of Neural Networks via Coresets
Model compression is crucial for deployment of neural networks on device...

research · 02/16/2022
Practical Network Acceleration with Tiny Sets
Network compression is effective in accelerating the inference of deep n...

research · 07/11/2022
UM4: Unified Multilingual Multiple Teacher-Student Model for Zero-Resource Neural Machine Translation
Most translation tasks among languages belong to the zero-resource trans...

research · 10/29/2021
Model Fusion of Heterogeneous Neural Networks via Cross-Layer Alignment
Layer-wise model fusion via optimal transport, named OTFusion, applies s...

research · 03/02/2023
Practical Network Acceleration with Tiny Sets: Hypothesis, Theory, and Algorithm
Due to data privacy issues, accelerating networks with tiny training set...
