Only Train Once: A One-Shot Neural Network Training And Pruning Framework

07/15/2021
by   Tianyi Chen, et al.

Structured pruning is a commonly used technique for deploying deep neural networks (DNNs) on resource-constrained devices. However, existing pruning methods are usually heuristic, task-specific, and require an extra fine-tuning procedure. To overcome these limitations, we propose Only-Train-Once (OTO), a framework that compresses full DNNs into slimmer architectures with competitive performance and significant FLOPs reduction. OTO has two key components: (i) we partition the parameters of DNNs into zero-invariant groups, enabling us to prune zero groups without affecting the network output; and (ii) to promote zero groups, we formulate a structured-sparsity optimization problem and propose a novel optimization algorithm, Half-Space Stochastic Projected Gradient (HSPG), to solve it; HSPG outperforms standard proximal methods in group-sparsity exploration while maintaining comparable convergence. To demonstrate the effectiveness of OTO, we train and compress full models simultaneously from scratch, without fine-tuning, for inference speedup and parameter reduction, and achieve state-of-the-art results on VGG16 for CIFAR10, ResNet50 for CIFAR10/ImageNet, and BERT for SQuAD.
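The zero-invariant-group idea can be illustrated with a toy two-layer MLP. The sketch below is not the authors' implementation; it only demonstrates the property the abstract describes: if all parameters in a hidden neuron's group (its incoming weights, its bias, and its outgoing weights) are zero, the neuron can be deleted without changing the network output.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy two-layer MLP: y = W2 @ relu(W1 @ x + b1) + b2
W1 = rng.normal(size=(6, 4))
b1 = rng.normal(size=6)
W2 = rng.normal(size=(3, 6))
b2 = rng.normal(size=3)
relu = lambda z: np.maximum(z, 0.0)

# Zero-invariant group for hidden neuron i: (W1[i, :], b1[i], W2[:, i]).
# Suppose training with the group-sparsity objective drove this group to zero.
i = 2
W1[i, :] = 0.0
b1[i] = 0.0
W2[:, i] = 0.0

x = rng.normal(size=4)
y_full = W2 @ relu(W1 @ x + b1) + b2

# Structurally prune the neuron: delete row i of W1, entry i of b1,
# and column i of W2. The slimmer network computes the same output.
W1_s = np.delete(W1, i, axis=0)
b1_s = np.delete(b1, i)
W2_s = np.delete(W2, i, axis=1)
y_pruned = W2_s @ relu(W1_s @ x + b1_s) + b2

assert np.allclose(y_full, y_pruned)  # output is unaffected by pruning
```

This is why zero groups can be removed with no fine-tuning step: the pruned architecture is exactly equivalent to the trained one on every input, not an approximation of it.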


Related research

- Feature Flow Regularization: Improving Structured Sparsity in Deep Neural Networks (06/05/2021)
- OTOV2: Automatic, Generic, User-Friendly (03/13/2023)
- Single-shot Channel Pruning Based on Alternating Direction Method of Multipliers (02/18/2019)
- Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM) (10/01/2021)
- Neural Network Pruning with Residual-Connections and Limited-Data (11/19/2019)
- SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency (01/23/2020)
