Efficient DNN Training with Knowledge-Guided Layer Freezing

01/17/2022
by   Yiding Wang, et al.

Training deep neural networks (DNNs) is time-consuming. While most existing solutions try to overlap or schedule computation and communication for efficient training, this paper goes one step further: it skips computation and communication entirely through DNN layer freezing. Our key insight is that the training progress of internal DNN layers differs significantly, and front layers often become well-trained much earlier than deep layers. To exploit this, we first introduce the notion of training plasticity to quantify the training progress of internal DNN layers. We then design KGT, a knowledge-guided DNN training system that employs semantic knowledge from a reference model to accurately evaluate individual layers' training plasticity and safely freeze the converged ones, saving their corresponding backward computation and communication. The reference model is generated on the fly using quantization techniques and runs forward operations asynchronously on available CPUs to minimize overhead. In addition, KGT caches the intermediate outputs of the frozen layers, with prefetching, to further skip the forward computation. Our implementation and testbed experiments with popular vision and language models show that KGT achieves a 19%-43% training speedup without sacrificing accuracy.
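The freezing decision described in the abstract (score each layer's training plasticity against a reference model, then freeze the stable front layers) can be illustrated with a minimal sketch. The function names, the plasticity metric (relative change of a layer's representation between two evaluation points), and the threshold are hypothetical illustrations, not the paper's actual algorithm:

```python
# Illustrative sketch, not KGT's implementation: a per-layer plasticity
# score and a front-to-back freezing rule. All names and the threshold
# value are assumptions for the example.

def plasticity(prev_repr, curr_repr):
    """Relative change of a layer's representation between two
    evaluation points; a small value suggests the layer has stabilized."""
    diff = sum((a - b) ** 2 for a, b in zip(prev_repr, curr_repr)) ** 0.5
    norm = sum(a ** 2 for a in prev_repr) ** 0.5 or 1.0
    return diff / norm

def layers_to_freeze(prev_reprs, curr_reprs, threshold=0.01):
    """Return how many front layers to freeze. Freezing proceeds
    front-to-back: once a still-plastic layer is hit, deeper layers
    stay trainable, so cached outputs of the frozen prefix remain
    valid inputs for the layers behind it."""
    frozen = 0
    for prev, curr in zip(prev_reprs, curr_reprs):
        if plasticity(prev, curr) < threshold:
            frozen += 1
        else:
            break  # a still-changing layer blocks freezing deeper ones
    return frozen
```

For example, if the first two layers' representations are essentially unchanged between evaluations while the third still moves, `layers_to_freeze` returns 2; their backward passes (and, with output caching, their forward passes) could then be skipped.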

Related research

06/08/2018 · PipeDream: Fast and Efficient Pipeline Parallel DNN Training
PipeDream is a Deep Neural Network (DNN) training system for GPUs that pa...

11/05/2021 · Visualizing the Emergence of Intermediate Visual Patterns in DNNs
This paper proposes a method to visualize the discrimination power of in...

11/27/2020 · Progressively Stacking 2.0: A Multi-stage Layerwise Training Method for BERT Training Speedup
Pre-trained language models, such as BERT, have achieved significant acc...

11/27/2019 · Optimal checkpointing for heterogeneous chains: how to train deep neural networks with limited memory
This paper introduces a new activation checkpointing method which allows...

03/03/2021 · Self-Checking Deep Neural Networks in Deployment
The widespread adoption of Deep Neural Networks (DNNs) in important doma...

08/09/2023 · SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference
Deep neural networks (DNNs) demonstrate outstanding performance across m...
