Distillation Guided Residual Learning for Binary Convolutional Neural Networks

07/10/2020
by Jianming Ye et al.

It is challenging to bridge the performance gap between Binary CNN (BCNN) and Floating-point CNN (FCNN). We observe that this performance gap leads to substantial residuals between the intermediate feature maps of BCNN and FCNN. To minimize the performance gap, we enforce BCNN to produce intermediate feature maps similar to those of FCNN. This training strategy, i.e., optimizing each binary convolutional block with a block-wise distillation loss derived from FCNN, leads to a more effective optimization of BCNN. It also motivates us to update the binary convolutional block architecture to facilitate the optimization of the block-wise distillation loss. Specifically, a lightweight shortcut branch is inserted into each binary convolutional block to complement the residuals at that block. Thanks to its Squeeze-and-Interaction (SI) structure, this shortcut branch introduces only a small fraction of additional parameters, e.g., a 10% overhead, yet effectively complements the residuals. Extensive experiments on ImageNet demonstrate the superior classification efficiency and accuracy of our method; e.g., a BCNN trained with our method achieves 60.45% accuracy on ImageNet.
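The block-wise distillation idea can be stated concretely: at each convolutional block, the binary student is penalized for deviating from the corresponding full-precision teacher feature map, and a lightweight shortcut branch helps it close that residual. Below is a minimal PyTorch sketch of both pieces. The names (SIShortcut, blockwise_distillation_loss) are ours, the Squeeze-and-Interaction branch is approximated by a channel-squeeze 1x1 convolution, a cheap 3x3 interaction convolution, and a 1x1 expansion, and the per-block loss is taken as an MSE between feature maps; all of these are assumptions for illustration, not the paper's exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SIShortcut(nn.Module):
    # Hypothetical stand-in for the paper's Squeeze-and-Interaction branch:
    # squeeze channels with a 1x1 conv, mix the squeezed features with a
    # cheap 3x3 conv, then expand back to the block's channel count.
    def __init__(self, channels, reduction=8):
        super().__init__()
        squeezed = max(channels // reduction, 1)
        self.squeeze = nn.Conv2d(channels, squeezed, 1, bias=False)
        self.interact = nn.Conv2d(squeezed, squeezed, 3, padding=1, bias=False)
        self.expand = nn.Conv2d(squeezed, channels, 1, bias=False)

    def forward(self, x):
        return self.expand(F.relu(self.interact(self.squeeze(x))))

def blockwise_distillation_loss(bcnn_feats, fcnn_feats):
    # Sum of per-block MSE losses; the FCNN teacher features are detached
    # so gradients only flow into the binary student.
    return sum(F.mse_loss(b, f.detach()) for b, f in zip(bcnn_feats, fcnn_feats))

# Toy usage with one block's feature maps from student and teacher.
student_feat = torch.randn(2, 64, 56, 56, requires_grad=True)
teacher_feat = torch.randn(2, 64, 56, 56)
shortcut = SIShortcut(64)
refined = student_feat + shortcut(student_feat)  # shortcut complements the residual
loss = blockwise_distillation_loss([refined], [teacher_feat])
loss.backward()

In the actual method the binary blocks are of course trained jointly with the classification loss; the sketch isolates only the distillation term.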

Related research

02/21/2020 · Residual Knowledge Distillation
Knowledge distillation (KD) is one of the most potent ways for model com...

01/16/2020 · MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy?
Binary Neural Networks (BNNs) are neural networks which use binary weigh...

03/23/2023 · A Simple and Generic Framework for Feature Distillation via Channel-wise Transformation
Knowledge distillation is a popular technique for transferring the knowl...

08/21/2019 · RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs
Binarized convolutional neural networks (BCNNs) are widely used to impro...

02/16/2023 · URCDC-Depth: Uncertainty Rectified Cross-Distillation with CutFlip for Monocular Depth Estimation
This work aims to estimate a high-quality depth map from a single RGB im...

04/01/2021 · Embedded Self-Distillation in Compact Multi-Branch Ensemble Network for Remote Sensing Scene Classification
Remote sensing (RS) image scene classification task faces many challenge...

09/02/2019 · Dynamic Approach for Lane Detection using Google Street View and CNN
Lane detection algorithms have been the key enablers for a fully-assisti...
