Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification

09/09/2017
by Chong Wang et al.

Knowledge distillation is a potential solution for model compression. The idea is to make a small student network imitate the target of a large teacher network, so that the student can become competitive with the teacher. Most previous studies focus on model distillation for the classification task, proposing different architectures and initializations for the student network. However, classification alone is not enough, and related tasks such as regression and retrieval are barely considered. To address this problem, we take face recognition as a case study and propose model distillation with knowledge transfer from face classification to alignment and verification. By selecting appropriate initializations and targets for the knowledge transfer, distillation becomes easier in non-classification tasks. Experiments on the CelebA and CASIA-WebFace datasets demonstrate that the student network can be competitive with the teacher network in alignment and verification, and can even surpass the teacher under specific compression rates. In addition, to achieve stronger knowledge transfer, we use a common initialization trick to improve the distillation performance of classification. Evaluations on the CASIA-WebFace and large-scale MS-Celeb-1M datasets show the effectiveness of this simple trick.
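
The core idea in the abstract, making a small student network imitate a large teacher network's targets, is commonly implemented with a soft-target distillation loss. The sketch below shows the standard Hinton-style formulation (temperature-softened KL divergence plus hard-label cross-entropy) only as an illustration of that idea; the function name, temperature T, and weight alpha are illustrative assumptions, not the configuration used in this paper.

```python
# Minimal sketch of a standard knowledge-distillation loss (assumed, illustrative;
# not the paper's exact objective).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Weighted sum of a soft-target term (imitate the teacher's softened
    class distribution) and a hard-label cross-entropy term."""
    # Soft targets: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

For the non-classification tasks discussed in the abstract, the imitation target would presumably be the teacher's regressed landmark coordinates (alignment) or embedding features (verification) rather than class logits, with an appropriate regression or metric loss in place of the KL term.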


Related research

CoupleFace: Relation Matters for Face Recognition Distillation (04/12/2022)
Knowledge distillation is an effective method to improve the performance...

Cross Architecture Distillation for Face Recognition (06/26/2023)
Transformers have emerged as the superior choice for face recognition ta...

Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation (05/16/2023)
It has been commonly observed that a teacher model with superior perform...

Prototype-guided Cross-task Knowledge Distillation for Large-scale Models (12/26/2022)
Recently, large-scale pre-trained models have shown their advantages in ...

Distill and De-bias: Mitigating Bias in Face Recognition using Knowledge Distillation (12/17/2021)
Face recognition networks generally demonstrate bias with respect to sen...

Teacher-Student Architecture for Knowledge Distillation: A Survey (08/08/2023)
Although deep neural networks (DNNs) have shown a strong capacity to sol...

Feature Adversarial Distillation for Point Cloud Classification (06/25/2023)
Due to the point cloud's irregular and unordered geometry structure, con...
