Partial FC: Training 10 Million Identities on a Single Machine

by Xiang An, et al.

Face recognition has long been an active and vital topic in the computer vision community. Previous research has mainly focused on the loss functions used for the facial feature extraction network, among which improvements to softmax-based loss functions have greatly advanced face recognition performance. However, the conflict between the drastically increasing number of face identities and the shortage of GPU memory is gradually becoming irreconcilable. In this paper, we thoroughly analyze the optimization goal of softmax-based loss functions and the difficulty of training massive identities. We find that the negative classes in the softmax function are less important for face representation learning than previously thought. Experiments demonstrate no loss of accuracy when training with only 10% randomly sampled classes for softmax-based loss functions, compared with training with full classes using state-of-the-art models on mainstream benchmarks. We also implement a very efficient distributed sampling algorithm that balances model accuracy and training efficiency and uses only eight NVIDIA RTX2080Ti GPUs to complete classification tasks with tens of millions of identities. The code of this paper has been made available.
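
The idea of computing the softmax over a random 10% of the classes can be illustrated with a small sketch. This is not the authors' implementation: the function names `sample_class_subset` and `sampled_softmax_loss` are hypothetical, the logits are random numbers rather than network outputs, and the sketch assumes that classes appearing as labels in the batch are always kept while the remaining budget is filled with uniformly sampled negative classes. The distributed part of the algorithm is not shown.

```python
import math
import random


def sample_class_subset(num_classes, batch_labels, sample_rate, rng):
    """Pick the classes that take part in the softmax for one batch.

    Assumption (not stated in the abstract): classes that appear as labels
    in the batch are always kept, and the rest of the budget is filled with
    uniformly sampled negative classes.
    """
    positives = sorted(set(batch_labels))
    budget = max(int(num_classes * sample_rate), len(positives))
    positive_set = set(positives)
    negatives = [c for c in range(num_classes) if c not in positive_set]
    sampled = rng.sample(negatives, budget - len(positives))
    return positives + sorted(sampled)


def sampled_softmax_loss(logits_row, label, kept_classes):
    """Cross-entropy for one sample, computed over the kept classes only."""
    kept_logits = [logits_row[c] for c in kept_classes]
    max_logit = max(kept_logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(
        sum(math.exp(z - max_logit) for z in kept_logits)
    )
    return log_sum - logits_row[label]


rng = random.Random(0)
num_classes = 1000
batch_labels = [3, 42, 42, 917]

# Keep 10% of the classes for this batch (100 out of 1000).
kept = sample_class_subset(num_classes, batch_labels, sample_rate=0.1, rng=rng)

# Stand-in logits; a real model would produce these from face embeddings.
logits = [[rng.gauss(0.0, 1.0) for _ in range(num_classes)] for _ in batch_labels]
losses = [sampled_softmax_loss(row, y, kept) for row, y in zip(logits, batch_labels)]
```

In this sketch only the logits of the kept classes enter the loss, so the work per batch scales with the sample rate rather than with the total number of identities.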



Git Loss for Deep Face Recognition

Convolutional Neural Networks (CNNs) have been widely used in computer v...

Prototype Memory for Large-scale Face Representation Learning

Face representation learning using datasets with massive number of ident...

Minimum Margin Loss for Deep Face Recognition

Face recognition has achieved great progress owing to the fast developme...

Loss Functions for Top-k Error: Analysis and Insights

In order to push the performance on realistic computer vision tasks, the...

Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC

Learning discriminative deep feature embeddings by using million-scale i...

Analysis and Optimization of Loss Functions for Multiclass, Top-k, and Multilabel Classification

Top-k error is currently a popular performance measure on large scale im...

Face-NMS: A Core-set Selection Approach for Efficient Face Recognition

Recently, face recognition in the wild has achieved remarkable success a...

Code Repositories


A PyTorch implementation of the 'FaceNet' paper for training a facial recognition model with Triplet Loss using the glint360k dataset. A pre-trained model using Triplet Loss is available for download.
