A framework for improving the performance of verification algorithms with a low false positive rate requirement and limited training data
In this paper we address the problem of matching patterns in the so-called verification setting in which a novel, query pattern is verified against a single training pattern: the decision sought is whether the two match (i.e. belong to the same class) or not. Unlike previous work which has universally focused on the development of more discriminative distance functions between patterns, here we consider the equally important and pervasive task of selecting a distance threshold which fits a particular operational requirement - specifically, the target false positive rate (FPR). First, we argue on theoretical grounds that a data-driven approach is inherently ill-conditioned when the desired FPR is low, because by the very nature of the challenge only a small portion of training data affects or is affected by the desired threshold. This leads us to propose a general, statistical model-based method instead. Our approach is based on the interpretation of an inter-pattern distance as implicitly defining a pattern embedding which approximately distributes patterns according to an isotropic multi-variate normal distribution in some space. This interpretation is then used to show that the distribution of training inter-pattern distances is the non-central chi2 distribution, differently parameterized for each class. Thus, to make the class-specific threshold choice we propose a novel analysis-by-synthesis iterative algorithm which estimates the three free parameters of the model (for each class) using task-specific constraints. The validity of the premises of our work and the effectiveness of the proposed method are demonstrated by applying the method to the task of set-based face verification on a large database of pseudo-random head motion videos.
READ FULL TEXT