AB/BA analysis: A framework for estimating keyword spotting recall improvement while maintaining audio privacy

04/18/2022
by Raphael Petegrosso, et al.

Evaluating keyword spotting (KWS) systems, which detect keywords in speech, is challenging under realistic privacy constraints. A KWS system is designed to collect data only when it detects the keyword, which limits the availability of hard samples that may contain false negatives and prevents direct estimation of model recall from production data. Complementary data collected from other sources, meanwhile, may not be fully representative of the real application. In this work, we propose an evaluation technique that we call AB/BA analysis. Our framework evaluates a candidate KWS model B against a baseline model A, using cross-dataset offline decoding to estimate relative recall without requiring negative examples. Moreover, we propose a formulation whose assumptions allow the relative false positive rate between the models to be estimated with low variance even when the number of false positives is small. Finally, we propose leveraging machine-generated soft labels, in a technique we call Semi-Supervised AB/BA analysis, which reduces analysis time and cost and improves privacy. Experiments with both simulated and real data show that AB/BA analysis successfully measures recall improvement together with the accompanying trade-off in relative false positive rate.
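To make the idea of cross-dataset offline decoding concrete, here is a minimal sketch of one way a relative recall could be computed. It is an illustration under stated assumptions, not the formulation from the paper. It assumes two sets of confirmed true-positive utterances, one collected in production by model A and one by model B, each drawn from the same underlying keyword distribution. Since P(B accepts | A accepts) / P(A accepts | B accepts) = P(B accepts) / P(A accepts) by Bayes' rule, the ratio of the two cross-acceptance rates estimates recall_B / recall_A. The function name and its inputs are hypothetical.

```python
def relative_recall(b_accepts_on_a_data, a_accepts_on_b_data):
    """Estimate recall_B / recall_A from cross-dataset decoding (illustrative).

    b_accepts_on_a_data: booleans, one per true-positive utterance collected
        by model A, True if model B also accepts it offline.
    a_accepts_on_b_data: booleans, one per true-positive utterance collected
        by model B, True if model A also accepts it offline.

    Assumption (not from the source): both datasets contain only genuine
    keyword utterances sampled from the same distribution.
    """
    if not b_accepts_on_a_data or not a_accepts_on_b_data:
        raise ValueError("both datasets must be non-empty")

    # Empirical P(B accepts | A accepts) on A-collected data.
    p_b_given_a = sum(b_accepts_on_a_data) / len(b_accepts_on_a_data)
    # Empirical P(A accepts | B accepts) on B-collected data.
    p_a_given_b = sum(a_accepts_on_b_data) / len(a_accepts_on_b_data)

    if p_a_given_b == 0:
        raise ValueError("model A accepts none of B's data; ratio undefined")

    # Bayes: P(B|A) / P(A|B) = P(B) / P(A) = recall_B / recall_A.
    return p_b_given_a / p_a_given_b


# Hypothetical numbers: B accepts 95 of 100 A-collected utterances,
# while A accepts only 80 of 100 B-collected utterances.
ratio = relative_recall([True] * 95 + [False] * 5, [True] * 80 + [False] * 20)
print(f"estimated recall_B / recall_A = {ratio:.4f}")
```

A ratio above 1 would suggest that the candidate recovers keyword utterances the baseline misses. Note that neither dataset needs negative examples or utterances rejected by both models, which is the property the abstract highlights.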


