End-to-end speech recognition modeling from de-identified data

07/12/2022
by Martin Flechl, et al.

De-identification of data used for automatic speech recognition modeling is a critical component in protecting privacy, especially in the medical domain. However, simply removing all personally identifiable information (PII) from end-to-end model training data leads to a significant performance degradation, in particular for the recognition of names, dates, locations, and words from similar categories. We propose and evaluate a two-step method for partially recovering this loss. First, PII is identified, and each occurrence is replaced with a random word sequence of the same category. Then, corresponding audio is produced via text-to-speech or by splicing together matching audio fragments extracted from the corpus. These artificial audio/label pairs, together with speaker turns from the original data without PII, are used to train models. We evaluate the performance of this method on in-house data of medical conversations and observe a recovery of almost the entire performance degradation in the general word error rate while still maintaining a strong diarization performance. Our main focus is the improvement of recall and precision in the recognition of PII-related words. Depending on the PII category, between 50% and 90% of the performance degradation can be recovered using our proposed method.
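The text side of the two-step method can be illustrated with a minimal sketch. It is not the authors' implementation: the category inventory `SUBSTITUTES`, the span format `(start, end, category)`, and the function names `replace_pii` and `build_training_set` are all assumptions made for illustration, and PII spans are assumed to have been identified already. The audio step (text-to-speech or splicing of corpus fragments) is only represented by a flag that marks which turns would need artificial audio.

```python
import random

# Hypothetical substitute inventories, one list per PII category.
# A real system would draw from far larger lists.
SUBSTITUTES = {
    "NAME": ["Maria Lopez", "John Smith", "Priya Patel"],
    "DATE": ["March third", "the twelfth of June"],
    "LOCATION": ["Springfield", "Lake Street"],
}


def replace_pii(tokens, spans, rng):
    """Replace each PII span with a random word sequence of the same category.

    tokens: list of words in one speaker turn.
    spans:  non-overlapping (start, end, category) tuples, end exclusive.
    Returns the new token list and the spans of the inserted substitutes.
    """
    out, new_spans, cursor = [], [], 0
    for start, end, category in sorted(spans):
        out.extend(tokens[cursor:start])          # copy text before the span
        replacement = rng.choice(SUBSTITUTES[category]).split()
        new_spans.append((len(out), len(out) + len(replacement), category))
        out.extend(replacement)                   # substitute may differ in length
        cursor = end
    out.extend(tokens[cursor:])                   # copy text after the last span
    return out, new_spans


def build_training_set(turns, rng):
    """Keep PII-free turns as they are; rewrite turns containing PII.

    Each turn is a dict with "tokens" and "spans". Rewritten turns are
    flagged "synthetic" because their audio would have to be produced
    artificially (for example by TTS or splicing, not shown here).
    """
    examples = []
    for turn in turns:
        if not turn["spans"]:
            examples.append({"tokens": list(turn["tokens"]), "audio": "original"})
        else:
            new_tokens, _ = replace_pii(turn["tokens"], turn["spans"], rng)
            examples.append({"tokens": new_tokens, "audio": "synthetic"})
    return examples
```

Passing a seeded `random.Random` instance as `rng` keeps the substitution reproducible, which can help when comparing models trained on different de-identified variants of the same corpus.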

research
11/23/2020

Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems

Today, many state-of-the-art automatic speech recognition (ASR) systems ...
research
02/19/2019

A spelling correction model for end-to-end speech recognition

Attention-based sequence-to-sequence models for speech recognition joint...
research
10/06/2022

Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition

Automatic speech recognition models are often adapted to improve their a...
research
02/17/2022

Curriculum optimization for low-resource speech recognition

Modern end-to-end speech recognition models show astonishing results in ...
research
06/21/2019

Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models

Contextual automatic speech recognition, i.e., biasing recognition towar...
research
12/14/2019

Personalization of End-to-end Speech Recognition On Mobile Devices For Named Entities

We study the effectiveness of several techniques to personalize end-to-e...
research
08/14/2023

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Accurate recognition of specific categories, such as persons' names, dat...
