Risk of re-identification for shared clinical speech recordings

10/18/2022
by Daniela A. Wiepert, et al.

Large, curated datasets are required to leverage speech-based tools in healthcare. These datasets are costly to produce, which has increased interest in data sharing. Because speech can potentially identify speakers (i.e., it can act as a voiceprint), sharing recordings raises privacy concerns. We examine the re-identification risk for speech recordings, without reference to demographics or other metadata, using a state-of-the-art speaker recognition system. We demonstrate that the risk is inversely related to the number of comparisons an adversary must consider, i.e., the search space. Risk is high for a small search space but drops as the search space grows (precision >0.85 for <1*10^6 comparisons, precision <0.5 for >3*10^6 comparisons). Next, we show that the nature of a speech recording influences re-identification risk, with non-connected speech (e.g., vowel prolongation) being harder to identify. Our findings suggest that speaker recognition systems can be used to re-identify participants in specific circumstances, but in practice, the re-identification risk appears low.
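One way to see why precision falls as the search space grows is a simple expected-value sketch. It assumes a fixed per-comparison false accept rate, so the number of false matches scales with the number of comparisons while the number of true matches stays bounded. The function name, the rates, and the counts below are hypothetical illustrations, not values or methods from the paper.

```python
def expected_precision(n_comparisons, n_true_matches,
                       true_accept_rate, false_accept_rate):
    """Expected precision of an adversary's matches over a search space.

    All parameters are illustrative assumptions:
    n_comparisons     -- total speaker pairs the adversary must score
    n_true_matches    -- pairs that really are the same speaker
    true_accept_rate  -- chance a same-speaker pair is flagged as a match
    false_accept_rate -- chance a different-speaker pair is flagged as a match
    """
    if n_true_matches > n_comparisons:
        raise ValueError("n_true_matches cannot exceed n_comparisons")
    expected_tp = n_true_matches * true_accept_rate
    expected_fp = (n_comparisons - n_true_matches) * false_accept_rate
    flagged = expected_tp + expected_fp
    # With nothing flagged, precision is undefined; report 0.0 by convention.
    return expected_tp / flagged if flagged > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical rates; only the downward trend is the point here.
    for n in (10**4, 10**5, 10**6, 10**7):
        p = expected_precision(n, n_true_matches=100,
                               true_accept_rate=0.9,
                               false_accept_rate=1e-5)
        print(f"{n:>10,d} comparisons -> expected precision {p:.3f}")
```

Under these assumptions the expected number of true positives is fixed, so every additional non-matching comparison can only add false positives, and precision decreases monotonically with the size of the search space.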

research
12/25/2017

Leveraging Native Language Speech for Accent Identification using Deep Siamese Networks

The problem of automatic accent identification is important for several ...
research
04/05/2021

Streaming Multi-talker Speech Recognition with Joint Speaker Identification

In multi-talker scenarios such as meetings and conversations, speech pro...
research
06/22/2018

Weakly Supervised Training of Speaker Identification Models

We propose an approach for training speaker identification models in a w...
research
11/11/2020

Low-resource expressive text-to-speech using data augmentation

While recent neural text-to-speech (TTS) systems perform remarkably well...
research
01/09/2022

Emotional Speaker Identification using a Novel Capsule Nets Model

Speaker recognition systems are widely used in various applications to i...
research
05/30/2023

Investigating model performance in language identification: beyond simple error statistics

Language development experts need tools that can automatically identify ...
research
04/02/2020

Improving auditory attention decoding performance of linear and non-linear methods using state-space model

Identifying the target speaker in hearing aid applications is crucial to...
