Audio De-identification: A New Entity Recognition Task

03/17/2019
by Ido Cohn, et al.

Named Entity Recognition (NER) has been mostly studied in the context of written text. Specifically, NER is an important step in de-identification (de-ID) of medical records, many of which are recorded conversations between a patient and a doctor. In such recordings, audio spans with personal information should be redacted, similar to the redaction of sensitive character spans in de-ID for written text. The application of NER in the context of audio de-identification has yet to be fully investigated. To this end, we define the task of audio de-ID, in which audio spans with entity mentions should be detected. We then present our pipeline for this task, which involves Automatic Speech Recognition (ASR), NER on the transcript text, and text-to-audio alignment. Finally, we introduce a novel metric for audio de-ID and a new evaluation benchmark consisting of a large labeled segment of the Switchboard and Fisher audio datasets and detail our pipeline's results on it.
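
The last stage of the pipeline described above, mapping entity spans found in the transcript back to audio spans, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the ASR system returns word-level timestamps and that the NER model returns character spans over a transcript built by joining the words with single spaces. The names `Word`, `transcript_with_offsets`, and `char_spans_to_audio_spans`, along with the sample timings, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One ASR output word with assumed start/end times in seconds."""
    text: str
    start: float
    end: float


def transcript_with_offsets(words: List[Word]) -> Tuple[str, List[Tuple[int, int]]]:
    """Join words with single spaces and record each word's character span."""
    offsets = []
    pos = 0
    for w in words:
        offsets.append((pos, pos + len(w.text)))
        pos += len(w.text) + 1  # +1 for the separating space
    return " ".join(w.text for w in words), offsets


def char_spans_to_audio_spans(
    words: List[Word], char_spans: List[Tuple[int, int]]
) -> List[Tuple[float, float]]:
    """Map NER character spans (end-exclusive) to audio time spans.

    A word belongs to an entity if its character range overlaps the span.
    The audio span runs from the first such word's start to the last one's end.
    """
    _, offsets = transcript_with_offsets(words)
    audio_spans = []
    for c_start, c_end in char_spans:
        hit = [
            w for w, (o_start, o_end) in zip(words, offsets)
            if o_start < c_end and c_start < o_end
        ]
        if hit:
            audio_spans.append((hit[0].start, hit[-1].end))
    return audio_spans


# Hypothetical ASR output; the timings are made up for illustration.
words = [
    Word("my", 0.0, 0.2),
    Word("name", 0.2, 0.5),
    Word("is", 0.5, 0.6),
    Word("john", 0.6, 1.0),
    Word("smith", 1.0, 1.5),
]
transcript, _ = transcript_with_offsets(words)

# Stand-in for a NER model: the character span covering "john smith".
entity_spans = [(transcript.index("john"), len(transcript))]
redact = char_spans_to_audio_spans(words, entity_spans)
```

Here `redact` holds the time intervals that would be silenced or bleeped in the recording. In a real system the ASR and NER stages make errors, so the resulting audio spans may miss part of an entity or cover extra speech, which is the kind of discrepancy an audio-level evaluation metric needs to account for.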

research
04/26/2022

Named Entity Recognition for Audio De-Identification

Data anonymization is often a task carried out by humans. Automating it ...
research
05/22/2020

End-to-end Named Entity Recognition from English Speech

Named entity recognition (NER) from text has been a widely studied probl...
research
03/16/2023

Trustera: A Live Conversation Redaction System

Trustera, the first functional system that redacts personally identifiab...
research
12/30/2009

Writer Identification Using Inexpensive Signal Processing Techniques

We propose to use novel and classical audio and text signal-processing a...
research
06/14/2023

Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation

Recently, end-to-end (E2E) automatic speech recognition (ASR) models hav...
research
02/07/2017

Fast and Accurate Entity Recognition with Iterated Dilated Convolutions

Today when many practitioners run basic NLP on the entire web and large-...
