Understanding Medical Conversations: Rich Transcription, Confidence Scores & Information Extraction

04/06/2021
by Hagen Soltau, et al.

In this paper, we describe novel components for extracting clinically relevant information from medical conversations, which will be available as Google APIs. We describe a transformer-based Recurrent Neural Network Transducer (RNN-T) model tailored for long-form audio, which can produce rich transcriptions including speaker segmentation, speaker role labeling, punctuation, and capitalization. On a representative test set, we compare the performance of RNN-T models with different encoders, units, and streaming constraints. Our transformer-based streaming model performs at about 20 [...] the ASR task, 6 [...] commas, 43 [...] is paired with a confidence model that utilizes both acoustic and lexical features from the recognizer. The confidence model performs at about 0.37 NCE. Finally, we describe an RNN-T based tagging model. The performance of the tagging model depends on the ontologies, with F-scores of 0.90 for medications, 0.76 for symptoms, 0.75 for conditions, 0.76 for diagnosis, and 0.61 for treatments. While there is still room for improvement, our results suggest that these models are sufficiently accurate for practical applications.
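
The abstract reports confidence quality as NCE (normalized cross entropy). As a minimal sketch of how such a figure is commonly computed, assuming the standard NIST-style definition (the abstract does not spell out the exact formula used), the hypothetical function below compares the entropy of per-word confidence scores against a baseline that assigns every word the average correctness rate. The function name, argument names, and the clamping constant `eps` are illustrative assumptions, not taken from the paper.

```python
import math


def normalized_cross_entropy(confidences, correct, eps=1e-12):
    """NCE = (H_max - H_conf) / H_max for per-word confidence scores.

    confidences: list of floats in [0, 1], one per recognized word.
    correct:     list of bools, True where the recognized word is correct.
    eps:         clamp (an assumption of this sketch) to avoid log(0).
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(correct)
    n_correct = sum(1 for ok in correct if ok)
    if n_correct == 0 or n_correct == n:
        # The baseline entropy is zero here, so the ratio is undefined.
        raise ValueError("NCE is undefined when all words share one label")

    # Baseline: every word gets the same score, the empirical accuracy.
    p = n_correct / n
    h_max = -(n_correct * math.log2(p) + (n - n_correct) * math.log2(1 - p))

    # Entropy of the actual confidence scores against the labels.
    h_conf = 0.0
    for score, ok in zip(confidences, correct):
        s = min(max(score, eps), 1 - eps)
        h_conf -= math.log2(s if ok else 1 - s)

    return (h_max - h_conf) / h_max
```

Under this definition, a score of 0 means the confidences carry no more information than the average accuracy, and values approaching 1 indicate confidences that closely track word correctness; negative values are possible when the scores are worse than the baseline.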


