
-
Personalization Strategies for End-to-End Speech Recognition Systems
The recognition of personalized content, such as contact names, remains ...
-
Do as I mean, not as I say: Sequence Loss Training for Spoken Language Understanding
Spoken language understanding (SLU) systems extract transcriptions, as w...
-
REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling
Accents mismatching is a critical problem for end-to-end ASR. This paper...
-
BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a Variable Number of Speakers
We present a novel online end-to-end neural diarization system, BW-EDA-E...
-
DOVER-Lap: A Method for Combining Overlap-aware Diarization Outputs
Several advances have been made recently towards handling overlapping sp...
-
Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition
In this work, we propose a novel and efficient minimum word error rate (...
-
Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm
Speaker diarization based on bottom-up clustering of speech segments by ...
-
Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings
Involvement hot spots have been proposed as a useful concept for meeting...
-
DOVER: A Method for Combining Diarization Outputs
Speech recognition and other natural language tasks have long benefited ...
-
Meeting Transcription Using Virtual Microphone Arrays
We describe a system that generates speaker-annotated transcripts of mee...
-
Comparing Human and Machine Errors in Conversational Speech Transcription
Recent work in automatic recognition of conversational telephone speech ...