Andreas Stolcke

research

∙ 06/02/2023

Streaming Speech-to-Confusion Network Speech Recognition

In interactive automatic speech recognition (ASR) systems, low-latency r...

2 Denis Filimonov, et al. ∙

research

∙ 03/30/2023

PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers

End-to-End (E2E) automatic speech recognition (ASR) systems used in voic...

0 Rahul Pandey, et al. ∙

research

∙ 03/23/2023

Adaptive Endpointing with Deep Contextual Multi-armed Bandits

Current endpointing (EP) solutions learn in a supervised framework, whic...

0 Do June Min, et al. ∙

research

∙ 11/04/2022

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech

Stuttering is a speech disorder where the natural flow of speech is inte...

0 Xin Zhang, et al. ∙

research

∙ 10/11/2022

An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition

Differential privacy (DP) is one data protection avenue to safeguard use...

0 Chao-Han Huck Yang, et al. ∙

research

∙ 07/22/2022

Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities

As for other forms of AI, speech recognition has recently been examined ...

4 Pranav Dheram, et al. ∙

research

∙ 07/16/2022

Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation

We present an approach to reduce the performance disparity between geogr...

0 Viet Anh Trinh, et al. ∙

research

∙ 07/15/2022

Adversarial Reweighting for Speaker Verification Fairness

We address performance fairness for speaker verification using the adver...

0 Minho Jin, et al. ∙

research

∙ 07/08/2022

Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification

Speaker identification (SID) in the household scenario (e.g., for smart ...

0 Long Chen, et al. ∙

research

∙ 03/16/2022

CUE Vectors: Modular Training of Language Models Conditioned on Diverse Contextual Signals

We propose a framework to modularize the training of neural language mod...

0 Scott Novotney, et al. ∙

research

∙ 02/23/2022

Improving fairness in speaker verification via Group-adapted Fusion Network

Modern speaker verification models use deep neural networks to encode ut...

0 Hua Shen, et al. ∙

research

∙ 02/22/2022

Contrastive-mixup learning for improved speaker verification

This paper proposes a novel formulation of prototypical loss with mixup ...

0 Xin Zhang, et al. ∙

research

∙ 02/17/2022

Mitigating Closed-model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition

In this work, we aim to enhance the system robustness of end-to-end auto...

1 Chao-Han Huck Yang, et al. ∙

research

∙ 02/07/2022

Self-supervised Speaker Recognition Training Using Human-Machine Dialogues

Speaker recognition, recognizing speaker identities based on voice alone...

0 Metehan Cekic, et al. ∙

research

∙ 02/02/2022

ASR-Aware End-to-end Neural Diarization

We present a Conformer-based end-to-end neural diarization (EEND) model ...

0 Aparna Khare, et al. ∙

research

∙ 02/02/2022

RescoreBERT: Discriminative Speech Recognition Rescoring with BERT

Second-pass rescoring is an important component in automatic speech reco...

0 Liyan Xu, et al. ∙

research

∙ 09/06/2021

Improving Speaker Identification for Shared Devices by Adapting Embeddings to Speaker Subsets

Speaker identification typically involves three stages. First, a front-e...

0 Zhenning Tan, et al. ∙

research

∙ 06/18/2021

Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker Recognition

By implicitly recognizing a user based on his/her speech input, speaker ...

0 Ruirui Li, et al. ∙

research

∙ 06/15/2021

Graph-based Label Propagation for Semi-Supervised Speaker Identification

Speaker identification in the household scenario (e.g., for smart speake...

0 Long Chen, et al. ∙

research

∙ 06/14/2021

End-to-end Neural Diarization: From Transformer to Conformer

We propose a new end-to-end neural diarization (EEND) system that is bas...

0 Yi-Chieh Liu, et al. ∙

research

∙ 06/02/2021

Attention-based Contextual Language Model Adaptation for Speech Recognition

Language modeling (LM) for automatic speech recognition (ASR) does not u...

30 Richard Diehl Martinez, et al. ∙

research

∙ 05/14/2021

Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End

Comprehending the overall intent of an utterance helps a listener recogn...

16 Swayambhu Nath Ray, et al. ∙

research

∙ 04/25/2021

Reranking Machine Translation Hypotheses with Structured and Web-based Language Models

In this paper, we investigate the use of linguistically motivated and co...

12 Wen Wang, et al. ∙

research

∙ 03/09/2021

Wav2vec-C: A Self-supervised Model for Speech Representation Learning

Wav2vec-C introduces a novel representation learning technique combining...

0 Samik Sadhu, et al. ∙

research

∙ 02/15/2021

Personalization Strategies for End-to-End Speech Recognition Systems

The recognition of personalized content, such as contact names, remains ...

0 Aditya Gourav, et al. ∙

research

∙ 02/12/2021

Do as I mean, not as I say: Sequence Loss Training for Spoken Language Understanding

Spoken language understanding (SLU) systems extract transcriptions, as w...

0 Milind Rao, et al. ∙

research

∙ 12/14/2020

REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling

Accents mismatching is a critical problem for end-to-end ASR. This paper...

0 Hu Hu, et al. ∙

research

∙ 11/05/2020

BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a Variable Number of Speakers

We present a novel online end-to-end neural diarization system, BW-EDA-E...

0 Eunjung Han, et al. ∙

research

∙ 11/03/2020

DOVER-Lap: A Method for Combining Overlap-aware Diarization Outputs

Several advances have been made recently towards handling overlapping sp...

0 Desh Raj, et al. ∙

research

∙ 07/27/2020

Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition

In this work, we propose a novel and efficient minimum word error rate (...

0 Jinxi Guo, et al. ∙

research

∙ 10/24/2019

Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm

Speaker diarization based on bottom-up clustering of speech segments by ...

0 Andreas Stolcke, et al. ∙

research

∙ 10/24/2019

Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings

Involvement hot spots have been proposed as a useful concept for meeting...

0 Dave Makhervaks, et al. ∙

research

∙ 09/17/2019

DOVER: A Method for Combining Diarization Outputs

Speech recognition and other natural language tasks have long benefited ...

0 Andreas Stolcke, et al. ∙

research

∙ 05/03/2019

Meeting Transcription Using Virtual Microphone Arrays

We describe a system that generates speaker-annotated transcripts of mee...

0 Takuya Yoshioka, et al. ∙

research

∙ 08/29/2017

Comparing Human and Machine Errors in Conversational Speech Transcription

Recent work in automatic recognition of conversational telephone speech ...

0 Andreas Stolcke, et al. ∙
