End-to-End Speech Recognition from Federated Acoustic Models

04/29/2021
by Yan Gao, et al.

Training Automatic Speech Recognition (ASR) models under federated learning (FL) settings has recently attracted considerable attention. However, the FL scenarios often presented in the literature are artificial and fail to capture the complexity of real FL systems. In this paper, we construct a challenging and realistic federated ASR experimental setup consisting of clients with heterogeneous data distributions, using the French Common Voice dataset, a large heterogeneous dataset containing over 10k speakers. We present the first empirical study of an attention-based sequence-to-sequence end-to-end (E2E) ASR model with three aggregation weighting strategies: standard FedAvg, loss-based aggregation, and a novel word error rate (WER)-based aggregation. Experiments are conducted in two realistic FL scenarios: cross-silo with 10 clients and cross-device with 2k clients. In particular, the WER-based weighting method is proposed to better adapt FL to the context of ASR by integrating the error rate metric into the aggregation process. Our analysis of E2E ASR from heterogeneous and realistic federated acoustic models provides the foundations for future research and development of realistic FL-based ASR applications.
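To make the three weighting strategies concrete, the following is a minimal Python sketch of weighted model aggregation. The abstract does not give the exact formulas, so everything here is an assumption for illustration: FedAvg weights are taken as proportional to each client's sample count, while the loss-based and WER-based weights are computed as a softmax over the negative loss and over (1 - WER) respectively. The function names (`fedavg_weights`, `loss_weights`, `wer_weights`, `aggregate`) are hypothetical and do not come from the paper or from any library.

```python
import math
from typing import Dict, List, Sequence


def _normalize(scores: Sequence[float]) -> List[float]:
    """Scale non-negative scores so that they sum to 1."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in scores]


def _softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(values)
    return _normalize([math.exp(v - peak) for v in values])


def fedavg_weights(num_samples: Sequence[int]) -> List[float]:
    """FedAvg-style weights: proportional to each client's sample count."""
    return _normalize([float(n) for n in num_samples])


def loss_weights(losses: Sequence[float]) -> List[float]:
    """ASSUMPTION: lower loss gets a larger weight, via softmax(-loss)."""
    return _softmax([-loss for loss in losses])


def wer_weights(wers: Sequence[float]) -> List[float]:
    """ASSUMPTION: lower WER gets a larger weight, via softmax(1 - WER)."""
    return _softmax([1.0 - wer for wer in wers])


def aggregate(
    client_params: Sequence[Dict[str, List[float]]],
    weights: Sequence[float],
) -> Dict[str, List[float]]:
    """Weighted average of client parameters, one flat list per tensor name."""
    if len(client_params) != len(weights):
        raise ValueError("need exactly one weight per client")
    merged: Dict[str, List[float]] = {}
    for name, reference in client_params[0].items():
        merged[name] = [
            sum(w * params[name][i] for w, params in zip(weights, client_params))
            for i in range(len(reference))
        ]
    return merged


if __name__ == "__main__":
    # Two toy clients with a single two-element parameter tensor each.
    clients = [{"w": [0.0, 2.0]}, {"w": [4.0, 6.0]}]
    print(aggregate(clients, fedavg_weights([100, 300])))
    print(aggregate(clients, wer_weights([0.2, 0.5])))
```

In this sketch only the weight computation differs between strategies; the aggregation step itself is the same weighted average in all three cases, which is what allows an ASR-specific metric such as WER to be swapped in without changing the rest of the training loop.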


research
06/18/2022

Decoupled Federated Learning for ASR with Non-IID Data

Automatic speech recognition (ASR) with federated learning (FL) makes it...
research
06/06/2022

FedNST: Federated Noisy Student Training for Automatic Speech Recognition

Federated Learning (FL) enables training state-of-the-art Automatic Spee...
research
11/06/2021

Privacy attacks for automatic speech recognition acoustic models in a federated learning framework

This paper investigates methods to effectively retrieve speaker informat...
research
03/23/2023

FS-Real: Towards Real-World Cross-Device Federated Learning

Federated Learning (FL) aims to train high-quality models in collaborati...
research
02/08/2021

Federated Acoustic Modeling For Automatic Speech Recognition

Data privacy and protection is a crucial issue for any automatic speech ...
research
09/28/2021

Private Language Model Adaptation for Speech Recognition

Speech model adaptation is crucial to handle the discrepancy between ser...
research
12/01/2020

Federated Marginal Personalization for ASR Rescoring

We introduce federated marginal personalization (FMP), a novel method fo...
