Mapping two modalities, speech and text, into a shared representation sp...
Recently, the Conformer-based CTC/AED model has become a mainstream architec...
Mixture-of-experts based acoustic models with dynamic routing mechanisms...
This paper introduces GigaSpeech, an evolving, multi-domain English spee...
Recently, the Mixture-of-Experts (MoE) based Transformer has shown promising...
Self-attention networks (SAN) have been introduced into automatic speech...
In many automatic speech recognition (ASR) tasks, an ideal model has to ...