It is well known that many machine learning systems demonstrate bias tow...
With 4.5 million hours of English speech from 10 different sources acros...
This paper improves the streaming transformer transducer for speech
reco...
Detection of common events and scenes from audio is useful for extractin...
Often, the storage and computational constraints of embedded devices dema...
We propose a dynamic encoder transducer (DET) for on-device speech
recog...
Pseudo-labeling is the most widely adopted method for pre-training automatic sp...
In this paper, we summarize the application of transformer and its strea...
Many semi- and weakly-supervised approaches have been investigated for
o...
We propose and evaluate transformer-based acoustic models (AMs) for hybr...