Collecting audio-text pairs is expensive; however, it is much easier to
...
Although frame-based models, such as CTC and transducers, have an affini...
There has been an increased interest in the integration of pretrained sp...
Recently there have been efforts to introduce new benchmark tasks for sp...
This paper describes our system for the low-resource domain adaptation t...
Disfluency detection has mainly been solved in a pipeline approach, as
p...
End-to-end automatic speech recognition suffers from adaptation to unkno...
Speech samples recorded in both indoor and outdoor environments are ofte...
A streaming style inference of encoder-decoder automatic speech recognit...
A deep neural network (DNN)-based speech enhancement (SE) aiming to maxi...
Although end-to-end automatic speech recognition (E2E ASR) has achieved ...
Self-attention (SA) based models have recently achieved significant
perf...
The Transformer self-attention network has recently shown promising
perf...
The Transformer self-attention network has recently shown promising
perf...
The Transformer self-attention network has recently shown promising
perf...
An on-device DNN-HMM speech recognition system efficiently works with a
...