Large-scale generative models such as GPT and DALL-E have revolutionized...
Most people who have tried to learn a foreign language would have experi...
Text-based voice editing (TBVE) uses synthetic output from text-to-speec...
Hybrid automatic speech recognition (ASR) models are typically sequentia...
In this paper, we introduce the Kaizen framework that uses a continuousl...
Many semi- and weakly-supervised approaches have been investigated for
o...
We describe the system our team used during NIST's LoReHLT (Low Resource...
Speech recognition systems for irregularly-spelled languages like Englis...
The paper summarizes the development of the LVCSR system built as a part...