Machine learning model weights and activations are represented in full-p...
Attention-based contextual biasing approaches have shown significant imp...
We present dual-attention neural biasing, an architecture designed to bo...
For on-device automatic speech recognition (ASR), quantization aware tra...
The recurrent neural network transducer (RNN-T) is a prominent streaming...
We present a streaming, Transformer-based end-to-end automatic speech re...
We present a novel sub-8-bit quantization-aware training (S8BQAT) scheme...
Personal rare word recognition in end-to-end Automatic Speech Recognitio...
Dialogue act classification (DAC) is a critical task for spoken language...
End-to-end (E2E) automatic speech recognition (ASR) systems often have d...
Spoken language understanding (SLU) systems translate voice input comman...
Multi-channel inputs offer several advantages over single-channel, to im...
End-to-end (E2E) spoken language understanding (SLU) systems predict utt...
We propose a simple yet effective method to compress an RNN-Transducer (...
We present results from Alexa speech teams on semi-supervised learning (...
In order to achieve high accuracy for machine learning (ML) applications...
Transformers are powerful neural architectures that allow integrating di...
End-to-end (E2E) spoken language understanding (SLU) systems can infer t...
Spoken language understanding (SLU) refers to the process of inferring t...
End-to-end spoken language understanding (SLU) models are a class of mod...
Multilingual ASR technology simplifies model training and deployment, bu...
Video Quality Assessment (VQA) is a very challenging task due to its hig...