Many real-time applications (e.g., Augmented/Virtual Reality, cognitive
...
Keyword spotting, and in particular Wake-Up-Word (WUW) detection, is a ver...
Current end-to-end approaches to Spoken Language Translation (SLT) rely ...
Our interaction with the world is an inherently multimodal experience.
H...
End-to-end models for raw audio generation are a challenge, especially if...
Sounds are an important source of information on our daily interactions ...