Phonetic information and linguistic knowledge are an essential component...
Neural text-to-speech systems are often optimized on L1/L2 losses, which...
The Grapheme-to-Phoneme (G2P) task aims to convert orthographic input in...
Automatically predicting the outcome of subjective listening tests is a ...
The availability of data in expressive styles across languages is limite...
State-of-the-art text-to-speech (TTS) systems require several hours of r...
We address the problem of cross-speaker style transfer for text-to-speec...
Ultrasound tongue imaging is used to visualise the intra-oral articulato...
We investigate multi-speaker speech recognition from ultrasound images o...
Speech sound disorders are a common communication impairment in childhoo...
We present the Tongue and Lips corpus (TaL), a multi-speaker corpus of a...
We introduce UltraSuite, a curated repository of ultrasound and acoustic...
We investigate the automatic processing of child speech therapy sessions...
Audiovisual synchronisation is the task of determining the time offset b...
Ultrasound tongue imaging (UTI) provides a convenient way to visualize t...