Self-supervised pre-trained models such as Wav2vec2, Hubert, and WavLM h...
Multilingual self-supervised speech representation models have greatly e...
Automatic detection of machine anomaly remains challenging for machine l...
We present our submission to the ICASSP-SPGC-2023 ADReSS-M Challenge Tas...
The widespread emergence of smart devices for ECG has sparked demand for...
It is in high demand to generate facial animation with high realism, but...
Self-supervised learning (SSL) has achieved great success in various are...
Although the security of automatic speaker verification (ASV) is serious...
Adversarial attack approaches to speaker identification either need high...
Recent years have witnessed great strides in self-supervised learning (S...
Labeled audio data is insufficient to build satisfying speech recognitio...
Code-switching automatic speech recognition becomes one of the most chal...
This paper describes the THUEE team's speech recognition system for the ...
Language identification is a task of automatically determining the ident...
As the cornerstone of other important technologies, such as speech recog...
While recent text to speech (TTS) models perform very well in synthesizi...
Rap generation, which aims to produce lyrics and corresponding singing b...
This paper introduces GigaSpeech, an evolving, multi-domain English spee...
This paper describes the systems submitted by the department of electron...
Sound event detection with weakly labeled data is considered as a proble...
In this paper, we propose a temporal-frequential attention model for sou...
In this paper, we propose a method for home activity monitoring. We demo...