Speech-driven animation has gained significant traction in recent years,...
We present RAVEn, a self-supervised multi-modal approach to jointly lear...
Audio-visual speech enhancement aims to extract clean speech from a nois...
Video-to-speech synthesis (also known as lip-to-speech) refers to the
tr...
One of the most pressing challenges for the detection of face-manipulate...
The large amount of audiovisual content being shared online today has dr...
Video-to-speech is the process of reconstructing the audio speech from a...