Voice spoofing attacks pose a significant threat to automated speaker ve...
Conversational engagement estimation is posed as a regression problem, e...
With the rapid growth of social media platforms, users are sharing billi...
In recent years, an association has been established between faces and voices ...
Recent years have seen an increased interest in establishing association...
The use of attention models for automated image captioning has enabled m...
Zero-shot learning methods rely on fixed visual and semantic embeddings,...
We study the problem of learning the association between face and voice, whi...
Recent years have seen a surge in finding associations between faces and ...
We propose a novel deep training algorithm for joint representation of a...
Visualization refers to our ability to create an image in our head based...
Current cross-modal retrieval systems are evaluated using the R@K measure wh...
With the massive explosion of social media such as Twitter and Instagram, pe...
The majority of current dimensionality reduction or retrieval techniques...
Multi-modal approaches employ data from multiple input streams such as t...
The question we answer with this work is: can we convert a text document...
Convolutional Neural Networks (CNNs) have been widely used in computer v...
This paper proposes a cross-modal retrieval system that leverages ima...