In the field of music information retrieval (MIR), cover song identifica...
Direct speech-to-speech translation (S2ST) aims to convert speech from o...
The speech-to-singing (STS) voice conversion task aims to generate singi...
Speech Emotion Recognition (SER) aims to recognize human emotions in a nat...
Direct speech-to-speech translation (S2ST) systems leverage recent progr...
Complex Knowledge Base Question Answering is a popular area of research ...
The target representation learned by convolutional neural networks plays...
We propose an end-to-end tracking framework for fusing the RGB and TIR m...
Siamese approaches address the visual tracking problem by extracting an ...
The usage of both off-the-shelf and end-to-end trained deep networks hav...
Ensembles are a popular way to improve results of discriminative CNNs. T...