This paper describes the NPU-MSXF system for the IWSLT 2023 speech-to-sp...
Direct speech-to-speech translation (S2ST) has gradually become popular ...
Due to the scarcity of available data, deep learning does not perform we...
Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a challenging cross-...
This paper aims to synthesize target speaker's speech with desired speak...
Recent development of neural vocoders based on the generative adversaria...
In current two-stage neural text-to-speech (TTS) paradigm, it is ideal t...
Speaker adaptation in text-to-speech synthesis (TTS) is to finetune a
pr...
In this paper, we reveal that metric learning would suffer from serious
...
K-NN classifier is one of the most famous classification algorithms, who...