Joint speech-language training is challenging due to the large demand fo...
We propose gated language experts to improve multilingual transformer
tr...
We introduce a language modeling approach for text to speech synthesis (...
Although speech is a simple and effective way for humans to communicate ...
End-to-end formulation of automatic speech recognition (ASR) and speech
...
Direct speech-to-speech translation (S2ST) is an attractive research top...
Self-supervised pre-training methods based on contrastive learning or
re...
The rapid development of single-modal pre-training has prompted research...
How to boost speech pre-training with textual data is an unsolved proble...
This paper describes the submission of our end-to-end YiTrans speech
tra...
Previous speech pre-training methods, such as wav2vec2.0 and HuBERT,
pre...
This paper studies a novel pre-training technique with unpaired speech d...
Self-supervised learning (SSL) achieves great success in speech recognit...
Multilingual automatic speech recognition (ASR) models have shown great
...
Evaluation metrics play a vital role in the growth of an area as they defi...
Pre-trained models for programming languages have achieved dramatic empir...
Speech-to-text translation (ST), which translates source language speech...
The encoder-decoder framework has achieved promising progress for many
se...
Existing approaches to neural machine translation (NMT) generate the tar...
In sequence to sequence generation tasks (e.g. machine translation and
a...
Current Neural Machine Translation (NMT) employs a language-specific enc...
Neural machine translation (NMT), a new approach to machine translation,...
The attention model has become a standard component in neural machine
tr...
Neural machine translation (NMT) has become a new approach to machine
trans...