Pronunciation assessment is a major challenge in the computer-aided pron...
In this paper, we propose DiffusionNER, which formulates the named entit...
Solving complicated AI tasks with different domains and modalities is a ...
As a common way of emotion signaling via non-linguistic vocalizations, v...
With the global population aging rapidly, Alzheimer's disease (AD) is pa...
Prompt learning is one of the most effective and trending ways to adapt ...
Dialogue summarization aims to condense the lengthy dialogue into a conc...
Autoregressive generative models are commonly used, especially for those...
Sentence scoring aims at measuring the likelihood score of a sentence an...
In the development of neural text-to-speech systems, model pre-training ...
Weight sharing has become the de facto approach to reduce the training c...
Rap generation, which aims to produce lyrics and corresponding singing b...
While pre-trained language models (e.g., BERT) have achieved impressive ...
Automatic song writing aims to compose a song (lyric and/or melody) by m...
Neural machine translation (NMT) generates the next target token given a...
While pre-training and fine-tuning, e.g., BERT <cit.>, GPT-2 <cit.>, hav...
BERT adopts masked language modeling (MLM) for pre-training and is one o...
Pre-training and fine-tuning, e.g., BERT, have achieved great success in...
Recently, deep neural networks have made significant progress and successful ...
The encoder-decoder is the typical framework for Neural Machine Translat...
Encoder-decoder based Sequence to Sequence learning (S2S) has made remar...