As the model size of pre-trained language models (PLMs) grows rapidly, f...
With the continuous emergence of Chinese Large Language Models (LLMs), h...
For years, model performance in machine learning obeyed a power-law r...
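(The sentence above is truncated; it presumably refers to the familiar power-law relationship between loss and scale. As an illustrative reminder, and not necessarily the paper's exact parameterization, such scaling laws are commonly written as

    L(N) = (N_c / N)^{\alpha_N},

where L is the test loss, N the scale variable (parameters, data, or compute), and N_c, \alpha_N are fitted constants.)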
Existing multimodal conversation agents have shown impressive abilities ...
Recent advances in large-scale pre-training provide large models with th...
Pre-trained language models (PLMs) like BERT have made significant progr...
Structured pruning has been extensively studied on monolingual pre-train...
Despite recent progress in open-domain dialogue evaluation, how to devel...
This paper describes the submissions of the NiuTrans Team to the WNGT 20...
Improving Transformer efficiency has become increasingly attractive rece...
Encoder pre-training is promising in end-to-end Speech Translation (ST),...
The large attention-based encoder-decoder network (Transformer) has beco...
Unsupervised Bilingual Dictionary Induction methods based on the initial...
Knowledge distillation has been proven to be effective in model accelera...
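(For context on the technique this abstract names: in the standard formulation, which may differ from the paper's specific variant, knowledge distillation trains a compact student to match a larger teacher's temperature-softened predictions,

    \mathcal{L}_{\mathrm{KD}} = (1-\lambda)\,\mathrm{CE}(y, p_s) + \lambda T^2\,\mathrm{KL}\big(p_t^{(T)} \,\|\, p_s^{(T)}\big),

where p^{(T)} = \mathrm{softmax}(z/T) for teacher (t) and student (s) logits z, and \lambda, T are hyperparameters.)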
8-bit integer inference, as a promising direction in reducing both the l...
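(To make the direction concrete, below is a minimal sketch of symmetric per-tensor INT8 quantization, one common scheme behind 8-bit integer inference; the function names are illustrative assumptions, not the paper's implementation.

    import numpy as np

    def quantize_int8(x: np.ndarray):
        # Symmetric per-tensor quantization: map floats onto [-127, 127].
        scale = max(float(np.abs(x).max()) / 127.0, 1e-12)  # avoid div-by-zero
        q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
        return q, scale

    # Example: run the matmul in integers (accumulate in int32), rescale once.
    a = np.random.randn(4, 8).astype(np.float32)
    b = np.random.randn(8, 3).astype(np.float32)
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    y = (qa.astype(np.int32) @ qb.astype(np.int32)).astype(np.float32) * (sa * sb)
    # y approximates a @ b while the GEMM itself used only 8/32-bit integers.

)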
Neural machine translation systems require a number of stacked layers fo...
Though early successes of Statistical Machine Translation (SMT) systems ...