Previous studies have typically assumed that large language models are u...
Vision Transformers have achieved great success in computer visions,
del...
Multilingual self-supervised speech representation models have greatly
e...
Text irregularities pose significant challenges to scene text recognizer...
With the development of deep generative models, recent years have seen g...
The task of referring video object segmentation aims to segment the obje...
Referring image segmentation aims to segment the target object described...
Prompt tuning has been employed as an efficient way to adapt large
visio...
Recent development of neural vocoders based on the generative adversaria...
This paper describes the approach we have taken in the challenge. We sti...
Code-switching automatic speech recognition becomes one of the most
chal...
One-shot generative domain adaption aims to transfer a pre-trained gener...
This paper introduces a new corpus of Mandarin-English code-switching sp...
Language identification is a task of automatically determining the ident...