The answering quality of an aligned large language model (LLM) can be dr...
This paper proposes a new method, OFA-OCR, to transfer multimodal pretra...
Generalist models, which are capable of performing diverse multi-modal t...
The tremendous success of CLIP (Radford et al., 2021) has promoted the r...
Prompt tuning has become a new paradigm for model tuning and it has demo...
Prompt learning has recently gained great popularity in bridging the gap...
In this work, we pursue a unified paradigm for multimodal pretraining to...
Recent rapid developments in deep learning algorithms, distributed...
Mixture-of-Experts (MoE) models can achieve promising results with outra...
Table-to-text generation refers to generating a descriptive text from a ...
Despite the achievements of large-scale multimodal pre-training approach...
In this work, we construct the largest dataset for multimodal pretrainin...
Multi-modal pretraining for learning high-level multi-modal representati...
Machine reading comprehension aims to teach machines to understand a tex...
An annotated corpus for discourse relations benefits NLP tasks such as mac...
Current evaluation metrics for question-answering-based machine reading c...