Model pre-training on large text corpora has been demonstrated to be effective...
Contrastive loss has been increasingly used in learning representations ...
Can we combine heterogeneous graph structure with text to learn high-qual...
Recent research has shown that large language models pretrained using un...
Aligning signals from different modalities is an important step in visio...
Vision-language representation learning largely benefits from image-text...
Pre-training and then fine-tuning large language models is commonly used...
Vision-and-Language Pre-training (VLP) improves model performance for do...