Large-scale text-to-image diffusion models have shown impressive capabil...
Recent compositional zero-shot learning (CZSL) methods adapt pre-trained...
Many recent studies leverage the pre-trained CLIP for text-video cross-m...
Compositional zero-shot learning (CZSL) refers to recognizing unseen com...
In this paper, we mainly focus on the problem of how to learn additional...
While few-shot learning (FSL) aims for rapid generalization to new conce...
The purpose of few-shot recognition is to recognize novel categories wit...