Text-to-image (T2I) personalization allows users to guide the creative i...
Text-to-image (T2I) models offer a new level of flexibility by allowing ...
Text-to-image personalization aims to teach a pre-trained diffusion mode...
Text-to-image models offer unprecedented freedom to guide creation throu...
Large Vision Language models pretrained on web-scale data provide re...
Reasoning and interacting with dynamic environments is a fundamental pro...
We study the problem of recognizing visual entities from the textual des...
People easily recognize new visual categories that are new combinations ...
Learning to classify images with unbalanced class distributions is chall...
When describing images with natural language, the descriptions can be ma...
Learning with few samples is a major challenge for parameter-rich models...
Generalized zero-shot learning (GZSL) is the problem of learning a class...
In zero-shot learning (ZSL), a classifier is trained to recognize visual...
Recurrent neural networks have recently been used for learning to descri...