Text-to-image diffusion models are now capable of generating images that...
There has been a recent explosion of impressive generative models that c...
Stylized image captioning as presented in prior work aims to generate ca...
Autoregressive models are a class of exact inference approaches with hig...
Diverse image captioning models aim to learn one-to-many mappings that a...
Flow-based generative models are an important class of exact inference m...
Learned joint representations of images and text form the backbone of se...
One of the key challenges in learning joint embeddings of multiple modal...