In this project, we implemented an end-to-end system that takes in combi...
We investigate the role of various demonstration components in the in-co...
Despite the promising progress in multi-modal tasks, current large multi...
Vision-language pretraining models have achieved great success in suppor...
We introduce a new benchmark, COVID-VTS, for fact-checking multi-modal i...
In this paper, we propose VisualNews-Captioner, an entity-aware model for...