Existing language and vision models achieve impressive performance in im...
Socially competent robots should be equipped with the ability to perceiv...
Building computer systems that can converse about their visual environme...
Zero-shot learning in Language & Vision is the task of correctly labelli...
We propose a new shared task for tactical data-to-text generation in the...
A common use of language is to refer to visually present objects. Modell...