Feature attribution methods have become a staple method to disentangle t...
Intelligent systems that aim at mastering language as humans do must dea...
Dialogue participants may have varying levels of knowledge about the top...
This work studies the semantic representations learned by BERT for compo...
When speakers describe an image, they tend to look at objects before
men...
Dialogue participants often refer to entities or situations repeatedly w...
This work aims at modeling how the meaning of gradable adjectives of siz...
We introduce MASSES, a simple evaluation metric for the task of Visual
Q...
We study the role of linguistic context in predicting quantifiers (`few'...
The present work investigates whether different quantification mechanism...
In this paper, we aim to understand whether current language and vision
...
Major advances have recently been made in merging language and vision
re...
People can refer to quantities in a visual scene by using either exact
c...
We introduce LAMBADA, a dataset to evaluate the capabilities of computat...