Recent advances in vision-and-language modeling have seen the developmen...
Attempts to computationally simulate the acquisition of spoken language ...
Image captioning models are usually evaluated on their ability to descri...
We present a detailed comparison of two types of sequence to sequence mo...