Multimodal Shannon Game with Images

03/20/2023
by Vilém Zouhar, et al.

The Shannon game has long been used as a thought experiment in linguistics and NLP: participants are asked to guess the next letter in a sentence from its preceding context. We extend the game with an optional extra modality in the form of image information. To investigate the impact of multimodal information in this game, we run it with human participants and with a language model (LM, GPT-2). We show that adding image information improves both self-reported confidence and accuracy for both humans and the LM. Certain word classes, such as nouns and determiners, benefit more from the additional modality. The priming effect in both humans and the LM becomes more apparent as the context size (extra modality information + sentence context) increases. These findings highlight the potential of multimodal information for improving language understanding and modeling.
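To make the basic, text-only game concrete, here is a minimal sketch in Python. It is not the setup used in the paper: there is no GPT-2 and no image input, and the character bigram guesser, the function names (`train_bigram`, `ranked_guesses`, `play`), and the toy corpus are all assumptions made for illustration. The guesser ranks candidate next characters by how often they followed the previous character in a small corpus, and the game records how many guesses were needed at each position.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: str):
    """Count, for each character, which characters follow it (toy model)."""
    follow = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follow[prev][nxt] += 1
    return follow


def ranked_guesses(follow, context: str, alphabet: str):
    """Rank candidate next characters: most frequent follower first,
    ties and unseen characters in alphabetical order."""
    counts = follow[context[-1]] if context else Counter()
    return sorted(alphabet, key=lambda ch: (-counts[ch], ch))


def play(follow, sentence: str, alphabet: str):
    """Return the number of guesses needed for each character of `sentence`.
    Every character of `sentence` must appear in `alphabet`."""
    results = []
    for i, target in enumerate(sentence):
        guesses = ranked_guesses(follow, sentence[:i], alphabet)
        results.append(guesses.index(target) + 1)
    return results


if __name__ == "__main__":
    corpus = "the cat sat on the mat"   # hypothetical toy corpus
    sentence = "the cat"                # hypothetical test sentence
    alphabet = "".join(sorted(set(corpus + sentence)))
    follow = train_bigram(corpus)
    print(play(follow, sentence, alphabet))
```

Fewer guesses per position mean the context was more informative. In this framing, an extra modality would be one more source of evidence available to the guesser when ranking candidates.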


