Learning Triadic Belief Dynamics in Nonverbal Communication from Videos

04/07/2021
by Lifeng Fan, et al.

Humans possess a unique social cognition capability; nonverbal communication can convey rich social information among agents. In contrast, such crucial social characteristics are mostly missing from the existing scene understanding literature. In this paper, we incorporate different nonverbal communication cues (e.g., gaze, human poses, and gestures) to represent, model, learn, and infer agents' mental states from purely visual inputs. Crucially, such a mental representation takes each agent's belief into account: it represents the true world state and infers the beliefs in each agent's mental state, which may differ from the true world state. By aggregating different beliefs and true world states, our model essentially forms "five minds" during the interactions between two agents. This "five minds" model differs from prior works that infer beliefs in an infinite recursion; instead, agents' beliefs converge into a "common mind". Based on this representation, we further devise a hierarchical energy-based model that jointly tracks and predicts all five minds. From this new perspective, a social event is interpreted as a series of nonverbal communication and belief dynamics, which goes beyond the classic keyframe video summary. In the experiments, we demonstrate that such a social account provides a better video summary on videos with rich social interactions than state-of-the-art keyframe video summary methods.
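
The "five minds" representation can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the abstract does not enumerate the five minds, so the split into two first-order minds (m1, m2), two second-order minds (m12, m21), and a common mind (mc) is an assumed reading. The names FiveMinds and false_beliefs and the object-to-state dictionary encoding are hypothetical, and the true world state is kept as a separate input here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical encoding: a belief maps an object name to its believed state.
Belief = Dict[str, str]


@dataclass
class FiveMinds:
    """Toy container for the assumed five minds of a two-agent interaction."""
    m1: Belief = field(default_factory=dict)   # agent 1's own belief (assumed)
    m2: Belief = field(default_factory=dict)   # agent 2's own belief (assumed)
    m12: Belief = field(default_factory=dict)  # agent 1's belief about agent 2's belief (assumed)
    m21: Belief = field(default_factory=dict)  # agent 2's belief about agent 1's belief (assumed)
    mc: Belief = field(default_factory=dict)   # common mind shared by both agents (assumed)


def false_beliefs(world: Belief, mind: Belief) -> List[str]:
    """Return, sorted, the objects whose believed state differs from the true world state."""
    return sorted(obj for obj, state in mind.items() if world.get(obj) != state)


if __name__ == "__main__":
    # Agent 2 did not see the cup being moved, so its belief is out of date.
    world = {"cup": "in_box"}
    minds = FiveMinds(m1={"cup": "in_box"}, m2={"cup": "on_table"})
    print(false_beliefs(world, minds.m1))
    print(false_beliefs(world, minds.m2))
```

Keeping the true world state apart from each mind makes a false belief a simple per-object comparison. The paper's energy-based model infers these minds from video, which this toy lookup does not attempt.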

research
10/17/2022

Robot Learning Theory of Mind through Self-Observation: Exploiting the Intentions-Beliefs Synergy

In complex environments, where the human sensory system reaches its limi...
research
06/21/2022

BOSS: A Benchmark for Human Belief Prediction in Object-context Scenarios

Humans with an average level of social cognition can infer the beliefs o...
research
02/04/2021

The Wisdom of the Crowd and Higher-Order Beliefs

The classic wisdom-of-the-crowd problem asks how a principal can "aggreg...
research
01/17/2023

Memory-Augmented Theory of Mind Network

Social reasoning necessitates the capacity of theory of mind (ToM), the ...
research
06/27/2023

MindDial: Belief Dynamics Tracking with Theory-of-Mind Modeling for Situated Neural Dialogue Generation

Humans talk in free-form while negotiating the expressed meanings or com...
research
03/11/2002

Representing and Aggregating Conflicting Beliefs

We consider the two-fold problem of representing collective beliefs and ...
research
04/29/2019

Local non-Bayesian social learning with stubborn agents

In recent years, people have increasingly turned to social networks like...
