A Causal And-Or Graph Model for Visibility Fluent Reasoning in Human-Object Interactions

by Lei Qin, et al.

Tracking humans who are interacting with other subjects or the environment remains an unsolved problem in visual tracking, because the visibility of the humans of interest in videos is unknown and may vary over time. In particular, it is still difficult for state-of-the-art human trackers to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we treat the visibility status of a subject as a fluent variable whose changes are mostly attributed to the subject's interactions with its surroundings, e.g., crossing behind another object, entering a building, or getting into a vehicle. We introduce a Causal And-Or Graph (C-AOG) to represent the cause-effect relations between an object's visibility fluents and its activities, and develop a probabilistic graph model to jointly reason about visibility fluent changes (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search for a feasible causal graph structure, which enables fast search algorithms such as dynamic programming. We apply the proposed method to challenging video sequences to evaluate its ability to estimate visibility fluent changes of subjects and to track subjects of interest over time. Comparative results demonstrate that our method clearly outperforms the alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions.
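To make the idea of reasoning about visibility changes with dynamic programming concrete, the following is a minimal sketch and not the C-AOG model itself. It assumes a much simpler setup: a two-state chain (`visible`, `invisible`), a per-frame detector score in [0, 1] used as a stand-in likelihood, and a fixed probability of keeping the same state between frames. The names `viterbi_visibility` and `stay_prob`, and the toy scores, are hypothetical and chosen only for illustration.

```python
import math

STATES = ("visible", "invisible")


def viterbi_visibility(scores, stay_prob=0.8, eps=1e-6):
    """Most likely visibility state per frame for a toy two-state chain.

    scores: per-frame detector confidences in [0, 1] (hypothetical input).
    stay_prob: assumed probability that the state does not change.
    """
    def emit(state, score):
        # Assumed likelihood: high scores favor "visible", low scores "invisible".
        p = score if state == "visible" else 1.0 - score
        return math.log(min(max(p, eps), 1.0))

    def trans(prev, cur):
        return math.log(stay_prob if prev == cur else 1.0 - stay_prob)

    # best[s] is the best log score of any state path ending in s at this frame.
    best = {s: math.log(0.5) + emit(s, scores[0]) for s in STATES}
    back = []  # back[t][s] is the best predecessor of s at frame t + 1
    for score in scores[1:]:
        prev_best, best, pointers = best, {}, {}
        for s in STATES:
            p = max(STATES, key=lambda q: prev_best[q] + trans(q, s))
            best[s] = prev_best[p] + trans(p, s) + emit(s, score)
            pointers[s] = p
        back.append(pointers)

    # Backtrack from the best final state to recover the full sequence.
    path = [max(STATES, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


toy_scores = [0.9, 0.8, 0.2, 0.6, 0.1, 0.1, 0.9]
fluents = viterbi_visibility(toy_scores)
```

In this toy run, the fourth frame has a score of 0.6, which a per-frame threshold at 0.5 would label visible. The dynamic program instead labels it invisible, because its neighbors are confidently invisible and state switches are penalized. The method described above reasons over a causal graph of activities rather than a fixed transition probability, so this sketch only illustrates the search style.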




Causal Discovery of Dynamic Models for Predicting Human Spatial Interactions

Exploiting robots for activities in human-shared environments, whether w...

An Algorithm for Limited Visibility Graph Searching

We study a graph search problem in which a team of searchers attempts to...

BEHAVE: Dataset and Method for Tracking Human Object Interactions

Modelling interactions between humans and objects in natural environment...

Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval

Humans use context and scene knowledge to easily localize moving objects...

Learning Human Activities and Object Affordances from RGB-D Videos

Understanding human activities and object affordances are two very impor...

Two is a crowd: tracking relations in videos

Tracking multiple objects individually differs from tracking groups of r...

Improving Human Annotation in Single Object Tracking

Human annotation is always considered as ground truth in video object tr...
