An Integrated Approach to Crowd Video Analysis: From Tracking to Multi-level Activity Recognition
We present an integrated framework for simultaneous tracking, group detection and multi-level activity recognition in crowd videos. Instead of solving these problems independently and sequentially, we solve them together in a unified framework to utilize the strong correlation that exists among individual motion, groups, and activities. We explore the hierarchical structure hidden in the video that connects individuals over time to produce tracks, connects individuals to form groups and also connects groups together to form a crowd. We show that estimation of this hidden structure corresponds to track association and group detection. We estimate this hidden structure under a linear programming formulation. The obtained graphical representation is further explored to recognize the node values that corresponds to multi-level activity recognition. This problem is solved under a structured SVM framework. The results on publicly available dataset show very competitive performance at all levels of granularity with the state-of-the-art batch processing methods despite the proposed technique being an online (causal) one.
READ FULL TEXT