The One Where They Reconstructed 3D Humans and Environments in TV Shows

07/28/2022
by Georgios Pavlakos, et al.

TV shows depict a wide variety of human behaviors and have been studied extensively as a potentially rich source of data for many applications. However, the majority of existing work focuses on 2D recognition tasks. In this paper, we observe that there is a certain persistence in TV shows, i.e., repetition of the environments and the humans, which makes 3D reconstruction of this content possible. Building on this insight, we propose an automatic approach that operates on an entire season of a TV show and aggregates information in 3D: we build a 3D model of the environment and compute camera information, static 3D scene structure, and body scale information. We then demonstrate how this information acts as rich 3D context that can guide and improve the recovery of 3D human pose and position in these environments. Moreover, we show that reasoning about humans and their environment in 3D enables a broad range of downstream applications: re-identification, gaze estimation, cinematography, and image editing. We apply our approach to environments from seven iconic TV shows and perform an extensive evaluation of the proposed system.
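To make concrete why camera information and body scale can help recover 3D position, here is a minimal sketch based on the standard pinhole camera model. It is an illustration only and not the paper's method: the function names, the fronto-parallel upright-person assumption, and all numeric values are hypothetical.

```python
def depth_from_height(focal_px: float, height_m: float, height_px: float) -> float:
    """Pinhole estimate of the distance (metres) to an upright person.

    Assumes the person is roughly parallel to the image plane, so that
    height_px = focal_px * height_m / depth.
    """
    if height_px <= 0:
        raise ValueError("height_px must be positive")
    return focal_px * height_m / height_px


def backproject(u: float, v: float, depth: float,
                focal_px: float, cx: float, cy: float) -> tuple:
    """Lift pixel (u, v) at a given depth to camera coordinates (x, y, z)."""
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return (x, y, depth)


# Hypothetical values: a 1.7 m person spanning 340 px, focal length 1000 px,
# principal point at the centre of a 1920x1080 frame.
z = depth_from_height(focal_px=1000.0, height_m=1.7, height_px=340.0)
position = backproject(u=1160.0, v=540.0, depth=z,
                       focal_px=1000.0, cx=960.0, cy=540.0)
print(z, position)
```

With a known focal length and a known metric body height, the apparent size of a person in the image constrains their depth, which in turn fixes their position along the viewing ray. Without the scale information, depth and body size remain ambiguous.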
