Egocentric Meets Top-view
Thanks to the availability and increasing popularity of egocentric cameras such as GoPro cameras and wearable smart glasses, we now have access to a plethora of videos captured from the first-person perspective. Surveillance cameras and Unmanned Aerial Vehicles (drones) likewise produce a tremendous amount of video, mostly from a top-down or oblique viewpoint. Egocentric vision and top-view surveillance video have each been studied extensively in the computer vision community; however, the relationship between the two has yet to be explored thoroughly. In this work, we explore this relationship by addressing two questions. First, given a set of egocentric videos and a top-view video, can we verify whether the top-view video contains all, or some, of the egocentric viewers present in the egocentric set? Second, can we identify the egocentric viewers within the content of the top-view video, i.e., can we find the cameramen in the surveillance video? These problems become more challenging when the videos are not time-synchronized. We therefore formalize the problem in a way that handles, and also estimates, the unknown relative time delays between the egocentric videos and the top-view video. We formulate the problem as a spectral graph matching instance and jointly seek the optimal assignments and relative time delays of the videos. As a result, we spatiotemporally localize the egocentric observers in the top-view video. We model each view (egocentric or top) with a graph and compute the assignments and time delays in an iterative, alternating fashion.
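The abstract names two ingredients: estimating the relative time delay between an egocentric video and a top-view track, and a spectral graph matching step that assigns egocentric viewers to people in the top-view video. The sketch below is a minimal illustration of those two ingredients under strong simplifying assumptions, not the paper's actual method: each video is summarized by a hypothetical 1-D feature signal (e.g., motion magnitude over time), the graph affinities are reduced to pairwise correlation scores, and a single pass replaces the alternating optimization described above. The functions estimate_delay, spectral_match, and match_and_sync and all parameters are illustrative assumptions.

```python
import numpy as np


def estimate_delay(ego_sig, top_sig, max_delay):
    """Pick the integer time shift (within +/- max_delay) that maximizes the
    normalized correlation between two 1-D feature signals."""
    best_d, best_s = 0, -np.inf
    for d in range(-max_delay, max_delay + 1):
        lo, hi = max(0, d), min(len(ego_sig), len(top_sig) + d)
        if hi - lo < 2:
            continue
        e, t = ego_sig[lo:hi], top_sig[lo - d:hi - d]
        s = float(np.dot(e, t) / (np.linalg.norm(e) * np.linalg.norm(t) + 1e-12))
        if s > best_s:
            best_d, best_s = d, s
    return best_d, best_s


def spectral_match(sim):
    """Spectral relaxation of the assignment problem: build an affinity matrix
    over candidate (ego, top) matches, take its leading eigenvector, and
    greedily discretize it into a one-to-one assignment."""
    n_e, n_t = sim.shape
    cand = [(i, a) for i in range(n_e) for a in range(n_t)]
    M = np.zeros((len(cand), len(cand)))
    for p, (i, a) in enumerate(cand):
        M[p, p] = sim[i, a]                      # unary affinity of the match
        for q, (j, b) in enumerate(cand):
            if p != q and i != j and a != b:     # reward mutually consistent matches
                M[p, q] = min(sim[i, a], sim[j, b])
    v = np.abs(np.linalg.eigh(M)[1][:, -1])      # principal eigenvector
    assign, used_e, used_t = {}, set(), set()
    for idx in np.argsort(-v):                   # greedy one-to-one rounding
        i, a = cand[idx]
        if i not in used_e and a not in used_t:
            assign[i] = a
            used_e.add(i)
            used_t.add(a)
    return assign


def match_and_sync(ego_sigs, top_sigs, max_delay):
    """One pass of the two steps: estimate a best delay and score per
    (ego, top) pair, then solve the assignment with the spectral relaxation."""
    n_e, n_t = len(ego_sigs), len(top_sigs)
    sim = np.zeros((n_e, n_t))
    delays = np.zeros((n_e, n_t), dtype=int)
    for i in range(n_e):
        for a in range(n_t):
            delays[i, a], sim[i, a] = estimate_delay(ego_sigs[i], top_sigs[a], max_delay)
    assign = spectral_match(np.maximum(sim, 0.0))
    return assign, {i: int(delays[i, a]) for i, a in assign.items()}


# Toy usage with synthetic 1-D signals: 3 egocentric videos drawn from
# 4 top-view tracks, each shifted by 7 frames and lightly corrupted by noise.
rng = np.random.default_rng(0)
top = [rng.standard_normal(200) for _ in range(4)]
ego = [np.roll(top[a], 7) + 0.1 * rng.standard_normal(200) for a in (2, 0, 3)]
print(match_and_sync(ego, top, max_delay=15))
```

In the paper's formulation the assignment and the delays are sought jointly and refined in an alternating fashion; the single pass above only illustrates how a correlation-based delay search and a spectral assignment step could plug into such a loop.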