APES: Audiovisual Person Search in Untrimmed Video

06/03/2021
by Juan Leon Alcazar, et al.

Humans are arguably among the most important subjects in video streams, and many real-world applications, such as video summarization and video editing workflows, often require the automatic search and retrieval of a person of interest. Despite tremendous efforts in the person re-identification and retrieval domains, few works have developed audiovisual search strategies. In this paper, we present the Audiovisual Person Search dataset (APES), a new dataset composed of untrimmed videos whose audio (voices) and visual (faces) streams are densely annotated. APES contains over 1.9K identities labeled across 36 hours of video, making it the largest dataset available for untrimmed audiovisual person search. A key property of APES is that it includes dense temporal annotations that link faces to speech segments of the same identity. To showcase the potential of our new dataset, we propose an audiovisual baseline and benchmark for person retrieval. Our study shows that modeling audiovisual cues benefits the recognition of people's identities. To enable reproducibility and promote future research, the dataset annotations and baseline code are available at: https://github.com/fuankarion/audiovisual-person-search
