Active Speakers in Context

05/20/2020
by Juan Leon Alcazar, et al.

Current methods for active speaker detection focus on modeling short-term audio-visual information from a single speaker. Although this strategy can be enough for single-speaker scenarios, it prevents accurate detection when the task is to identify which of many candidate speakers is talking. This paper introduces the Active Speaker Context, a novel representation that models relationships between multiple speakers over long time horizons. Our Active Speaker Context is designed to learn pairwise and temporal relations from a structured ensemble of audio-visual observations. Our experiments show that a structured feature ensemble already benefits active speaker detection performance. Moreover, we find that the proposed Active Speaker Context improves the state of the art on the AVA-ActiveSpeaker dataset, achieving a mAP of 87.1 as a consequence of our long-term multi-speaker analysis.
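
As a rough illustration of what a "structured ensemble" of observations could look like, the sketch below stacks per-speaker feature vectors into a time-by-speaker grid with the reference speaker first, then scores each speaker against the reference at every time step. This is not the paper's implementation: the function names, the speaker ordering, the toy two-dimensional features, and the use of a plain dot product as the pairwise score are all assumptions made for illustration only.

```python
from typing import Dict, List

Vector = List[float]


def build_context(features: Dict[str, List[Vector]], reference: str) -> List[List[Vector]]:
    """Stack per-speaker features into a [time][speaker] grid.

    Hypothetical layout: the reference speaker is placed at index 0 and
    the remaining speakers follow in sorted order.
    """
    order = [reference] + sorted(s for s in features if s != reference)
    steps = len(features[reference])
    if any(len(features[s]) != steps for s in order):
        raise ValueError("all speakers need the same number of time steps")
    return [[features[s][t] for s in order] for t in range(steps)]


def pairwise_scores(context: List[List[Vector]]) -> List[List[float]]:
    """Dot product between the reference (index 0) and every speaker, per step.

    A dot product stands in for a learned pairwise relation here.
    """
    return [
        [sum(a * b for a, b in zip(row[0], other)) for other in row]
        for row in context
    ]


# Toy features: two speakers, two time steps, two dimensions each.
features = {
    "spk_a": [[1.0, 0.0], [0.5, 0.5]],
    "spk_b": [[0.0, 1.0], [1.0, 1.0]],
}
context = build_context(features, reference="spk_b")
scores = pairwise_scores(context)
```

In a learned model, the fixed dot product would be replaced by trainable pairwise and temporal modules operating over a grid of this kind. The sketch only shows how such a grid might be arranged.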

research
01/19/2023

LoCoNet: Long-Short Context Network for Active Speaker Detection

Active Speaker Detection (ASD) aims to identify who is speaking in each ...
research
06/07/2021

How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild

Successful active speaker detection requires a three-stage pipeline: (i)...
research
01/11/2021

MAAS: Multi-modal Assignation for Active Speaker Detection

Active speaker detection requires a solid integration of multi-modal cue...
research
08/05/2021

UniCon: Unified Context Network for Robust Active Speaker Detection

We introduce a new efficient framework, the Unified Context Network (Uni...
research
03/08/2023

A Light Weight Model for Active Speaker Detection

Active speaker detection is a challenging task in audio-visual scenario ...
research
06/13/2023

Speaker Verification Across Ages: Investigating Deep Speaker Embedding Sensitivity to Age Mismatch in Enrollment and Test Speech

In this paper, we study the impact of the ageing on modern deep speaker ...
research
05/23/2018

Modeling Interpersonal Influence of Verbal Behavior in Couples Therapy Dyadic Interactions

Dyadic interactions among humans are marked by speakers continuously inf...
