Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)

06/25/2019
by Joon Son Chung, et al.

This report describes our submission to the ActivityNet Challenge at CVPR 2019. We use a 3D convolutional neural network (CNN) front-end and an ensemble of temporal convolution and LSTM classifiers to predict whether a visible person is speaking. Our results show significant improvements over the baseline on the AVA-ActiveSpeaker dataset.
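
The abstract does not say how the temporal convolution and LSTM classifiers are combined, so the following is only a minimal sketch of one common way to ensemble two classifiers: averaging their per-frame speaking scores and thresholding the result. The function names, the averaging rule, the 0.5 threshold, and the example scores are all assumptions made for illustration, not details taken from the report.

```python
from typing import List


def ensemble_scores(tcn_scores: List[float], lstm_scores: List[float]) -> List[float]:
    """Average per-frame speaking scores from two classifiers.

    Simple averaging is an assumed fusion rule; the report's abstract
    does not state how the ensemble is formed.
    """
    if len(tcn_scores) != len(lstm_scores):
        raise ValueError("score sequences must cover the same frames")
    return [(a + b) / 2.0 for a, b in zip(tcn_scores, lstm_scores)]


def is_speaking(scores: List[float], threshold: float = 0.5) -> List[bool]:
    """Turn fused scores into per-frame speaking / not-speaking labels.

    The 0.5 threshold is a hypothetical default, not a value from the report.
    """
    return [s >= threshold for s in scores]


# Hypothetical per-frame scores for one visible face track (three frames).
tcn = [0.75, 0.25, 0.125]   # scores from a temporal convolution classifier
lstm = [0.5, 1.0, 0.375]    # scores from an LSTM classifier

fused = ensemble_scores(tcn, lstm)
labels = is_speaking(fused)
```

Averaging lets one classifier's confident score outweigh the other's uncertain one on a frame-by-frame basis; other fusion rules, such as weighted averaging, would slot into `ensemble_scores` in the same way.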
