The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis

03/04/2023
by Haoxu Wang, et al.

This paper further explores our previous wake word spotting system, which ranked 2nd in Track 1 of the MISP Challenge 2021. First, we investigate a robust unimodal approach based on 3D and 2D convolution and adopt the simple attention module (SimAM) to improve performance. Second, we explore different combinations of data augmentation methods. Finally, we study fusion strategies, including score-level, cascaded, and neural fusion. Our proposed multimodal system uses complementary visual information to mitigate the performance degradation of audio-only systems in complex acoustic scenarios. Our system obtains a false reject rate of 2.15% on the evaluation set of the competition database, achieving new state-of-the-art performance with a 21% improvement over previous systems.
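
To make the score-level fusion and the false reject rate mentioned above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the system described in the paper: the function names, the equal fusion weight, the 0.5 threshold, and the toy scores are all hypothetical.

```python
def fuse_scores(audio_score, video_score, audio_weight=0.5):
    """Score-level fusion: weighted average of two wake word scores.

    audio_weight is an assumed value; the abstract does not state one.
    """
    return audio_weight * audio_score + (1.0 - audio_weight) * video_score


def false_reject_rate(scores, labels, threshold):
    """Fraction of wake word samples (label 1) scored below the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    return sum(1 for s in positives if s < threshold) / len(positives)


def false_alarm_rate(scores, labels, threshold):
    """Fraction of non-wake-word samples (label 0) scored at or above the threshold."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    return sum(1 for s in negatives if s >= threshold) / len(negatives)


# Toy data (made up): two wake word samples followed by two negatives.
labels = [1, 1, 0, 0]
audio = [0.9, 0.2, 0.1, 0.6]   # audio-only scores, e.g. degraded by noise
video = [0.8, 0.9, 0.2, 0.1]   # video-only scores
fused = [fuse_scores(a, v) for a, v in zip(audio, video)]

audio_frr = false_reject_rate(audio, labels, 0.5)
fused_frr = false_reject_rate(fused, labels, 0.5)
fused_far = false_alarm_rate(fused, labels, 0.5)
```

On this toy data the audio-only scores reject one of the two wake word samples, while the fused scores reject none, which is the kind of complementarity the abstract attributes to the visual stream.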

