An Integrated Framework for Two-pass Personalized Voice Trigger

06/30/2021
by Dexin Liao, et al.

In this paper, we present the XMUSPEECH system for Task 1 of the 2020 Personalized Voice Trigger Challenge (PVTC2020). Task 1 addresses joint wake-up word detection and speaker verification on close-talking data. The whole system consists of a keyword spotting (KWS) sub-system and a speaker verification (SV) sub-system. For the KWS sub-system, we apply a Temporal Depthwise Separable Convolution Residual Network (TDSC-ResNet) to improve performance. For the SV sub-system, we propose a multi-task learning network in which the phonetic branch is trained with the character labels of the utterance and the speaker branch is trained with the speaker labels. The phonetic branch is optimized with connectionist temporal classification (CTC) loss and serves as an auxiliary module for the speaker branch. Experiments show that our system achieves significant improvements over the baseline system.
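The two-pass structure described above, a KWS sub-system followed by an SV sub-system, can be illustrated with a minimal decision sketch. It is not the XMUSPEECH implementation. The function names, the cosine-similarity scoring, and both threshold values are hypothetical and were chosen only for illustration; the abstract does not specify how the two sub-systems' scores are combined.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # degenerate embedding: treat as no match
    return dot / (norm_a * norm_b)


def two_pass_trigger(kws_score, enrolled_emb, test_emb,
                     kws_threshold=0.5, sv_threshold=0.6):
    """Return True only if both passes accept the utterance.

    Pass 1 (KWS): the wake-up word score must reach kws_threshold.
    Pass 2 (SV): the test speaker embedding must be close enough to the
    enrolled speaker embedding. Cosine scoring and both thresholds are
    assumptions made for this sketch, not values from the paper.
    """
    if kws_score < kws_threshold:
        return False  # no wake-up word detected, skip the SV pass
    return cosine_similarity(enrolled_emb, test_emb) >= sv_threshold
```

In this sketch the SV pass runs only when the KWS pass fires, so the second model is evaluated on a small fraction of the audio; whether the actual system gates its passes this way is not stated in the abstract.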

research
01/26/2020

Multi-task Learning for Speaker Verification and Voice Trigger Detection

Automatic speech transcription and speaker recognition are usually treat...
research
02/26/2021

The NPU System for the 2020 Personalized Voice Trigger Challenge

This paper describes the system developed by the NPU team for the 2020 p...
research
05/08/2020

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

Keyword spotting (KWS) and speaker verification (SV) have been studied i...
research
03/31/2022

Learning Decoupling Features Through Orthogonality Regularization

Keyword spotting (KWS) and speaker verification (SV) are two important t...
research
08/17/2020

WSRNet: Joint Spotting and Recognition of Handwritten Words

In this work, we present a unified model that can handle both Keyword Sp...
research
09/05/2021

Efficient Attention Branch Network with Combined Loss Function for Automatic Speaker Verification Spoof Detection

Many endeavors have sought to develop countermeasure techniques as enhan...
research
06/25/2019

Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)

This report describes our submission to the ActivityNet Challenge at CVP...
