Multi-task Recurrent Model for Speech and Speaker Recognition

03/31/2016
by Zhiyuan Tang, et al.

Although highly correlated, speech and speaker recognition have long been treated as two independent tasks and studied by two separate communities. This is certainly not how people behave: we decipher speech content and speaker traits at the same time. This paper presents a unified model that performs speech and speaker recognition simultaneously. The model is based on a unified neural network in which the output of one task is fed to the input of the other, leading to a multi-task recurrent network. Experiments show that the joint model outperforms the task-specific models on both tasks.
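
The cross-feeding structure is easy to picture in code. Below is a minimal sketch, not the authors' implementation, of a multi-task recurrent network in which each task's recurrent state is fed into the other task's input at every frame. It assumes PyTorch; the module names, dimensions, and the choice of GRU cells are all illustrative assumptions.

```python
# Sketch (assumed PyTorch, hypothetical names): two recurrent branches,
# one per task, where each branch's previous hidden state is concatenated
# onto the other branch's input at every time step.
import torch
import torch.nn as nn

class MultiTaskRecurrent(nn.Module):
    def __init__(self, feat_dim, hidden_dim, n_phones, n_speakers):
        super().__init__()
        # Each branch sees the acoustic features plus the other task's
        # previous hidden state.
        self.asr_cell = nn.GRUCell(feat_dim + hidden_dim, hidden_dim)
        self.spk_cell = nn.GRUCell(feat_dim + hidden_dim, hidden_dim)
        self.asr_out = nn.Linear(hidden_dim, n_phones)    # speech content
        self.spk_out = nn.Linear(hidden_dim, n_speakers)  # speaker identity

    def forward(self, feats):  # feats: (T, B, feat_dim)
        T, B, _ = feats.shape
        h_asr = feats.new_zeros(B, self.asr_cell.hidden_size)
        h_spk = feats.new_zeros(B, self.spk_cell.hidden_size)
        asr_logits, spk_logits = [], []
        for t in range(T):
            x = feats[t]
            h_asr_prev, h_spk_prev = h_asr, h_spk
            # Cross-feeding: each task conditions on the other's state.
            h_asr = self.asr_cell(torch.cat([x, h_spk_prev], dim=-1), h_asr_prev)
            h_spk = self.spk_cell(torch.cat([x, h_asr_prev], dim=-1), h_spk_prev)
            asr_logits.append(self.asr_out(h_asr))
            spk_logits.append(self.spk_out(h_spk))
        return torch.stack(asr_logits), torch.stack(spk_logits)
```

Feeding the previous step's states rather than the current ones keeps the two branches symmetric and avoids a circular dependency within a single time step; this is one plausible reading of the cross-feeding idea, not a statement of the paper's exact wiring.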


Related research

02/24/2023
Multi-task learning of speech and speaker recognition
We study multi-task learning for two orthogonal speech technology tasks:...

10/07/2020
Domain Adversarial Neural Networks for Dysarthric Speech Recognition
Speech recognition systems have improved dramatically over the last few ...

01/26/2020
Multi-task Learning for Speaker Verification and Voice Trigger Detection
Automatic speech transcription and speaker recognition are usually treat...

02/27/2018
Deep factorization for speech signal
Various informative factors mixed in speech signals, leading to great di...

10/27/2020
Deep generative factorization for speech signal
Various information factors are blended in speech signals, which forms t...

04/07/2023
Interpretable Unified Language Checking
Despite recent concerns about undesirable behaviors generated by large l...

10/29/2019
On Investigation of Unsupervised Speech Factorization Based on Normalization Flow
Speech signals are complex composites of various information, including ...
