Identify Speakers in Cocktail Parties with End-to-End Attention

05/22/2020
by Junzhe Zhu, et al.
In scenarios where multiple speakers talk at the same time, it is important to be able to identify the talkers accurately. This paper presents an end-to-end system that integrates speech source extraction and speaker identification, and proposes a new way to jointly optimize these two parts by max-pooling the speaker predictions along the channel dimension. Residual attention permits us to learn spectrogram masks that are optimized for the purpose of speaker identification, while residual forward connections permit dilated convolution with a sufficiently large context window to guarantee correct streaming across syllable boundaries. End-to-end training results in a system that recognizes one speaker in a two-speaker broadcast speech mixture with 99.9% accuracy, and all speakers in three-speaker scenarios with 81.2% accuracy.
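The max-pooling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the architecture's details, so the following are assumptions made only for illustration: each extracted channel yields one probability per enrolled speaker, the channel outputs are combined by an element-wise maximum, and the top-scoring speakers are reported. The function names and the numbers are hypothetical and do not come from the paper.

```python
from typing import List


def max_pool_over_channels(channel_probs: List[List[float]]) -> List[float]:
    """Take the element-wise maximum over channels for every speaker.

    channel_probs[c][s] is the (assumed) probability that speaker s is
    active in extracted channel c.
    """
    if not channel_probs:
        raise ValueError("need at least one channel")
    num_speakers = len(channel_probs[0])
    if any(len(row) != num_speakers for row in channel_probs):
        raise ValueError("all channels must score the same set of speakers")
    return [max(row[s] for row in channel_probs) for s in range(num_speakers)]


def top_k_speakers(pooled: List[float], k: int) -> List[int]:
    """Return the indices of the k highest pooled scores, in ascending order."""
    ranked = sorted(range(len(pooled)), key=lambda s: pooled[s], reverse=True)
    return sorted(ranked[:k])


# Hypothetical scores: 2 extracted channels, 4 enrolled speakers.
channel_probs = [
    [0.05, 0.90, 0.10, 0.02],  # channel 0 mostly contains speaker 1
    [0.03, 0.08, 0.07, 0.85],  # channel 1 mostly contains speaker 3
]

pooled = max_pool_over_channels(channel_probs)
predicted = top_k_speakers(pooled, k=2)
print(pooled, predicted)
```

Because the maximum does not depend on the order of the channels, a loss computed on the pooled vector does not need to know which channel holds which speaker. That property is one plausible reason to pool this way, though the abstract itself does not state the motivation.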

research
05/20/2020

End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors

End-to-end speaker diarization for an unknown number of speakers is addr...
research
07/09/2019

Multi-Speaker End-to-End Speech Synthesis

In this work, we extend ClariNet (Ping et al., 2019), a fully end-to-end...
research
08/20/2020

Dyadic Speech-based Affect Recognition using DAMI-P2C Parent-child Multimodal Interaction Dataset

Automatic speech-based affect recognition of individuals in dyadic conve...
research
12/18/2020

End-to-End Speaker Diarization as Post-Processing

This paper investigates the utilization of an end-to-end diarization mod...
research
09/18/2019

RTTD-ID: Tracked Captions with Multiple Speakers for Deaf Students

Students who are deaf and hard of hearing cannot hear in class and do no...
research
12/01/2017

Speaker identification from the sound of the human breath

This paper examines the speaker identification potential of breath sound...
research
06/19/2019

Large-Scale Speaker Diarization of Radio Broadcast Archives

This paper describes our initial efforts to build a large-scale speaker ...
