Language Aided Speaker Diarization Using Speaker Role Information

11/18/2019
by Nikolaos Flemotomos, et al.

Speaker diarization relies on the assumption that acoustic embeddings from speech segments corresponding to a particular speaker share common characteristics. They are therefore concentrated in a specific region of the speaker space, a region which represents that speaker's identity. Those identities, however, are not known a priori, so a clustering algorithm is employed, which is typically based solely on audio. In this work we explore conversational scenarios where the speakers play distinct roles and are expected to follow different linguistic patterns. We aim to exploit this linguistic variability and build a language-based segmenter and a role recognizer which can be used to construct the speaker identities. In this way, we are able to boost diarization performance by converting the clustering task into a classification one. The proposed method is applied to real-world dyadic psychotherapy interactions between a provider and a patient and is shown to yield improved results.
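To make the clustering-to-classification idea concrete, here is a minimal sketch in Python. The abstract does not say how the speaker identities are constructed or how segments are scored, so everything below is an assumption for illustration only: the function names are hypothetical, each role identity is taken to be the normalized mean of the embeddings of segments that a text-based role recognizer has already labeled, and each audio segment is assigned to the role with the highest cosine similarity.

```python
import numpy as np


def l2_normalize(x):
    """Scale vectors along the last axis to unit length."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def build_role_profiles(embeddings, role_labels):
    """Build one profile per role from segments labeled via text.

    Assumption: a profile is the normalized mean of the unit-length
    embeddings that share a role label (not confirmed by the abstract).
    """
    embeddings = l2_normalize(embeddings)
    role_labels = np.asarray(role_labels)
    profiles = {}
    for role in np.unique(role_labels):
        mean_vec = embeddings[role_labels == role].mean(axis=0)
        profiles[str(role)] = l2_normalize(mean_vec)
    return profiles


def classify_segments(embeddings, profiles):
    """Assign each segment to the role whose profile is most similar.

    Assumption: cosine similarity is used as the score; this replaces
    the usual clustering step with a nearest-profile classification.
    """
    roles = list(profiles)
    profile_matrix = np.stack([profiles[r] for r in roles])
    scores = l2_normalize(embeddings) @ profile_matrix.T
    return [roles[i] for i in scores.argmax(axis=1)]
```

In use, `build_role_profiles` would be called once per session on the embeddings of the segments labeled as "provider" or "patient" from their transcripts, and `classify_segments` would then label all audio segments of that session. No number of speakers has to be estimated, because the set of roles is fixed in advance.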


research
11/18/2019

Linguistically Aided Speaker Diarization Using Speaker Role Information

Speaker diarization relies on the assumption that speech segments corres...
research
04/01/2022

Multimodal Clustering with Role Induced Constraints for Speaker Diarization

Speaker clustering is an essential step in conventional speaker diarizat...
research
05/30/2023

Language-independent speaker anonymization using orthogonal Householder neural network

Speaker anonymization aims to conceal a speaker's identity while preserv...
research
06/03/2019

Problem-Agnostic Speech Embeddings for Multi-Speaker Text-to-Speech with SampleRNN

Text-to-speech (TTS) acoustic models map linguistic features into an aco...
research
12/18/2018

Constrained speaker diarization of TV series based on visual patterns

Speaker diarization, usually denoted as the 'who spoke when' task, turns...
research
07/19/2023

An analysis on the effects of speaker embedding choice in non auto-regressive TTS

In this paper we introduce a first attempt on understanding how a non-au...
research
05/21/2018

Speaker Clustering Using Dominant Sets

Speaker clustering is the task of forming speaker-specific groups based ...
