Meta-learning for robust child-adult classification from speech

10/24/2019
by   Nithin Rao Koluguri, et al.

Computational modeling of naturalistic conversations in clinical applications has seen growing interest in the past decade. An important use case involves child-adult interactions within the autism diagnosis and intervention domain. In this paper, we address a specific sub-problem of speaker diarization, namely child-adult speaker classification in such dyadic conversations with specified roles. Training a speaker classification system that is robust to speaker and channel conditions is challenging because of the inherent variability in the speech of both the children and their adult interlocutors. We propose the use of meta-learning, in particular prototypical networks (protonets), which optimize a metric space across multiple tasks. By modeling every child-adult pair in the training set as a separate task during meta-training, we learn a representation that generalizes better than one learned with conventional supervised learning. We demonstrate improvements over state-of-the-art speaker embeddings (x-vectors) under two evaluation settings: weakly supervised classification (up to 14.53% relative improvement in F1-score) and clustering (up to 9.66% relative improvement in cluster purity). Our results show that protonets can potentially extract robust speaker embeddings for child-adult classification from speech.
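To make the idea of treating each child-adult pair as its own task more concrete, the following is a minimal sketch of a single prototypical-network episode in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the embeddings are synthetic Gaussian vectors standing in for the output of a speech encoder, and the names `prototypes` and `episode_loss`, the 8-dimensional embedding size, and the 5-support/5-query split are all hypothetical choices. The abstract does not specify the encoder or the distance function; squared Euclidean distance is assumed here because it is the usual choice for prototypical networks.

```python
import numpy as np


def prototypes(support_emb, support_labels, n_classes=2):
    """Return one prototype per class: the mean of its support embeddings.

    In this sketch, label 0 stands for the child and 1 for the adult.
    """
    return np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in range(n_classes)]
    )


def episode_loss(support_emb, support_labels, query_emb, query_labels):
    """Loss and predictions for one episode (one child-adult pair).

    Queries are scored by a softmax over negative squared Euclidean
    distances to the class prototypes (an assumed distance function).
    """
    protos = prototypes(support_emb, support_labels)
    # Pairwise squared distances, shape (n_query, n_classes).
    diff = query_emb[:, None, :] - protos[None, :, :]
    dists = (diff ** 2).sum(axis=-1)
    # Numerically stable log-softmax over negative distances.
    logits = -dists
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Mean negative log-likelihood of the true class of each query.
    loss = -log_probs[np.arange(len(query_labels)), query_labels].mean()
    preds = log_probs.argmax(axis=1)
    return loss, preds


# Toy episode: synthetic embeddings for one hypothetical child-adult pair.
rng = np.random.default_rng(0)
child = rng.normal(loc=-2.0, size=(10, 8))
adult = rng.normal(loc=2.0, size=(10, 8))

support_emb = np.concatenate([child[:5], adult[:5]])
support_labels = np.array([0] * 5 + [1] * 5)
query_emb = np.concatenate([child[5:], adult[5:]])
query_labels = np.array([0] * 5 + [1] * 5)

loss, preds = episode_loss(support_emb, support_labels, query_emb, query_labels)
print("episode loss:", loss)
print("predictions:", preds)
```

In meta-training, a loss of this form would be computed for many episodes, each drawn from a different child-adult pair, and backpropagated through the encoder that produces the embeddings. The sketch omits the encoder and the gradient step and only shows how prototypes turn a few labeled segments into a classifier for the remaining ones.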


