Leveraging Speaker Embeddings with Adversarial Multi-task Learning for Age Group Classification

01/22/2023
by Kwangje Baeg, et al.

Neural network-based speaker embeddings are now widely used in speaker recognition to identify speakers accurately. However, speaker-discriminative embeddings do not always represent other speech attributes, such as age group, well. When an embedding model has been trained heavily to capture speaker traits, age group classification comes closer to exploiting information that leaks into the embedding than to using features the model was trained to encode. Hence, to improve age group classification performance, we consider speaker-discriminative embeddings derived from adversarial multi-task learning, which aligns features and reduces the domain discrepancy among age subgroups. In addition, we investigate different types of speaker embeddings for learning and generalizing domain-invariant representations of age groups. Experimental results on the VoxCeleb Enrichment dataset verify the effectiveness of the proposed adaptive adversarial network in multi-objective scenarios, and of leveraging speaker embeddings for the domain adaptation task.
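The abstract does not say how the adversarial branch is trained, so the following is only a minimal sketch of one common way to realize adversarial multi-task learning: gradient reversal. In that setup a shared encoder feeds a main-task head and an adversarial head. The main-task gradient flows back unchanged, while the adversarial head's gradient is negated and scaled before it reaches the encoder. The function names, the `gamma` value, and the ramp-up schedule are assumptions made for illustration and are not taken from the paper. Which attribute plays the adversarial role is also left open here.

```python
import math


def grl_lambda(progress, gamma=10.0):
    """Reversal strength that ramps smoothly from 0 toward 1.

    `progress` is the fraction of training completed, in [0, 1].
    `gamma` is a hypothetical steepness setting, not a value from the paper.
    """
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


def shared_encoder_gradient(task_grad, adversary_grad, lam):
    """Gradient that reaches the shared encoder under gradient reversal.

    The main-task gradient is kept as is. The adversarial head's gradient
    is multiplied by -lam, so a descent step on the result pushes the
    encoder to make the adversarial head's job harder.
    """
    return [g_t - lam * g_a for g_t, g_a in zip(task_grad, adversary_grad)]


# Toy per-parameter gradients (made-up numbers, for illustration only).
task_grad = [1.0, 2.0]
adversary_grad = [0.5, -1.0]

lam = grl_lambda(progress=0.0)  # no reversal at the very start of training
start = shared_encoder_gradient(task_grad, adversary_grad, lam)

half = shared_encoder_gradient(task_grad, adversary_grad, 0.5)
```

Ramping the reversal strength up from zero is a design choice that lets the encoder first learn useful features before the adversarial signal starts to reshape them. In a real framework the same effect is usually obtained with a custom autograd operation rather than by combining gradients by hand.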

research
10/27/2020

Leveraging speaker attribute information using multi task learning for speaker verification and diarization

Deep speaker embeddings have become the leading method for encoding spea...
research
10/19/2022

Speaker- and Age-Invariant Training for Child Acoustic Modeling Using Adversarial Multi-Task Learning

One of the major challenges in acoustic modelling of child speech is the...
research
04/13/2018

Speaker Embedding Extraction with Phonetic Information

Speaker embeddings achieve promising results on many speaker verificatio...
research
11/15/2017

Human and Machine Speaker Recognition Based on Short Trivial Events

Trivial events are ubiquitous in human to human conversations, e.g., cou...
research
10/25/2019

Learning Domain Invariant Representations for Child-Adult Classification from Speech

Diagnostic procedures for ASD (autism spectrum disorder) involve semi-na...
research
12/09/2018

To Reverse the Gradient or Not: An Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition

Transcribed datasets typically contain speaker identity for each instanc...
research
10/24/2021

Learning Speaker Representation with Semi-supervised Learning approach for Speaker Profiling

Speaker profiling, which aims to estimate speaker characteristics such a...
