
Speaker Clustering With Neural Networks And Audio Processing

03/22/2018 ∙ by Maxime Jumelle, et al.

Speaker clustering is the task of differentiating speakers in a recording; in effect, the aim is to answer "who spoke when" in audio recordings. A common approach in industry extracts MFCC features directly from the recording and models them with well-known techniques such as Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM). In this paper, we study neural networks (especially CNNs) followed by clustering and audio processing, with the aim of reaching accuracy similar to that of state-of-the-art methods.
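As a rough illustration of the "features, then clustering" idea described above, the following Python sketch clusters frame-level feature vectors and merges the resulting labels into "who spoke when" segments. It is not the method from the paper: the 13-dimensional vectors are synthetic stand-ins for MFCC frames, plain k-means is swapped in for the GMM/HMM or CNN-based models, and the names `kmeans`, `labels_to_segments`, and the 10 ms frame length are assumptions made only for this example.

```python
import numpy as np


def kmeans(features, k, n_iter=50):
    """Plain k-means (a simple stand-in for GMM/HMM); one label per frame."""
    # Deterministic farthest-point initialisation: start from frame 0, then
    # repeatedly add the frame farthest from the centroids chosen so far.
    centroids = [features[0]]
    for _ in range(1, k):
        dists = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centroids], axis=0
        )
        centroids.append(features[np.argmax(dists)])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        # Distance of every frame to every centroid, shape (n_frames, k).
        d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        new_centroids = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j)
            else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels


def labels_to_segments(labels, frame_seconds):
    """Merge runs of identical frame labels into (start, end, speaker)."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append(
                (start * frame_seconds, i * frame_seconds, int(labels[start]))
            )
            start = i
    return segments


# Synthetic data (assumption): two "speakers" with well-separated features,
# speaking in the order A, B, A, with 100 frames per turn.
rng = np.random.default_rng(0)
speaker_a = rng.normal(loc=0.0, scale=0.5, size=(200, 13))
speaker_b = rng.normal(loc=5.0, scale=0.5, size=(100, 13))
features = np.vstack([speaker_a[:100], speaker_b, speaker_a[100:]])

labels = kmeans(features, k=2)
segments = labels_to_segments(labels, frame_seconds=0.01)
for start, end, speaker in segments:
    print(f"{start:.2f}s - {end:.2f}s: speaker {speaker}")
```

On real recordings the feature vectors would come from an audio front end, and the number of speakers is usually unknown, so a fixed `k` is a simplification.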

Maxime Jumelle
Taqiyeddine Sakmeche
