BER: Balanced Error Rate For Speaker Diarization

11/08/2022
by Tao Liu, et al.

Diarization error rate (DER) is the primary metric for evaluating diarization performance, but it faces a dilemma: errors in short utterances or segments tend to be overwhelmed by errors in longer ones. Short segments, e.g., "yes" or "no," still carry semantic information. DER also overlooks errors for speakers who talk less. Although the Jaccard error rate (JER) balances errors across speakers, it suffers from the same dilemma. Because duration error, segment error, and speaker-weighted error together constitute a complete diarization evaluation, we propose a Balanced Error Rate (BER) to evaluate speaker diarization. First, we propose a segment-level error rate (SER) that uses connected sub-graphs and an adaptive IoU threshold to obtain accurate segment matching. Second, to evaluate diarization in a unified way, we take a speaker-specific harmonic mean of the duration and segment errors, followed by a speaker-weighted average. Third, we analyze our metric using a modularized system, EEND, and a multi-modal method on real datasets. SER and BER are publicly available at https://github.com/X-LANCE/BER.
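To make the combination step concrete, here is a minimal Python sketch of the two ideas named in the abstract: an intersection-over-union (IoU) score between two time segments, and a per-speaker harmonic mean of duration and segment error followed by an average over speakers. It is an illustration under stated assumptions, not the released implementation. The function names, the input layout, the equal weighting of speakers, and the handling of zero errors are all hypothetical; the connected sub-graph matching and the adaptive IoU threshold are not reproduced here.

```python
from statistics import mean


def interval_iou(a, b):
    """IoU of two time intervals, each given as (start, end) in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def harmonic_mean(x, y):
    """Harmonic mean of two non-negative error rates.

    Assumption: returns 0.0 when both inputs are 0, to avoid dividing by zero.
    """
    return 2 * x * y / (x + y) if (x + y) > 0 else 0.0


def balanced_error_sketch(per_speaker):
    """Combine per-speaker errors into one score.

    per_speaker maps a speaker label to (duration_error, segment_error).
    Assumption: every speaker counts equally in the final average.
    """
    return mean(harmonic_mean(d, s) for d, s in per_speaker.values())


# Hypothetical error rates for two speakers.
errors = {"spk_A": (0.2, 0.4), "spk_B": (0.5, 0.5)}
score = balanced_error_sketch(errors)
```

One property of this sketch is that a plain harmonic mean is zero whenever either of its inputs is zero, so a speaker with no segment errors contributes nothing even if the duration error is large. A real metric needs to decide how to treat that case; consult the repository linked above for the authors' actual definition.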

