Diarization of Legal Proceedings. Identifying and Transcribing Judicial Speech from Recorded Court Audio

04/03/2021
by   Jeffrey Tumminia, et al.
0

United States Courts make audio recordings of oral arguments available as public record, but these recordings rarely include speaker annotations. This paper addresses the Speech Audio Diarization problem, answering the question of "Who spoke when?" in the domain of judicial oral argument proceedings. We present a workflow for diarizing the speech of judges using audio recordings of oral arguments, a process we call Reference-Dependent Speaker Verification. We utilize a speech embedding network trained with the Generalized End-to-End Loss to encode speech into d-vectors and a pre-defined reference audio library based on annotated data. We find that by encoding reference audio for speakers and full arguments and computing similarity scores we achieve a 13.8 Error Rate for speakers covered by the reference audio library on a held-out test set. We evaluate our method on the Supreme Court of the United States oral arguments, accessed through the Oyez Project, and outline future work for diarizing legal proceedings. A code repository for this research is available at github.com/JeffT13/rd-diarization

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/02/2020

Neural Speaker Diarization with Speaker-Wise Chain Rule

Speaker diarization is an essential step for processing multi-speaker au...
research
09/15/2020

Pardon the Interruption: An Analysis of Gender and Turn-Taking in U.S. Supreme Court Oral Arguments

This study presents a corpus of turn changes between speakers in U.S. Su...
research
05/22/2023

Target Active Speaker Detection with Audio-visual Cues

In active speaker detection (ASD), we would like to detect whether an on...
research
11/02/2022

Towards End-to-end Speaker Diarization in the Wild

Speaker diarization algorithms address the "who spoke when" problem in a...
research
12/26/2021

Bilingual Speech Recognition by Estimating Speaker Geometry from Video Data

Speech recognition is very challenging in student learning environments ...
research
12/10/2019

Advances in Online Audio-Visual Meeting Transcription

This paper describes a system that generates speaker-annotated transcrip...
research
09/14/2023

SpatialCodec: Neural Spatial Speech Coding

In this work, we address the challenge of encoding speech captured by a ...

Please sign up or login with your details

Forgot password? Click here to reset