Better Transcription of UK Supreme Court Hearings

11/29/2022
by Hadeel Saadany, et al.

Transcription of legal proceedings is essential to enabling access to justice. However, speech transcription is an expensive and slow process. In this paper we describe part of a combined research and industrial project to build an automated transcription tool designed specifically for the Justice sector in the UK. We explain the challenges involved in transcribing courtroom hearings and the Natural Language Processing (NLP) techniques we employ to tackle these challenges. We show that fine-tuning a generic off-the-shelf pre-trained Automatic Speech Recognition (ASR) system with an in-domain language model, as well as infusing common phrases extracted with a collocation detection model, not only improves the Word Error Rate (WER) of the transcribed hearings but also avoids critical errors that are specific to the legal jargon and terminology commonly used in British courts.
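
For readers unfamiliar with the metric, WER is conventionally defined as the number of word-level substitutions, deletions, and insertions needed to turn the system output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the evaluation code used in the paper; the function name `word_error_rate`, the lowercase-and-split normalisation, and the example sentences are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Normalisation here (lowercasing, whitespace split) is an assumption;
    real evaluations usually also handle punctuation and numerals.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one legal term misrecognised in a 7-word reference.
reference = "the appeal is dismissed by the court"
hypothesis = "the appeal is dismissed by the caught"
print(f"WER = {word_error_rate(reference, hypothesis):.3f}")
```

Note that a single substitution counts the same whether it hits a function word or a legal term, which is why the abstract treats critical domain-specific errors as a concern separate from the overall WER.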


