Computational Pronunciation Analysis in Sung Utterances

06/21/2021
by Emir Demirel, et al.

Recent automatic lyrics transcription (ALT) approaches focus on building stronger acoustic models or in-domain language models, while pronunciation is seldom addressed. This paper presents a novel computational analysis of the pronunciation variation in sung utterances and proposes a new pronunciation model adapted for singing. The singing-adapted model is tested on multiple public datasets via word recognition experiments. It performs better than the standard speech dictionary in all settings, reporting the best results on ALT in a cappella recordings using n-gram language models. For reproducibility, we share the sentence-level annotations used in testing, providing a new benchmark evaluation set for ALT.
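
Word recognition experiments of this kind are commonly scored with word error rate (WER). The abstract does not state which metric or scoring tool was used, so the following is only a minimal sketch under that assumption. The function name `word_error_rate` and the example sentences are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace tokenization and case-insensitive matching.
    Real ALT evaluations may normalize text differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one deleted word ("me") and one substitution
# ("quiet" -> "quite") against a five-word reference.
score = word_error_rate("sing me a quiet song", "sing a quite song")
```

With a metric like this, a singing-adapted lexicon and a standard speech dictionary could be compared by decoding the same test set with each and comparing the resulting scores against the shared sentence-level annotations.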
