Unsupervised Discovery of Structured Acoustic Tokens with Applications to Spoken Term Detection

11/28/2017
by Cheng-Tao Chung, et al.

In this paper, we compare two paradigms for unsupervised discovery of structured acoustic tokens directly from speech corpora without any human annotation. The Multigranular Paradigm seeks to capture all available information in the corpora with multiple sets of tokens at different model granularities. The Hierarchical Paradigm attempts to jointly learn several levels of signal representations in a hierarchical structure. The two paradigms are unified within a single theoretical framework in this paper. Query-by-Example Spoken Term Detection (QbE-STD) experiments on the QUESST dataset of MediaEval 2015 verify the competitiveness of the acoustic tokens. The Enhanced Relevance Score (ERS) proposed in this work improves both paradigms for the task of QbE-STD. We also report results on the ABX evaluation task of the Zero Resource Challenge 2015 for a comparison of the two paradigms.
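As a rough illustration of the multigranular idea described above, the following minimal sketch fuses per-granularity relevance scores for a query-by-example match. The granularity grid (m states per token, n distinct tokens), the toy token sequences, and the LCS-based per-granularity relevance are illustrative assumptions only; they are not the token models or the Enhanced Relevance Score used in the paper.

```python
# Hypothetical sketch: combine relevance scores from several token-set
# granularities for query-by-example spoken term detection.
from itertools import product

def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def relevance(query_tokens, doc_tokens):
    """Toy per-granularity relevance: LCS length normalized by query length."""
    if not query_tokens:
        return 0.0
    return lcs_length(query_tokens, doc_tokens) / len(query_tokens)

# One decoded token sequence per granularity (m, n); fabricated toy data here.
granularities = list(product([3, 5], [50, 100]))                    # (m states, n tokens)
query = {g: ["a", "b", "c"] for g in granularities}                 # spoken query
document = {g: ["x", "a", "b", "c", "y"] for g in granularities}    # test utterance

# Multigranular fusion: average the relevance scores over all granularities.
score = sum(relevance(query[g], document[g]) for g in granularities) / len(granularities)
print(f"fused relevance score: {score:.3f}")
```

In practice each granularity would contribute a score computed from its own token decoding of the same audio (e.g., by subsequence DTW), and the fusion rule could weight granularities unequally; simple averaging is used here only to keep the sketch self-contained.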


