Identification of Indian Languages using Ghost-VLAD pooling

02/05/2020
by Krishna D N, et al.

In this work, we propose a new pooling strategy for language identification, focusing on Indian languages. The idea is to obtain utterance-level features from variable-length audio for robust language recognition. We use the GhostVLAD approach to generate an utterance-level feature vector for any variable-length input audio by aggregating the local frame-level features across time. The generated feature vector is shown to be highly language-discriminative and helps achieve state-of-the-art results on the language identification task. We conduct our experiments on 635 hours of audio data for 7 Indian languages. Our method outperforms the previous state-of-the-art x-vector [11] method by an absolute improvement of 1.88% in F1-score and achieves 98.43% F1-score on the held-out test data. We compare our system with various pooling approaches and show that GhostVLAD is the best pooling approach for this task. We also provide a visualization of the utterance-level embeddings generated using GhostVLAD pooling and show that this method creates embeddings that have very good language-discriminative features.
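The abstract does not spell out the pooling computation, so the following is only a minimal NumPy sketch of the general GhostVLAD idea, not the authors' implementation. It assumes frame-level features are already available as a (T, D) array, and it uses random, untrained parameters. The function name `ghostvlad_pool`, the argument names, and all sizes are hypothetical and chosen for illustration.

```python
import numpy as np

def ghostvlad_pool(frames, centers, weights, biases, num_ghost):
    """Pool (T, D) frame features into one fixed-length vector.

    centers: (K + G, D) cluster centers, ghost clusters last.
    weights: (D, K + G) and biases: (K + G,) for the soft assignment.
    """
    # Soft-assign every frame to all K + G clusters (softmax over clusters).
    logits = frames @ weights + biases
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    assign = np.exp(logits)
    assign /= assign.sum(axis=1, keepdims=True)

    # Ghost clusters absorb weight in the softmax but are dropped here.
    num_real = centers.shape[0] - num_ghost
    assign = assign[:, :num_real]                 # (T, K)

    # Weighted residual sum per cluster: V_k = sum_t a_tk * (x_t - c_k).
    vlad = assign.T @ frames - assign.sum(axis=0)[:, None] * centers[:num_real]

    # L2-normalise each cluster's residual, then the flattened vector.
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12
    flat = vlad.reshape(-1)
    return flat / (np.linalg.norm(flat) + 1e-12)

# Usage with random (untrained) parameters; all sizes are illustrative.
rng = np.random.default_rng(0)
dim, num_real, num_ghost = 8, 4, 2
total = num_real + num_ghost
centers = rng.normal(size=(total, dim))
weights = rng.normal(size=(dim, total))
biases = np.zeros(total)

short_utt = ghostvlad_pool(rng.normal(size=(50, dim)), centers, weights, biases, num_ghost)
long_utt = ghostvlad_pool(rng.normal(size=(300, dim)), centers, weights, biases, num_ghost)
print(short_utt.shape, long_utt.shape)  # both have num_real * dim entries
```

In this sketch the 50-frame and 300-frame inputs both map to a vector of length `num_real * dim`, which is the property that lets variable-length audio feed a fixed-size classifier. In a trained system the centers and assignment weights would be learned jointly with the frame-level feature extractor.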


