Yerbolat Khassanov

research

∙ 05/25/2023

Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration

This work aims to build a multilingual text-to-speech (TTS) synthesis sy...

Rustem Yeshpanov, et al.

research

∙ 10/28/2022

Improving short-video speech recognition using random utterance concatenation

One of the limitations in end-to-end automatic speech recognition framew...

Haihua Xu, et al.

research

∙ 01/15/2022

KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics

We present an expanded version of our previously released Kazakh text-to...

Saida Mussakhojayeva, et al.

research

∙ 11/26/2021

KazNERD: Kazakh Named Entity Recognition Dataset

We present the development of a dataset for Kazakh named entity recognit...

Rustem Yeshpanov, et al.

research

∙ 10/23/2021

A Study of Multimodal Person Verification Using Audio-Visual-Thermal Data

In this paper, we study an approach to multimodal person verification us...

Madina Abdrakhmanova, et al.

research

∙ 08/03/2021

A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English

We study training a single end-to-end (E2E) automatic speech recognition...

Saida Mussakhojayeva, et al.

research

∙ 07/30/2021

USC: An Open-Source Uzbek Speech Corpus and Initial Speech Recognition Experiments

We present a freely available speech corpus for the Uzbek language and r...

Muhammadjon Musaev, et al.

research

∙ 04/17/2021

KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset

This paper introduces a high-quality open-source speech synthesis datase...

Saida Mussakhojayeva, et al.

research

∙ 12/05/2020

SpeakingFaces: A Large-Scale Multimodal Dataset of Voice Commands with Visual and Thermal Video Streams

We present SpeakingFaces as a publicly-available large-scale multimodal ...

Madina Abdrakhmanova, et al.

research

∙ 10/23/2020

Enriching Under-Represented Named-Entities To Improve Speech Recognition Performance

Automatic speech recognition (ASR) for under-represented named-entity (U...

Tingzhi Mao, et al.

research

∙ 09/22/2020

A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline

We present an open-source speech corpus for the Kazakh language. The Kaz...

Yerbolat Khassanov, et al.

research

∙ 05/21/2020

Leveraging Text Data Using Hybrid Transformer-LSTM Based End-to-End ASR in Transfer Learning

In this work, we study leveraging extra text data to improve low-resourc...

Zhiping Zeng, et al.

research

∙ 05/18/2020

Approaches to Improving Recognition of Underrepresented Named Entities in Hybrid ASR Systems

In this paper, we present a series of complementary approaches to improv...

Tingzhi Mao, et al.

research

∙ 11/25/2019

Independent language modeling architecture for end-to-end ASR

The attention-based end-to-end (E2E) automatic speech recognition (ASR) ...

Van Tung Pham, et al.

research

∙ 04/08/2019

Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data

The lack of code-switch training data is one of the major concerns in th...

Yerbolat Khassanov, et al.

research

∙ 04/08/2019

Enriching Rare Word Representations in Neural Language Models by Embedding Matrix Augmentation

The neural language models (NLM) achieve strong generalization capabilit...

Yerbolat Khassanov, et al.

research

∙ 11/01/2018

On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition

Code-switching (CS) refers to a linguistic phenomenon where a speaker us...

Zhiping Zeng, et al.

research

∙ 06/27/2018

Unsupervised and Efficient Vocabulary Expansion for Recurrent Neural Network Language Models in ASR

In automatic speech recognition (ASR) systems, recurrent neural network ...

Yerbolat Khassanov, et al.
