Hoirin Kim

research

∙ 05/19/2023

Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation

Transformer-based speech self-supervised learning (SSL) models, such as ...

1 Kangwook Jang, et al. ∙

research

∙ 10/26/2022

Deep Metric Learning with Adaptive Margin and Adaptive Scale for Acoustic Word Discrimination

Many recent loss functions in deep metric learning are expressed with lo...

0 Myunghun Jung, et al. ∙

research

∙ 07/01/2022

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning

Large-scale speech self-supervised learning (SSL) has emerged to the mai...

0 Yeonghyeon Lee, et al. ∙

research

∙ 04/04/2022

Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck

Recent advances in sophisticated synthetic speech generated from text-to...

0 Youngsik Eom, et al. ∙

research

∙ 03/30/2022

Asymmetric Proxy Loss for Multi-View Acoustic Word Embeddings

Acoustic word embeddings (AWEs) are discriminative representations of sp...

0 Myunghun Jung, et al. ∙

research

∙ 11/10/2020

Supervised attention for speaker recognition

The recently proposed self-attentive pooling (SAP) has shown good perfor...

0 Seong Min Kye, et al. ∙

research

∙ 11/02/2020

Perceptually Guided End-to-End Text-to-Speech

Several fast text-to-speech (TTS) models have been proposed for real-tim...

0 Yeunju Choi, et al. ∙

research

∙ 10/06/2020

A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments

Speaker verification (SV) has recently attracted considerable research i...

0 Youngmoon Jung, et al. ∙

research

∙ 08/09/2020

Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling

While deep learning has made impressive progress in speech synthesis and...

0 Yeunju Choi, et al. ∙

research

∙ 07/16/2020

Neural MOS Prediction for Synthesized Speech Using Multi-Task Learning With Spoofing Detection and Spoofing Type Classification

Several papers have proposed deep-learning-based models to predict the m...

0 Yeunju Choi, et al. ∙

research

∙ 06/12/2020

Neural voice cloning with a few low-quality samples

In this paper, we explore the possibility of speech synthesis from low q...

0 Sunghee Jung, et al. ∙

research

∙ 05/21/2020

Pitchtron: Towards audiobook generation from ordinary people's voices

In this paper, we explore prosody transfer for audiobook generation unde...

0 Sunghee Jung, et al. ∙

research

∙ 05/08/2020

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

Keyword spotting (KWS) and speaker verification (SV) have been studied i...

0 Myunghun Jung, et al. ∙

research

∙ 04/07/2020

Improving Multi-Scale Aggregation Using Feature Pyramid Module for Robust Speaker Verification of Variable-Duration Utterances

Currently, the most widely used approach for speaker verification is the...

0 Youngmoon Jung, et al. ∙

research

∙ 04/07/2020

Multi-Scale Aggregation Using Feature Pyramid Module for Text-Independent Speaker Verification

Currently, the most widely used approach for speaker verification is the...

0 Youngmoon Jung, et al. ∙

research

∙ 04/06/2020

Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs

In realistic settings, a speaker recognition system needs to identify a ...

1 Seong Min Kye, et al. ∙

research

∙ 02/27/2020

Transductive Few-shot Learning with Meta-Learned Confidence

We propose a novel transductive inference framework for metric-based met...

0 Seong Min Kye, et al. ∙

research

∙ 10/01/2019

Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings

Acoustic word embeddings — fixed-dimensional vector representations of a...

0 Myunghun Jung, et al. ∙

research

∙ 09/26/2019

Self-Adaptive Soft Voice Activity Detection using Deep Neural Networks for Robust Speaker Verification

Voice activity detection (VAD), which classifies frames as speech or non...

0 Youngmoon Jung, et al. ∙

research

∙ 06/19/2019

Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification

In this paper, we propose a new pooling method called spatial pyramid en...

0 Youngmoon Jung, et al. ∙

research

∙ 11/07/2018

Learning acoustic word embeddings with phonetically associated triplet network

Previous researches on acoustic word embeddings used in query-by-example...

0 Hyungjun Lim, et al. ∙
