Erik Visser

research

∙ 09/06/2023

Highly Controllable Diffusion-based Any-to-Any Voice Conversion Model with Frame-level Prosody Feature

We propose a highly controllable voice manipulation system that can perf...

0 Kyungguen Byun, et al. ∙

research

∙ 09/06/2023

Parameter Efficient Audio Captioning With Faithful Guidance Using Audio-text Shared Latent Representation

There has been significant research on developing pretrained transformer...

0 Arvind Krishna Sridhar, et al. ∙

research

∙ 09/06/2023

Detecting False Alarms and Misses in Audio Captions

Metrics to evaluate audio captions simply provide a score without much e...

0 Rehana Mahfuz, et al. ∙

research

∙ 09/06/2023

Stylebook: Content-Dependent Speaking Style Modeling for Any-to-Any Voice Conversion using Only Speech Data

While many recent any-to-any voice conversion models succeed in transfer...

0 Hyungseob Lim, et al. ∙

research

∙ 12/06/2022

Improved Beam Search for Hallucination Mitigation in Abstractive Summarization

Advancement in large pretrained language models has significantly improv...

0 Arvind Krishna Sridhar, et al. ∙

research

∙ 10/29/2022

Application of Knowledge Distillation to Multi-task Speech Representation Learning

Model architectures such as wav2vec 2.0 and HuBERT have been proposed to...

0 Mine Kerpicci, et al. ∙

research

∙ 10/29/2022

Improving Audio Captioning Using Semantic Similarity Metrics

Audio captioning quality metrics which are typically borrowed from the m...

0 Rehana Mahfuz, et al. ∙

research

∙ 09/09/2022

Activity report analysis with automatic single or multispan answer extraction

In the era of IoT (Internet of Things) we are surrounded by a plethora o...

0 Ravi Choudhary, et al. ∙

research

∙ 10/03/2021

Multi-task Voice Activated Framework using Self-supervised Learning

Self-supervised learning methods such as wav2vec 2.0 have shown promisin...

9 Shehzeen Hussain, et al. ∙

research

∙ 03/26/2020

Incremental Learning Algorithm for Sound Event Detection

This paper presents a new learning strategy for the Sound Event Detectio...

0 Eunjeong Koh, et al. ∙
