From Text to Sound: A Preliminary Study on Retrieving Sound Effects to Radio Stories

08/20/2019
by Songwei Ge, et al.

Sound effects play an essential role in producing high-quality radio stories, but adding them is highly labor-intensive. In this paper, we address the problem of automatically adding sound effects to radio stories with a retrieval-based model. However, directly applying a tag-based retrieval model yields many false positives because story content is ambiguous. To solve this problem, we introduce a retrieval-based framework combined with a semantic inference model, which makes retrieval results more robust. Our model relies on carefully designed features extracted from the context of candidate triggers. We collect two story dubbing datasets through crowdsourcing to analyze the setting of adding sound effects and to train and test our proposed methods. We further discuss the importance of each feature and introduce several heuristic rules for trading off precision and recall. Together with text-to-speech technology, our results point to a promising automatic pipeline for producing high-quality radio stories.
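
To make the idea concrete, the following is a minimal sketch of tag-based retrieval followed by a context check on each candidate trigger. It is not the authors' implementation: the abstract does not specify the features or the inference model, so the sound library, the negation cue list, the scoring function, and all names here (`SOUND_LIBRARY`, `NEGATION_CUES`, `find_candidates`, `context_score`, `retrieve`) are assumptions chosen only for illustration.

```python
import re

# Hypothetical tag-to-sound index; tags and file names are invented for illustration.
SOUND_LIBRARY = {
    "door": "door_creak.wav",
    "rain": "rain_loop.wav",
    "dog": "dog_bark.wav",
}

# Hypothetical cue words suggesting the trigger is not an audible event.
NEGATION_CUES = {"no", "not", "never", "without"}


def find_candidates(sentence):
    """Naive tag-based retrieval: every token matching a tag is a candidate trigger."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    candidates = [(i, tok) for i, tok in enumerate(tokens) if tok in SOUND_LIBRARY]
    return candidates, tokens


def context_score(tokens, index, window=3):
    """Toy stand-in for a semantic inference model: score a trigger from its left context."""
    left = tokens[max(0, index - window):index]
    return 0.0 if NEGATION_CUES & set(left) else 1.0


def retrieve(sentence, threshold=0.5):
    """Keep only candidates whose context score reaches the threshold."""
    candidates, tokens = find_candidates(sentence)
    return [
        (tok, SOUND_LIBRARY[tok])
        for i, tok in candidates
        if context_score(tokens, i) >= threshold
    ]
```

In this sketch, raising `threshold` discards more candidates (favoring precision), while lowering it keeps more (favoring recall). For example, `retrieve("There was no rain that day.")` filters out the "rain" trigger that plain tag matching would have accepted.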


