A Comprehensive Survey of Automated Audio Captioning

05/11/2022
by   Xuenan Xu, et al.
0

Automated audio captioning, a task that mimics human perception as well as innovatively links audio processing and natural language processing, has overseen much progress over the last few years. Audio captioning requires recognizing the acoustic scene, primary audio events and sometimes the spatial and temporal relationship between events in an audio clip. It also requires describing these elements by a fluent and vivid sentence. Deep learning-based approaches are widely adopted to tackle this problem. This current paper situates itself as a comprehensive review covering the benchmark datasets, existing deep learning techniques and the evaluation metrics in automated audio captioning.

READ FULL TEXT
research
05/12/2022

Automated Audio Captioning: an Overview of Recent Progress and New Challenges

Automated audio captioning is a cross-modal translation task that aims t...
research
07/01/2020

The NTT DCASE2020 Challenge Task 6 system: Automated Audio Captioning with Keywords and Sentence Length Estimation

This technical report describes the system participating to the Detectio...
research
06/02/2023

Enhance Temporal Relations in Audio Captioning with Sound Event Detection

Automated audio captioning aims at generating natural language descripti...
research
07/08/2022

Automated Audio Captioning and Language-Based Audio Retrieval

This project involved participation in the DCASE 2022 Competition (Task ...
research
02/23/2021

Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning

Automated audio captioning (AAC) aims at generating summarizing descript...
research
08/20/2022

Learning in Audio-visual Context: A Review, Analysis, and New Perspective

Sight and hearing are two senses that play a vital role in human communi...
research
10/03/2022

Text-to-Audio Grounding Based Novel Metric for Evaluating Audio Caption Similarity

Automatic Audio Captioning (AAC) refers to the task of translating an au...

Please sign up or login with your details

Forgot password? Click here to reset