Deep Audio-Visual Learning: A Survey

01/14/2020
by Mandi Luo, et al.

Audio-visual learning, which aims to exploit the relationship between the audio and visual modalities, has drawn considerable attention since deep learning started to be used successfully. Researchers tend to leverage these two modalities either to improve the performance of previously considered single-modality tasks or to address new challenging problems. In this paper, we provide a comprehensive survey of recent developments in audio-visual learning. We divide the current audio-visual learning tasks into four subfields: audio-visual separation and localization, audio-visual correspondence learning, audio-visual generation, and audio-visual representation learning. State-of-the-art methods as well as the remaining challenges of each subfield are further discussed. Finally, we summarize the commonly used datasets and performance metrics.

research
08/20/2022

Learning in Audio-visual Context: A Review, Analysis, and New Perspective

Sight and hearing are two senses that play a vital role in human communi...
research
07/10/2023

A Demand-Driven Perspective on Generative Audio AI

To achieve successful deployment of AI research, it is crucial to unders...
research
08/01/2021

A Survey on Audio Synthesis and Audio-Visual Multimodal Processing

With the development of deep learning and artificial intelligence, audio...
research
02/28/2022

Recent Advances and Challenges in Deep Audio-Visual Correlation Learning

Audio-visual correlation learning aims to capture essential corresponden...
research
08/21/2023

Audio-Visual Class-Incremental Learning

In this paper, we introduce audio-visual class-incremental learning, a c...
research
01/21/2018

Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers

Deep learning has recently seen rapid development and significant attent...
research
07/27/2021

The CORSMAL benchmark for the prediction of the properties of containers

Acoustic and visual sensing can support the contactless estimation of th...
