Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion

11/27/2018
by Suwon Shon, et al.

In this paper, we present a multi-modal online person verification system that uses both speech and visual signals. Inspired by neuroscientific findings on the association between voice and face, we propose an attention-based end-to-end neural network that learns multi-sensory associations for the task of person verification. The attention mechanism in the proposed network learns to conditionally select the salient modality between speech and facial representations, balancing the two complementary inputs. By virtue of this capability, the network is robust to missing or corrupted data from either modality. On the VoxCeleb2 dataset, we show that our method performs favorably against competing multi-modal methods. Even in extreme cases of heavy corruption or an entirely missing modality, our method remains more robust than unimodal methods.
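The attention-based selection between modalities described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes the voice and face embeddings already exist and share one dimension, and that attention scores come from a single linear layer over their concatenation. The names `attention_fuse`, `W`, and `b` are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def attention_fuse(e_voice, e_face, W, b):
    """Fuse two same-size embeddings using attention weights.

    Assumption (not from the paper): scores are a linear function
    of the concatenated embeddings, one score per modality.
    """
    joint = np.concatenate([e_voice, e_face])  # shape (2 * dim,)
    scores = W @ joint + b                     # shape (2,)
    alpha = softmax(scores)                    # weights sum to 1
    fused = alpha[0] * e_voice + alpha[1] * e_face
    return fused, alpha


# Toy inputs standing in for real speaker and face embeddings.
rng = np.random.default_rng(0)
dim = 8
e_voice = rng.standard_normal(dim)
e_face = rng.standard_normal(dim)

# Hypothetical untrained attention parameters.
W = rng.standard_normal((2, 2 * dim)) * 0.1
b = np.zeros(2)

fused, alpha = attention_fuse(e_voice, e_face, W, b)

# Forcing the scores toward the voice branch mimics the case where
# the face input is missing or corrupted: the fused embedding then
# collapses onto the voice embedding.
fused_voice_only, alpha_voice_only = attention_fuse(
    e_voice, e_face, np.zeros((2, 2 * dim)), np.array([50.0, -50.0])
)
```

In a trained network the attention parameters would be learned jointly with the verification objective, so the weighting would depend on the quality of each input rather than on a hand-set bias as in the last call.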


