A Discriminative Vectorial Framework for Multi-modal Feature Representation

03/09/2021
by Lei Gao, et al.

Due to rapid advancements in sensing and computing technology, multi-modal data sources that represent the same pattern or phenomenon have attracted growing attention, and extracting useful information from these sources has quickly become a necessity. In this paper, a discriminative vectorial framework is proposed for multi-modal feature representation in knowledge discovery by employing multi-modal hashing (MH) and discriminative correlation maximization (DCM) analysis. Specifically, the proposed framework jointly minimizes the semantic similarity gap among different modalities via MH and extracts intrinsic discriminative representations across multiple data sources via DCM analysis, enabling a novel vectorial framework for multi-modal feature representation. Moreover, the proposed feature representation strategy is analyzed and further optimized for the canonical and non-canonical cases, respectively. Consequently, the generated feature representation makes effective use of the input data sources, producing improved results in various applications. The effectiveness and generality of the proposed framework are demonstrated with both classical features and deep neural network (DNN) based features on image and multimedia analysis and recognition tasks, including data visualization, face recognition, object recognition, cross-modal (text-image) recognition, and audio emotion recognition. Experimental results show that the proposed solutions outperform state-of-the-art statistical machine learning (SML) and DNN algorithms.
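The DCM analysis described above builds on the correlation-maximization idea of canonical correlation analysis (CCA), which projects two modalities into a shared space where their correlation is maximal. The sketch below shows plain two-view CCA via whitening and an SVD of the cross-covariance; it is a generic illustration under that assumption, not the paper's DCM method, which additionally incorporates class-discriminative information and the MH component.

```python
import numpy as np

def cca(X, Y, reg=1e-6):
    """Classical two-view CCA: find projections Wx, Wy maximizing the
    correlation between X @ Wx and Y @ Wy.

    X: (n, dx) and Y: (n, dy), samples in rows.
    Returns Wx, Wy, and the canonical correlations s (descending).
    Note: a generic CCA sketch, not the paper's DCM analysis.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Regularized covariance and cross-covariance matrices.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whitening transforms: Lx.T @ Cxx @ Lx = I (likewise for Y).
    Lx = np.linalg.cholesky(np.linalg.inv(Cxx))
    Ly = np.linalg.cholesky(np.linalg.inv(Cyy))
    # SVD of the whitened cross-covariance gives the canonical pairs.
    U, s, Vt = np.linalg.svd(Lx.T @ Cxy @ Ly)
    Wx = Lx @ U
    Wy = Ly @ Vt.T
    return Wx, Wy, s

# Two synthetic modalities sharing a 3-dimensional latent signal.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 3))
X = np.hstack([Z, rng.normal(size=(200, 2))])
Y = np.hstack([Z @ rng.normal(size=(3, 3)), rng.normal(size=(200, 1))])
Wx, Wy, s = cca(X, Y)
```

A fused vectorial representation, in the spirit of the framework above, can then be formed by concatenating the leading projected components of each modality, e.g. `np.hstack([X @ Wx[:, :3], Y @ Wy[:, :3]])`.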


