A Multi-Modal Approach to Infer Image Affect

03/13/2018
by Ashok Sundaresan, et al.

The group affect or emotion in an image of people can be inferred by extracting features about both the people in the picture and the overall makeup of the scene. The state of the art on this problem investigates a combination of facial features, scene extraction, and even audio tonality. This paper combines three additional modalities, namely human pose, text-based tagging, and CNN-extracted features/predictions. To the best of our knowledge, this is the first time all of these modalities have been extracted using deep neural networks. We evaluate the performance of our approach against baselines and report insights throughout the paper.
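To make the multi-modal idea concrete, the sketch below shows one common way such per-modality features could be fused for group-affect classification. This is a minimal, hypothetical illustration, not the authors' actual architecture: the class LateFusionAffectClassifier, the 128-dimensional projection size, the three affect classes, and all feature dimensions are assumptions made for the example, under the assumption that each modality (face, scene, pose, text tags, CNN predictions) has already been encoded into a fixed-length vector by its own network.

```python
# Hypothetical sketch of late fusion for group-affect classification.
# Not the paper's implementation; all names and dimensions are illustrative.
import torch
import torch.nn as nn

class LateFusionAffectClassifier(nn.Module):
    def __init__(self, modality_dims, num_classes=3):
        # modality_dims: one feature size per modality
        super().__init__()
        # Project each modality into a common 128-d embedding space
        self.projections = nn.ModuleList(
            nn.Linear(d, 128) for d in modality_dims
        )
        # Classify the concatenated embeddings into affect classes
        # (e.g. negative / neutral / positive group emotion)
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Linear(128 * len(modality_dims), num_classes),
        )

    def forward(self, features):
        # features: list of (batch, dim) tensors, one per modality
        embedded = [proj(f) for proj, f in zip(self.projections, features)]
        return self.classifier(torch.cat(embedded, dim=1))

# Example with made-up feature sizes for five modalities
dims = [512, 2048, 34, 300, 1000]
model = LateFusionAffectClassifier(dims)
dummy = [torch.randn(4, d) for d in dims]
logits = model(dummy)  # (4, 3) class scores
```

Late fusion of this kind keeps each modality's encoder independent, so a modality can be added or dropped without retraining the others; attention-based or early-fusion schemes are common alternatives.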

Related research

Analyzing the Affect of a Group of People Using Multi-modal Framework (10/12/2016)
Millions of images on the web enable us to explore images from social ev...

Learning to Take Good Pictures of People with a Robot Photographer (04/11/2019)
We present a robotic system capable of navigating autonomously by follow...

Multi-Modal Continuous Valence And Arousal Prediction in the Wild Using Deep 3D Features and Sequence Modeling (02/26/2020)
Continuous affect prediction in the wild is a very interesting problem a...

Affective computing using speech and eye gaze: a review and bimodal system proposal for continuous affect prediction (05/17/2018)
Speech has been a widely used modality in the field of affective computi...

Attention Driven Fusion for Multi-Modal Emotion Recognition (09/23/2020)
Deep learning has emerged as a powerful alternative to hand-crafted meth...

Dynamic Deep Multi-modal Fusion for Image Privacy Prediction (02/27/2019)
With millions of images that are shared online on social networking site...
