
Discriminatively Trained Latent Ordinal Model for Video Classification

by   Karan Sikka, et al.

We study the problem of video classification for facial analysis and human action recognition. We propose a novel weakly supervised learning method that models a video as a sequence of automatically mined, discriminative sub-events (e.g., the onset and offset phases for "smile", or running and jumping for "highjump"). The proposed model is inspired by recent work on Multiple Instance Learning and latent SVM/HCRF; it extends such frameworks to approximately model the ordinal structure of videos. We obtain consistent improvements over relevant competitive baselines on four challenging, publicly available video-based facial analysis datasets, covering prediction of expression, clinical pain, and intent in dyadic conversations, and on three challenging human action datasets. We also validate the method with qualitative results and show that these largely support the intuitions behind the method.
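To make the idea of ordered sub-events concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sub-event already has a per-frame score (for instance from a linear template applied to frame features) and that sub-events must occur at strictly increasing frame indices. The abstract says the ordinal aspect is modeled only approximately, so this hard ordering constraint is a simplifying assumption, and the function name `ordered_subevent_score` and the score matrix are hypothetical.

```python
import numpy as np

def ordered_subevent_score(frame_scores):
    """Best total score when sub-event k is assigned to one frame and the
    chosen frames are strictly increasing in k (assumed hard ordering).

    frame_scores: (K, T) array-like, frame_scores[k][t] is the score of
    sub-event k at frame t. Returns (total_score, list_of_frames).
    """
    scores = np.asarray(frame_scores, dtype=float)
    K, T = scores.shape
    if K > T:
        raise ValueError("need at least as many frames as sub-events")

    best = np.full((K, T), -np.inf)   # best[k, t]: sub-events 0..k, k at frame t
    back = np.zeros((K, T), dtype=int)  # back-pointer to frame of sub-event k-1
    best[0] = scores[0]

    for k in range(1, K):
        run_val, run_arg = -np.inf, -1  # running max of best[k-1, :t]
        for t in range(1, T):
            if best[k - 1, t - 1] > run_val:
                run_val, run_arg = best[k - 1, t - 1], t - 1
            if run_arg >= 0:
                best[k, t] = run_val + scores[k, t]
                back[k, t] = run_arg

    # Recover the latent frame assignment by following back-pointers.
    t = int(np.argmax(best[K - 1]))
    total = float(best[K - 1, t])
    frames = [t]
    for k in range(K - 1, 0, -1):
        t = int(back[k, t])
        frames.append(t)
    frames.reverse()
    return total, frames

# Two sub-events over four frames (made-up scores for illustration).
total, frames = ordered_subevent_score([[1, 5, 0, 0],
                                        [4, 0, 2, 3]])
print(total, frames)
```

In this toy input, picking each sub-event's best frame independently would place the second sub-event before the first; the ordering constraint rules that assignment out. In a latent-variable model of this kind, the chosen frames play the role of the latent variables, and the maximized score is what a classifier would threshold.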




LOMo: Latent Ordinal Model for Facial Analysis in Videos

We study the problem of facial analysis in videos. We propose a novel we...

Learning from Video and Text via Large-Scale Discriminative Clustering

Discriminative clustering has been successfully applied to a number of w...

Learning Pain from Action Unit Combinations: A Weakly Supervised Approach via Multiple Instance Learning

Facial pain expression is an important modality for assessing pain, espe...

Multi-Instance Dynamic Ordinal Random Fields for Weakly-supervised Facial Behavior Analysis

We propose a Multi-Instance-Learning (MIL) approach for weakly-supervise...

LARNet: Latent Action Representation for Human Action Synthesis

We present LARNet, a novel end-to-end approach for generating human acti...

Action Modifiers: Learning from Adverbs in Instructional Videos

We present a method to learn a representation for adverbs from instructi...