Smoothed Gaussian Mixture Models for Video Classification and Recommendation

12/17/2020
by Sirjan Kafle, et al.

Cluster-and-aggregate techniques such as Vector of Locally Aggregated Descriptors (VLAD), and their end-to-end discriminatively trained equivalents such as NetVLAD, have recently been popular for video classification and action recognition tasks. These techniques operate by assigning video frames to clusters and then representing the video by aggregating the residuals of the frames with respect to the mean of each cluster. Since some clusters may see very little video-specific data, these features can be noisy. In this paper, we propose a new cluster-and-aggregate method, which we call the smoothed Gaussian mixture model (SGMM), and its end-to-end discriminatively trained equivalent, which we call the deep smoothed Gaussian mixture model (DSGMM). SGMM represents each video by the parameters of a Gaussian mixture model (GMM) trained for that video. Low-count clusters are addressed by smoothing the video-specific estimates with a universal background model (UBM) trained on a large number of videos. The primary benefit of SGMM over VLAD is smoothing, which makes it less sensitive to a small number of training samples. We show, through extensive experiments on the YouTube-8M classification task, that SGMM/DSGMM is consistently better than VLAD/NetVLAD by a small but statistically significant margin. We also show results on a dataset created at LinkedIn for predicting whether a member will watch an uploaded video.
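To make the contrast between residual aggregation and smoothing concrete, here is a minimal sketch in plain Python. It is not the authors' implementation, and it rests on several assumptions. Frames are hard-assigned to the nearest cluster mean, whereas a GMM would use soft posteriors. Only the means are smoothed. The smoothing rule is the familiar relevance-factor interpolation used in MAP adaptation of GMM-UBM systems, which may differ from the paper's exact formula. The function names and the `relevance` parameter are hypothetical.

```python
def nearest_cluster(frame, centers):
    """Index of the center closest to `frame` (squared Euclidean distance)."""
    dists = [sum((x - c) ** 2 for x, c in zip(frame, center)) for center in centers]
    return dists.index(min(dists))


def cluster_stats(frames, centers):
    """Per-cluster frame counts and per-cluster sums of frames (hard assignment)."""
    dim = len(centers[0])
    counts = [0] * len(centers)
    sums = [[0.0] * dim for _ in centers]
    for frame in frames:
        k = nearest_cluster(frame, centers)
        counts[k] += 1
        sums[k] = [s + x for s, x in zip(sums[k], frame)]
    return counts, sums


def vlad(frames, centers):
    """VLAD-style feature: for each cluster, the sum of residuals (frame - center)."""
    counts, sums = cluster_stats(frames, centers)
    return [
        [s - n * c for s, c in zip(sums[k], centers[k])]
        for k, n in enumerate(counts)
    ]


def smoothed_means(frames, ubm_means, relevance=2.0):
    """Video-specific means pulled toward the UBM means (assumed MAP-style rule).

    Each mean is (sum_of_frames + relevance * ubm_mean) / (count + relevance),
    so a cluster with no frames falls back to the UBM mean exactly, and a
    cluster with many frames approaches its own sample mean.
    """
    counts, sums = cluster_stats(frames, ubm_means)
    return [
        [(s + relevance * m) / (n + relevance) for s, m in zip(sums[k], ubm_means[k])]
        for k, n in enumerate(counts)
    ]


# Toy video: three frames near the first cluster, a single frame near the second.
centers = [[0.0, 0.0], [10.0, 10.0]]
frames = [[1.0, 1.0], [1.0, -1.0], [-1.0, 0.0], [12.0, 10.0]]

print(vlad(frames, centers))            # residual sums per cluster
print(smoothed_means(frames, centers))  # means shrunk toward the UBM
```

In this toy video the second cluster sees a single frame, so its unsmoothed statistics rest on one sample. The smoothed estimate moves that cluster's mean only part of the way from the background mean toward the frame, which is the kind of low-count robustness the abstract attributes to smoothing.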


