Zero-Shot Activity Recognition with Videos

01/22/2020
by   Evin Pınar Örnek, et al.

In this paper, we examine the zero-shot activity recognition task using videos. We introduce an autoencoder-based model that constructs a multimodal joint embedding space between the visual and textual manifolds. On the visual side, we extract features from activity videos with a state-of-the-art 3D convolutional action recognition network. On the textual side, we use GloVe word embeddings. Zero-shot recognition performance is evaluated by top-n accuracy, and the quality of the learned manifold is measured by mean Nearest Neighbor Overlap. Finally, we provide an extensive discussion of the results and future directions.
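To make the abstract's pipeline concrete, here is a minimal sketch of an autoencoder-based joint embedding and the two evaluation measures it mentions. It assumes 2048-dimensional visual features from a 3D convolutional network and 300-dimensional GloVe vectors; the layer sizes, loss weighting, and the exact definition of mean Nearest Neighbor Overlap used below are illustrative assumptions, not the authors' published configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbeddingAutoencoder(nn.Module):
    """Autoencoder that projects 3D-CNN video features into the GloVe
    word-embedding space and reconstructs them back (hypothetical sizes)."""

    def __init__(self, visual_dim=2048, embed_dim=300, hidden_dim=1024):
        super().__init__()
        # Encoder: visual feature -> shared semantic (GloVe) space.
        self.encoder = nn.Sequential(
            nn.Linear(visual_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim),
        )
        # Decoder: semantic space -> reconstructed visual feature.
        self.decoder = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, visual_dim),
        )

    def forward(self, v):
        z = self.encoder(v)       # projection onto the textual manifold
        v_hat = self.decoder(z)   # reconstruction of the visual feature
        return z, v_hat

# Training objective (sketch): reconstruct the visual feature and pull the
# latent code toward the GloVe vector of the ground-truth activity class.
def loss_fn(v, v_hat, z, glove_target, lam=1.0):
    return F.mse_loss(v_hat, v) + lam * F.mse_loss(z, glove_target)

def zero_shot_topn_accuracy(z, class_embeddings, labels, n=5):
    """Top-n accuracy: a sample counts as correct if the true class's GloVe
    vector is among the n nearest class embeddings (cosine similarity) to z."""
    sims = F.normalize(z, dim=1) @ F.normalize(class_embeddings, dim=1).t()
    topn = sims.topk(n, dim=1).indices               # (batch, n)
    hits = (topn == labels.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()

def mean_nn_overlap(x, y, k=10):
    """Mean Nearest Neighbor Overlap between two manifolds: the average
    fraction of shared k-nearest-neighbor indices for paired points
    (one plausible reading of the metric, not necessarily the paper's)."""
    def knn(a):
        d = torch.cdist(a, a)
        d.fill_diagonal_(float('inf'))               # exclude self-matches
        return d.topk(k, largest=False).indices      # (N, k)
    nx, ny = knn(x), knn(y)
    overlaps = [len(set(nx[i].tolist()) & set(ny[i].tolist())) / k
                for i in range(x.size(0))]
    return sum(overlaps) / len(overlaps)
```

In this sketch, zero-shot inference reduces to a nearest-neighbor search over unseen class embeddings in GloVe space, while mean Nearest Neighbor Overlap compares neighborhood structure between the projected visual features and their textual counterparts.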


Related research

07/29/2017
Zero-Shot Activity Recognition with Verb Attribute Induction
In this paper, we investigate large-scale zero-shot activity recognition...

09/13/2019
Zero-Shot Action Recognition in Videos: A Survey
Zero-Shot Action Recognition has attracted attention in the last years, ...

12/06/2018
Zero-Shot Anticipation for Instructional Activities
How can we teach a robot to predict what will happen next for an activit...

06/21/2018
Learning Shared Multimodal Embeddings with Unpaired Data
In this paper, we propose a method to learn a joint multimodal embedding...

04/23/2020
Signal Level Deep Metric Learning for Multimodal One-Shot Action Recognition
Recognizing an activity with a single reference sample using metric lear...

09/02/2021
Multi-Modal Zero-Shot Sign Language Recognition
Zero-Shot Learning (ZSL) has rapidly advanced in recent years. Towards o...

12/08/2019
Zero-shot Recognition of Complex Action Sequences
Zero-shot video classification for fine-grained activity recognition has...
