Semantics to Space(S2S): Embedding semantics into spatial space for zero-shot verb-object query inferencing

06/13/2019
by   Sungmin Eum, et al.

We present a novel deep zero-shot learning (ZSL) model for inferring human-object interactions from verb-object (VO) queries. Whereas previous ZSL approaches feed semantic/textual information only into the query stream, we also embed the semantics into the visual representation stream. Our approach is powered by the Semantics-to-Space (S2S) architecture, in which semantics derived from the objects present in the image are embedded into a spatial space. This architecture jointly captures the semantic attributes of the human and the objects along with their location, size, and silhouette information. As this is the first attempt to address zero-shot human-object-interaction inference with VO queries, we have constructed a new dataset, Verb-Transferability 60 (VT60). VT60 provides 60 different VO pairs with overlapping verbs, tailored for testing ZSL approaches with VO queries. Experimental evaluations show that our approach not only outperforms the state of the art but also consistently improves performance regardless of which ZSL baseline architecture is used.
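To make the idea of embedding semantics into a spatial space concrete, the following is a minimal sketch and not the authors' implementation; the abstract does not specify the actual S2S design. It assumes each detected entity comes with a class label and a normalized bounding box (a simplification of the silhouette information mentioned above), and it writes that label's word vector into every grid cell the box covers. The function name `semantics_to_space`, the grid resolution, and the two-dimensional toy word vectors are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

# (x_min, y_min, x_max, y_max), each normalized to [0, 1]; an assumed input format.
Box = Tuple[float, float, float, float]


def semantics_to_space(
    detections: List[Tuple[str, Box]],
    word_vectors: Dict[str, List[float]],
    grid_h: int,
    grid_w: int,
) -> List[List[List[float]]]:
    """Return a grid_h x grid_w x dim map holding semantic vectors at object locations.

    Hypothetical helper: cells covered by a detection receive that label's
    word vector; overlapping detections are summed; empty cells stay zero.
    """
    dim = len(next(iter(word_vectors.values())))
    grid = [[[0.0] * dim for _ in range(grid_w)] for _ in range(grid_h)]

    for label, (x0, y0, x1, y1) in detections:
        vec = word_vectors[label]
        # Convert normalized box corners to cell indices, covering at least one cell.
        c0 = min(int(x0 * grid_w), grid_w - 1)
        r0 = min(int(y0 * grid_h), grid_h - 1)
        c1 = min(max(c0 + 1, math.ceil(x1 * grid_w)), grid_w)
        r1 = min(max(r0 + 1, math.ceil(y1 * grid_h)), grid_h)
        for r in range(r0, r1):
            for c in range(c0, c1):
                for k in range(dim):
                    grid[r][c][k] += vec[k]
    return grid


# Toy usage with made-up 2-D "word vectors" (real ones would be much larger).
toy_vectors = {"person": [1.0, 0.0], "horse": [0.0, 1.0]}
toy_detections = [
    ("person", (0.0, 0.0, 0.5, 1.0)),  # left half of the image
    ("horse", (0.5, 0.5, 1.0, 1.0)),   # bottom-right quadrant
]
semantic_map = semantics_to_space(toy_detections, toy_vectors, grid_h=4, grid_w=4)
```

In this sketch the resulting map encodes what each entity is (through the vector) and where and how large it is (through the cells it occupies), which is the kind of joint representation the abstract describes. How the paper actually builds and consumes such a map is left to the full text.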
