A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset

11/19/2022
by Jiaxin Deng, et al.

Video understanding is an important task on short-video platforms, with wide application in video recommendation and classification. Most existing video understanding work focuses only on information that appears within the video itself, including video frames, audio, and text. However, introducing common-sense knowledge from an external knowledge graph (KG) is essential for video understanding when the task refers to content that is less directly related to the video. Because video knowledge graph datasets are lacking, work that integrates video understanding with KGs is rare. In this paper, we propose a heterogeneous dataset that contains multi-modal video entities and rich common-sense relations. The dataset also supports several novel video inference tasks, such as Video-Relation-Tag (VRT) and Video-Relation-Video (VRV). Furthermore, based on this dataset, we propose an end-to-end model that jointly optimizes the video understanding objective and knowledge graph embedding (KGE), which not only injects factual knowledge into video understanding more effectively but also produces effective multi-modal entity embeddings for the KG. Comprehensive experiments indicate that combining video understanding embeddings with factual knowledge improves content-based video retrieval performance. It also helps the model generate better knowledge graph embeddings, which outperform traditional KGE-based methods on the VRT and VRV tasks by at least 42.36 in HITS@10.
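To make the HITS@10 figure concrete, the sketch below shows how a HITS@k score could be computed for a VRT-style query (video, relation, ?tag). It is only an illustration under stated assumptions: the TransE-style distance score, the two-dimensional toy embeddings, the relation name `has_topic`, and all function names are hypothetical and are not taken from the paper, whose actual scoring function is not described in the abstract.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def transe_score(head: Vector, relation: Vector, tail: Vector) -> float:
    """Assumed TransE-style score: negative L1 distance between
    head + relation and tail. Higher means more plausible."""
    return -sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))


def rank_of_true_tail(head: Vector, relation: Vector, true_tail: str,
                      candidates: Dict[str, Vector]) -> int:
    """1-based rank of the true tail among all candidate tails.
    Only strictly better-scoring candidates push the true tail down."""
    true_score = transe_score(head, relation, candidates[true_tail])
    better = sum(
        1 for name, vec in candidates.items()
        if name != true_tail and transe_score(head, relation, vec) > true_score
    )
    return better + 1


def hits_at_k(ranks: List[int], k: int) -> float:
    """Fraction of queries whose true tail is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy embeddings (hypothetical values, for illustration only).
tags: Dict[str, Vector] = {
    "cooking": (1.0, 0.0),
    "travel": (0.0, 1.0),
    "sports": (-1.0, 0.0),
}
video: Vector = (0.5, 0.0)
has_topic: Vector = (0.5, 0.0)

# Two VRT-style queries that share the same video and relation.
ranks = [
    rank_of_true_tail(video, has_topic, "cooking", tags),
    rank_of_true_tail(video, has_topic, "sports", tags),
]

print(ranks)                # [1, 2]
print(hits_at_k(ranks, 1))  # 0.5
print(hits_at_k(ranks, 2))  # 1.0
```

With a real dataset, the candidate set would be every tag (for VRT) or every video (for VRV), and k would be set to 10 to match the metric reported above.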

research
04/18/2022

TranS: Transition-based Knowledge Graph Embedding with Synthetic Relation Representation

Knowledge graph embedding (KGE) aims to learn continuous vectors of rela...
research
05/01/2020

HLVU : A New Challenge to Test Deep Understanding of Movies the Way Humans do

In this paper we propose a new evaluation challenge and direction in the...
research
04/23/2023

Modality-Aware Negative Sampling for Multi-modal Knowledge Graph Embedding

Negative sampling (NS) is widely used in knowledge graph embedding (KGE)...
research
11/28/2019

Product Knowledge Graph Embedding for E-commerce

In this paper, we propose a new product knowledge graph (PKG) embedding ...
research
08/20/2020

VisualSem: a high-quality knowledge graph for vision and language

We argue that the next frontier in natural language understanding (NLU) ...
research
02/27/2022

Concept Graph Neural Networks for Surgical Video Understanding

We constantly integrate our knowledge and understanding of the world to ...
research
10/28/2022

Kuaipedia: a Large-scale Multi-modal Short-video Encyclopedia

Online encyclopedias, such as Wikipedia, have been well-developed and re...
