
Visual Semantic Information Pursuit: A Survey

by Daqi Liu, et al.
University of Surrey

Visual semantic information comprises two important parts: the meaning of each visual semantic unit and the coherent visual semantic relation conveyed by these units. Essentially, the former is a visual perception task, while the latter corresponds to visual context reasoning. Remarkable advances in visual perception have been achieved thanks to the success of deep learning. In contrast, visual semantic information pursuit, a visual scene semantic interpretation task that combines visual perception with visual context reasoning, is still in its early stages. It is the core task of many computer vision applications, such as object detection, visual semantic segmentation, visual relationship detection, and scene graph generation. Because it helps to improve the accuracy and consistency of the resulting interpretation, visual context reasoning is often combined with visual perception in current deep end-to-end visual semantic information pursuit methods. However, a comprehensive review of this exciting area is still lacking. In this survey, we present a unified theoretical paradigm for all these methods, followed by an overview of the major developments and future trends in each potential direction. The common benchmark datasets, the evaluation metrics, and comparisons of the corresponding methods are also introduced.




Semantic Visual Simultaneous Localization and Mapping: A Survey

Visual Simultaneous Localization and Mapping (vSLAM) has achieved great ...

Indoor Scene Understanding in 2.5/3D: A Survey

With the availability of low-cost and compact 2.5/3D visual sensing devi...

Towards Accurate Scene Text Recognition with Semantic Reasoning Networks

Scene text image contains two levels of contents: visual texture and sem...

Learning Deep Representations for Semantic Image Parsing: a Comprehensive Overview

Semantic image parsing, which refers to the process of decomposing image...

Improving Information Extraction from Images with Learned Semantic Models

Many applications require an understanding of an image that goes beyond ...

Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition

Although text recognition has significantly evolved over the years, stat...

Not just a matter of semantics: the relationship between visual similarity and semantic similarity

Knowledge transfer, zero-shot learning and semantic image retrieval are ...