RL-CSDia: Representation Learning of Computer Science Diagrams

03/10/2021
by Shaowei Wang, et al.

Recent studies in computer vision mainly focus on natural images that depict real-world scenes, and they achieve outstanding performance on diverse tasks such as visual question answering. A diagram is a special form of visual expression that appears frequently in education and is of great significance for helping learners understand multimodal knowledge. Current research on diagrams has so far focused on natural-science disciplines such as Biology and Geography, whose diagrams still resemble natural images. Another type of diagram, such as those from Computer Science, is composed of graphics containing complex topologies and relations, and research on this type of diagram remains unexplored. The main challenges in understanding graphic diagrams are the scarcity of data and the ambiguity of semantics, both of which stem mainly from the diversity of expressions. In this paper, we construct a novel dataset of graphic diagrams named Computer Science Diagrams (CSDia). It contains more than 1,200 diagrams with exhaustive annotations of objects and relations. Considering the visual noise caused by the varied expressions in diagrams, we use the topology of diagrams to parse their topological structure. We then propose the Diagram Parsing Net (DPN), which represents a diagram through three branches: topology, visual feature, and text. We apply the model to the diagram classification task to evaluate its ability to understand diagrams. The results show the effectiveness of the proposed DPN for diagram understanding.
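The three-branch idea can be illustrated with a toy late-fusion classifier. This is only a sketch under stated assumptions, not the authors' DPN: the abstract does not say how the branches are computed or combined, so the degree-based topology descriptor, the concatenation step, the linear softmax classifier, and every name below (`topology_features`, `classify`, the placeholder visual and text vectors) are hypothetical choices made for illustration.

```python
import math
import random


def topology_features(num_nodes, edges):
    """Toy topology descriptor (an assumption, not the paper's method):
    node count, edge count, mean degree, and max degree of the diagram graph."""
    degree = [0] * num_nodes
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    mean_deg = sum(degree) / num_nodes if num_nodes else 0.0
    max_deg = max(degree) if degree else 0
    return [float(num_nodes), float(len(edges)), mean_deg, float(max_deg)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(topo, visual, text, weights, bias):
    """Concatenate the three branch vectors and apply one linear layer."""
    fused = topo + visual + text  # list concatenation = late fusion
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# A three-node diagram (e.g. a root with two children), annotated as
# objects (nodes 0..2) and relations (edges).
topo = topology_features(3, [(0, 1), (0, 2)])
visual = [0.2, 0.7]  # placeholder for an image-branch embedding
text = [0.5]         # placeholder for a text-branch embedding

# Untrained random weights for two hypothetical diagram classes.
rng = random.Random(0)
dim = len(topo) + len(visual) + len(text)
weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(2)]
bias = [0.0, 0.0]

probs = classify(topo, visual, text, weights, bias)
print(probs)  # class probabilities; values depend on the random weights
```

In a trained system each placeholder vector would come from a learned encoder, and the weights would be fitted on labeled diagrams; the sketch only shows how annotated objects and relations can feed a topology branch alongside visual and text features.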
