Understanding and representing the semantics of large structured documents

07/24/2018
by Muhammad Mahbubur Rahman, et al.

Understanding large, structured documents such as scholarly articles, requests for proposals, or business reports is a complex and difficult task. It involves discovering a document's overall purpose and subject(s), understanding the function and meaning of its sections and subsections, and extracting low-level entities and facts about them. In this research, we present a deep learning-based document ontology that captures the general-purpose semantic structure and domain-specific semantic concepts of a large number of academic articles and business documents. The ontology describes the different functional parts of a document, which can be used to enhance semantic indexing so that both humans and machines can understand documents better. We evaluate our models through extensive experiments on datasets of scholarly articles from arXiv and Request for Proposal (RFP) documents.
