On Bi-gram Graph Attributes

07/05/2021
by Thomas Konstantinovsky, et al.

We propose a new approach to semantic text analysis and general corpus analysis based on what this article terms a "bi-gram graph" representation of a corpus. Attributes derived from graph theory are measured and analyzed either as insights in their own right or in comparison with other corpus graphs. We observe that a broad range of tools and algorithms can be developed on top of the graph representation; creating such a graph is computationally cheap, and much of the heavy lifting is done by basic graph calculations. Furthermore, we showcase use cases for bi-gram graphs and how well they scale to large datasets.
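To make the idea concrete, here is a minimal sketch of one plausible reading of a bi-gram graph. The abstract does not define the construction, so the following are assumptions: nodes are lowercase whitespace-separated tokens, a directed edge links each token to the token that follows it in the same sentence, and edge weights count how often that pair occurs. The function names `build_bigram_graph` and `graph_attributes` are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def build_bigram_graph(sentences):
    """Return an adjacency map {source_token: Counter({target_token: count})}.

    Assumption: tokens come from a naive lowercase whitespace split;
    the paper may use a different tokenizer.
    """
    graph = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Each consecutive token pair (a bi-gram) becomes a directed edge.
        for left, right in zip(tokens, tokens[1:]):
            graph[left][right] += 1
    return graph


def graph_attributes(graph):
    """Compute a few basic attributes of the directed bi-gram graph."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n_nodes = len(nodes)
    # Count distinct directed edges, ignoring their weights.
    n_edges = sum(len(targets) for targets in graph.values())
    # Directed density without self-loops; a simplification, since a
    # repeated word ("the the") would create a self-loop.
    density = n_edges / (n_nodes * (n_nodes - 1)) if n_nodes > 1 else 0.0
    return {"nodes": n_nodes, "edges": n_edges, "density": density}


corpus = ["the cat sat on the mat", "the dog sat on the rug"]
graph = build_bigram_graph(corpus)
attrs = graph_attributes(graph)
print(attrs)
print(graph["sat"].most_common(1))
```

Building the graph takes a single pass over the corpus, which is in line with the claim that construction is computationally cheap. Attributes such as density or degree counts can then be compared across graphs built from different corpora.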

