Efficient Graph Compression Using Huffman Coding Based Techniques

06/15/2018 ∙ by Rushabh Jitendrakumar Shah, et al.

Graphs have been used extensively to represent data from various domains. In the era of Big Data, information is generated at a fast pace, and analyzing it is a challenge. Various methods have been proposed to speed up the analysis of the data and to mine it for information. These methods often involve a massive array of compute nodes and the transmission of data over the network. With such large quantities of data, this poses a major obstacle to gathering intelligence from it. Data compression techniques are therefore a viable option for addressing these issues. Since graphs represent most real-world data, methods to compress graphs have been at the forefront of such endeavors. In this paper we propose techniques to compress graphs by finding specific patterns and replacing them with variable-length identifiers, an idea inspired by Huffman coding. Specifically, given a graph G = (V, E), where V is the set of vertices, E is the set of edges, and |V| = n, we propose methods to reduce the space requirements of the graph by compressing its adjacency representation. The proposed methods show up to an 80% reduction in the space required to store the graphs compared with the adjacency matrix. The methods can also be applied to other representations. The proposed techniques help address the problem of computing on graphs with resource-limited compute nodes, and they reduce the latency of data transfer over the network in distributed computing.
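To make the general idea concrete, the following is a minimal Python sketch of replacing recurring patterns in an adjacency matrix with variable-length, Huffman-style identifiers. It is an illustration under stated assumptions, not the method proposed in the paper: the choice of fixed-width bit blocks as the "patterns", the block width of 4, and the function names `huffman_codes`, `compress_adjacency`, and `decompress_adjacency` are all hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(freqs):
    """Build a prefix-free code {symbol: bitstring} from a frequency table."""
    if len(freqs) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freqs)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prefix the two least frequent subtrees with 0 and 1 and merge them.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def compress_adjacency(matrix, block=4):
    """Encode a 0/1 adjacency matrix; `block` (assumed) is the pattern width."""
    bits = "".join(str(b) for row in matrix for b in row)  # row-major flatten
    chunks = [bits[i:i + block] for i in range(0, len(bits), block)]
    codes = huffman_codes(Counter(chunks))
    encoded = "".join(codes[c] for c in chunks)
    return encoded, codes


def decompress_adjacency(encoded, codes, n):
    """Invert compress_adjacency for an n x n matrix."""
    inverse = {code: sym for sym, code in codes.items()}
    pieces, current = [], ""
    for bit in encoded:
        current += bit
        if current in inverse:  # prefix-free, so the first match is correct
            pieces.append(inverse[current])
            current = ""
    flat = "".join(pieces)
    return [[int(flat[r * n + c]) for c in range(n)] for r in range(n)]


# Small sparse example: 8 vertices, undirected edges {0,1} and {2,3}.
n = 8
matrix = [[0] * n for _ in range(n)]
for u, v in [(0, 1), (2, 3)]:
    matrix[u][v] = matrix[v][u] = 1

encoded, codes = compress_adjacency(matrix)
restored = decompress_adjacency(encoded, codes, n)
```

In a sparse matrix the all-zero block dominates and receives the shortest identifier, which is where the savings come from. Note that this sketch counts only the encoded bitstring; a practical scheme would also have to store or transmit the codebook, and the achievable ratio depends on the graph.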


