In-database connected component analysis

02/26/2018 ∙ by Harald Bögeholz, et al.

We describe a Big Data-practical, SQL-implementable algorithm for efficiently determining connected components for graph data stored in a Massively Parallel Processing (MPP) relational database. The algorithm is a linear-space, randomised algorithm that always terminates with the correct answer but has a stochastic running time: for any ε > 0 and any input graph G = ⟨V, E⟩, it terminates after O(log |V|) SQL queries with probability at least 1 − ε. We show empirically that this translates to a quasi-linear runtime in practice.
