On the External Validity of Average-Case Analyses of Graph Algorithms

05/30/2022
by Thomas Bläsius, et al.

The number one criticism of average-case analysis is that we do not actually know the probability distribution of real-world inputs. Thus, the argument goes, analyzing an algorithm on some random model has no implications for practical performance. At its core, this criticism doubts the existence of external validity, i.e., it assumes that algorithmic behavior on the somewhat simple and clean models does not translate beyond the models to practical performance on real-world inputs. With this paper, we provide a first step towards studying the question of external validity systematically. To this end, we evaluate the performance of six graph algorithms on a collection of 2751 sparse real-world networks depending on two properties: heterogeneity (variance in the degree distribution) and locality (tendency of edges to connect vertices that are already close). We compare this with the performance on generated networks with varying locality and heterogeneity. We find that the performance in the idealized setting of network models translates surprisingly well to real-world networks. Moreover, heterogeneity and locality appear to be the core properties impacting the performance of many graph algorithms.
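To make the two properties concrete, the following is a minimal sketch of how they could be estimated on an undirected graph given as an edge list. It is an illustration under stated assumptions, not the measures used in the paper: heterogeneity is approximated by the plain variance of the degree sequence, and locality by the average local clustering coefficient, which is one common proxy for edges connecting vertices that are already close. The function names `build_adjacency`, `degree_variance`, and `average_clustering` are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_variance(adj):
    """Assumed heterogeneity proxy: population variance of the degree sequence."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    mean = sum(degrees) / len(degrees)
    return sum((d - mean) ** 2 for d in degrees) / len(degrees)


def average_clustering(adj):
    """Assumed locality proxy: mean local clustering coefficient.

    For each vertex, this is the fraction of neighbor pairs that are
    themselves adjacent. Vertices of degree below 2 contribute 0.
    """
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


# A star has very uneven degrees and no triangles; a triangle is the opposite.
star = build_adjacency([(0, 1), (0, 2), (0, 3), (0, 4)])
triangle = build_adjacency([(0, 1), (1, 2), (2, 0)])
```

In this toy comparison the star scores high on the heterogeneity proxy and zero on the locality proxy, while the triangle scores zero on heterogeneity and the maximum on locality, which matches the intuition behind the two axes.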


research
05/29/2019

Understanding the Effectiveness of Data Reduction in Public Transportation Networks

Given a public transportation network of stations and connections, we wa...
research
04/12/2023

Locality via Global Ties: Stability of the 2-Core Against Misspecification

For many random graph models, the analysis of a related birth process su...
research
11/10/2021

LSP : Acceleration and Regularization of Graph Neural Networks via Locality Sensitive Pruning of Graphs

Graph Neural Networks (GNNs) have emerged as highly successful tools for...
research
05/21/2023

Exploring and Exploiting Data Heterogeneity in Recommendation

Massive amounts of data are the foundation of data-driven recommendation...
research
12/18/2020

Fast and Efficient Parallel Breadth-First Search with Power-law Graph Transformation

In the big data era, graph computing is widely used to exploit the hidde...
research
08/10/2019

Classical Information Theory of Networks

Heterogeneity is among the most important features characterizing real-w...
