IPC: A Benchmark Data Set for Learning with Graph-Structured Data

05/15/2019
by Patrick Ferber, et al.

Benchmark data sets are indispensable for evaluating graph-based machine learning methods. We release a new data set, compiled from International Planning Competitions (IPC), for benchmarking graph classification, regression, and related tasks. Apart from the graph construction (based on AI planning problems), which is interesting in its own right, the data set has characteristics distinctly different from those of popular benchmarks. The data set, named IPC, consists of two self-contained versions, grounded and lifted. Both include large graphs whose sizes follow a skewed distribution, posing substantial challenges for computing graph models such as graph kernels and graph neural networks. The graphs in this data set are directed, and those in the lifted version are acyclic, which offers an opportunity to benchmark specialized models for directed (acyclic) structures. Moreover, the graph generator and the labeling are implemented as computer programs; thus, the data set can be extended easily if a larger scale is desired. The data set is accessible from <https://github.com/IBM/IPC-graph-data>.
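
As a rough illustration of the directed (acyclic) property mentioned above, the following minimal sketch checks whether a directed graph is acyclic using Kahn's algorithm. The edge-list representation, the function name `is_acyclic`, and the toy graphs are assumptions made for this example only; they do not reflect the file format or tooling of the released data set.

```python
from collections import deque


def is_acyclic(num_nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm).

    Assumed representation (hypothetical, not the data set's format):
    nodes are integers 0..num_nodes-1 and edges are (source, target) pairs.
    """
    indegree = [0] * num_nodes
    successors = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Start from all nodes that have no incoming edges.
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # Every node is removed exactly when the graph contains no cycle.
    return visited == num_nodes


# Toy graphs for illustration only (not taken from the data set).
toy_dag = (4, [(0, 1), (0, 2), (1, 3), (2, 3)])
toy_cyclic = (3, [(0, 1), (1, 2), (2, 0)])

print(is_acyclic(*toy_dag))
print(is_acyclic(*toy_cyclic))
```

A check of this kind runs in time linear in the number of nodes and edges, which matters when graph sizes are large and unevenly distributed.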
