A Query Language for Summarizing and Analyzing Business Process Data
In modern enterprises, Business Processes (BPs) are realized over a mix of workflows, IT systems, Web services and direct collaborations of people. Accordingly, process data (i.e., BP execution data such as logs containing events, interaction messages and other process artifacts) is scattered across several systems and data sources, and increasingly shows the typical properties of Big Data. Understanding process execution data is challenging because key business insights remain hidden in the interactions among process entities: most objects are interconnected, forming complex, heterogeneous but often semi-structured networks. In the context of business processes, we view the Big Data problem as that of a massive number of interconnected data islands of personal, shared and business data. We present a framework that models process data as a graph, i.e., a Process Graph, together with abstractions to summarize the process graph and to discover concept hierarchies for entities based on both data objects and their interactions in process graphs. We present a language, namely BP-SPARQL, for the explorative querying and understanding of process graphs from various user perspectives. We have implemented a scalable architecture for querying, exploration and analysis of process graphs. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
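To make the idea of a process graph and its summarization concrete, the following is a minimal Python sketch under stated assumptions. It is not BP-SPARQL and not the architecture described in the paper: the event triples, the entity types, and the helper names build_process_graph and summarize are all hypothetical and chosen only for illustration. The sketch stores process events as labeled edges between entities, then collapses entities of the same type into one summary node and counts the interactions between types.

```python
from collections import Counter, defaultdict

# Hypothetical process events as (source entity, relation, target entity).
EVENTS = [
    ("alice", "submitted", "order-1"),
    ("bob", "approved", "order-1"),
    ("alice", "submitted", "order-2"),
    ("order-1", "triggered", "invoice-1"),
]

# Hypothetical entity types, used as the grouping key for summarization.
ENTITY_TYPE = {
    "alice": "Person",
    "bob": "Person",
    "order-1": "Order",
    "order-2": "Order",
    "invoice-1": "Invoice",
}


def build_process_graph(events):
    """Build an adjacency list: entity -> list of (relation, target)."""
    graph = defaultdict(list)
    for source, relation, target in events:
        graph[source].append((relation, target))
    return dict(graph)


def summarize(events, entity_type):
    """Collapse entities by type and count edges between type pairs."""
    summary = Counter()
    for source, relation, target in events:
        key = (entity_type[source], relation, entity_type[target])
        summary[key] += 1
    return summary


graph = build_process_graph(EVENTS)
summary = summarize(EVENTS, ENTITY_TYPE)

for (src_type, relation, dst_type), count in sorted(summary.items()):
    print(f"{src_type} -[{relation}]-> {dst_type}: {count}")
```

Grouping by a single type attribute is the simplest possible abstraction; grouping by interaction patterns or by a hierarchy of concepts would follow the same shape with a different grouping key.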