Log-based Evaluation of Label Splits for Process Models

06/23/2016 ∙ by Niek Tax, et al. ∙ TU Eindhoven

Process mining techniques aim to extract insights into processes from event logs. One of the challenges in process mining is identifying interesting and meaningful event labels that contribute to a better understanding of the process. Our application area is mining data from smart homes for the elderly, where the ultimate goal is to signal deviations from usual behavior and provide timely recommendations in order to extend the period of independent living. Extracting individual process models showing user behavior is an important instrument in achieving this goal. However, the interpretation of sensor data at an appropriate abstraction level is not straightforward. For example, a motion sensor in a bedroom can be triggered by tossing and turning in bed or by getting up. We try to derive the actual activity depending on the context (time, previous events, etc.). In this paper we introduce the notion of label refinements, which links more abstract event descriptions with their more refined counterparts. We present a statistical evaluation method to determine the usefulness of a label refinement for a given event log from a process perspective. Based on data from smart homes, we show how our statistical evaluation method for label refinements can be used in practice. Our method was able to select two label refinements out of a set of candidate label refinements that both had a positive effect on model precision.




1 Introduction

Process mining is a fast-growing discipline that brings together knowledge and techniques from computational intelligence, data mining, process modeling and process analysis Aalst2011. The process mining task is the automatic or semi-automatic analysis of events that are logged during process execution, where event records contain information on what was done, by whom, for whom, where, when, etc. Events are grouped into cases (process instances), e.g. per patient for a hospital log, or per insurance claim for an insurance company. An important task within process mining is process discovery, which focuses on extracting interpretable models of processes from event logs. One of the attributes of the events is usually used as its label. These event labels are then used as transition/activity labels in the process models created by process discovery algorithms.

Id  | Timestamp        | Address        | Sensor             | Heart rate | Activity
1   | 03/11/2015 02:45 | Mountain Rd. 7 | Bedroom motion     | 74         | Tossing & turning
2   | 03/11/2015 03:23 | Mountain Rd. 7 | Bedroom motion     | 72         | Tossing & turning
3   | 03/11/2015 04:59 | Mountain Rd. 7 | Bedroom motion     | 71         | Tossing & turning
4   | 03/11/2015 06:04 | Mountain Rd. 7 | Bedroom motion     | 73         | Tossing & turning
5   | 03/11/2015 08:45 | Mountain Rd. 7 | Bedroom motion     | 85         | Getting up
6   | 03/11/2015 09:10 | Mountain Rd. 7 | Living room motion | 79         | Living room motion
... | 03/11/2015 ...   | Mountain Rd. 7 | ...                | ...        | ...
7   | 03/12/2015 01:01 | Mountain Rd. 7 | Bedroom motion     | 73         | Tossing & turning
8   | 03/12/2015 03:13 | Mountain Rd. 7 | Bedroom motion     | 75         | Tossing & turning
9   | 03/12/2015 07:24 | Mountain Rd. 7 | Bedroom motion     | 74         | Tossing & turning
10  | 03/12/2015 08:34 | Mountain Rd. 7 | Bedroom motion     | 79         | Getting up
11  | 03/12/2015 09:12 | Mountain Rd. 7 | Living room motion | 76         | Living room motion
... | 03/12/2015 ...   | Mountain Rd. 7 | ...                | ...        | ...
12  | 03/13/2015 00:45 | Mountain Rd. 7 | Bedroom motion     | 75         | Tossing & turning
13  | 03/13/2015 02:29 | Mountain Rd. 7 | Bedroom motion     | 75         | Tossing & turning
14  | 03/13/2015 05:19 | Mountain Rd. 7 | Bedroom motion     | 74         | Tossing & turning
15  | 03/13/2015 05:34 | Mountain Rd. 7 | Bedroom motion     | 79         | Tossing & turning
16  | 03/13/2015 05:39 | Mountain Rd. 7 | Bedroom motion     | 77         | Tossing & turning
17  | 03/13/2015 08:37 | Mountain Rd. 7 | Bedroom motion     | 79         | Getting up
18  | 03/13/2015 08:52 | Mountain Rd. 7 | Living room motion | 78         | Living room motion
... | 03/13/2015 ...   | Mountain Rd. 7 | ...                | ...        | ...
19  | 03/14/2015 03:41 | Mountain Rd. 7 | Bedroom motion     | 75         | Tossing & turning
20  | 03/14/2015 05:00 | Mountain Rd. 7 | Bedroom motion     | 74         | Tossing & turning
21  | 03/14/2015 08:52 | Mountain Rd. 7 | Bedroom motion     | 75         | Getting up
22  | 03/14/2015 09:30 | Mountain Rd. 7 | Living room motion | 74         | Living room motion
... | 03/14/2015 ...   | Mountain Rd. 7 | ...                | ...        | ...
23  | 03/15/2015 02:11 | Mountain Rd. 7 | Bedroom motion     | 77         | Tossing & turning
24  | 03/15/2015 02:34 | Mountain Rd. 7 | Bedroom motion     | 76         | Tossing & turning
25  | 03/15/2015 08:35 | Mountain Rd. 7 | Bedroom motion     | 79         | Getting up
26  | 03/15/2015 08:57 | Mountain Rd. 7 | Living room motion | 77         | Living room motion
... | 03/15/2015 ...   | Mountain Rd. 7 | ...                | ...        | ...
Table 1: The corresponding smart home sensor event log with refined labels

Process mining has its roots in the field of business process management, where the definition of labels for events is considered to be rather straightforward. In recent years, the application domain of process mining has broadened. A wide variety of event types can be used as input, and analysis may be challenging. One of the most challenging application areas is LifeLogging, which focuses on acquisition and analysis of personal daily life data. LifeLogs combine, among others, data collected through mobile phones, wearable devices, and/or smart home sensors. The emergence of LifeLogging tools and the resulting increase in availability of activity data enable a process-centric analysis of human behavior Sztyler2015. The aim of process mining analysis on LifeLogging data is to find frequent activity patterns and represent them in a human-interpretable process model. Such a process model could then also be used to detect deviations from one’s regular behavior. Process mining in the human behavior application domain closely relates to the field of activity recognition, which aims to detect human activities from sensors and to find patterns between human activities Chen2012. Process mining, however, aims to produce interpretable models that can provide insights by visually inspecting them. In contrast, most activity recognition techniques produce non-interpretable models.

Imagine an elderly person of whom we want to discover a process model describing his/her daily behavior. Events are generated by sensors, either periodically (e.g. by a temperature sensor or heart rate monitor), or triggered by some activity (e.g. motion). Table 1 shows an example log obtained by fusing data from such sensors. The dots indicate that only a fraction of the logged events are shown. Assigning meaningful labels to these events is not straightforward. A Bedroom motion event can be caused by different human activities, e.g. by Tossing & turning or by Getting up. In some cases it is necessary to distinguish between Tossing & turning and Getting up, for example when we aim to generate a timely reminder to take medication that needs to be taken before breakfast. Based on contextual information (e.g. a specific increase in heart rate, a time stamp, etc.), the distinction between the two types of activities might be identified, and each event with label Bedroom motion can be refined into either Tossing & turning or Getting up. The last column in Table 1 shows the desired event labels. Figure 1 shows a process model that can be deduced from such a log using existing process discovery techniques, like the ones from Aalst2004 ; Weijters2011 .

Figure 1: A Petri net derived from the event log in Table 1

Many relabelings of Bedroom motion events are possible. Expert knowledge, data mining or machine learning techniques can be used to generate ideas for potential labeling functions. The goal of this labeling function is to give “similar” events the same label. However, similarity is a relative notion, so the initially chosen labeling function can be too abstract or too fine-grained to generate an informative process model. Once a process discovery algorithm has been applied and a process model is obtained, one can assess whether the labeling function used on the original event log allowed the process discovery algorithm to discover an informative process model. However, it is computationally costly to apply process mining algorithms to multiple event logs generated from a single original event log using different event labeling functions with varying levels of abstraction. Therefore, we provide a statistical approach to evaluate label refinement usefulness in the context of process discovery that is based on significance testing of differences in event ordering relations.

The Fodina Broucke2014 and the Li2007 process discovery algorithms assume that there is one column in the event log that indicates the activity, and refine this label based on a dissimilarity threshold on the event labels occurring directly before and after. In this paper we assume that the information about which activity is performed is spread over multiple columns. We choose one column as the primary activity column and refine the activity labels based on the other columns and temporal information. We validate whether a refinement makes sense from a process perspective by taking into account all temporal event information in the event log, using statistical testing and information gain. Evaluating splits based on information gain is a well-known approach in the area of decision tree learning Quinlan2014, where ground truth labels are available, in contrast to the label refinement setting. Label refinements share similarities with automatic learning of ontologies Maedche2012 in the sense that both are concerned with inferring multiple levels of semantic interpretations from data. Ontology quality evaluation techniques Brank2005 can be used to evaluate (automatically inferred) ontologies; however, these techniques are not process-centric, i.e., they do not take into account ordering relations between elements of the ontology in execution sequences.

Section 2 gives formal definitions of label refinements, process models, and related concepts. In Section 3, we discuss when a label refinement is useful from a process mining perspective. A statistical method to evaluate the usefulness of a label refinement is described in Section 4. In Section 5 we discuss the results of the proposed method on a real life smart home data set. We draw conclusions in Section 6.

2 Label Refinements & Process Models

In this section we introduce the notions related to event logs and relabeling functions for traces and then define the notions of refinements and abstractions. We also introduce the Petri net process model notation.

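We use the usual sequence definition, and denote a sequence by listing its elements, e.g. we write $\langle a_1, a_2, \dots, a_n \rangle$ for a (finite) sequence $s : \{1, \dots, n\} \to A$ of elements from some alphabet $A$, where $s(i) = a_i$ for any $i \in \{1, \dots, n\}$. The length of a sequence $s$ is $|s|$; $s_1 \cdot s_2$ denotes the concatenation of sequences $s_1$ and $s_2$. A language $\mathcal{L}$ over an alphabet $A$ is a set of sequences over $A$. $\overline{\mathcal{L}}$ is the prefix closure of a language $\mathcal{L}$ (with $\mathcal{L} \subseteq \overline{\mathcal{L}}$).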

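An event is the most elementary element of an event log. Let $\mathcal{I}$ be a set of event identifiers, $\mathcal{T}$ be a set of timestamps, and $\mathcal{A}_1 \times \dots \times \mathcal{A}_n$ be an attribute domain consisting of $n$ attributes (e.g. resource, activity name, cost, etc.), each of a certain type. An event is a tuple $e = (i, t, a_1, \dots, a_n)$, with $i \in \mathcal{I}$, $t \in \mathcal{T}$, and $(a_1, \dots, a_n) \in \mathcal{A}_1 \times \dots \times \mathcal{A}_n$. The event label of an event is the attribute set $(a_1, \dots, a_n)$; $e_i$, $e_t$ and $e_l$ respectively denote the identifier, the timestamp and label of event $e$. $\mathcal{E} = \mathcal{I} \times \mathcal{T} \times \mathcal{A}_1 \times \dots \times \mathcal{A}_n$ is a universe of events over $\mathcal{A}_1, \dots, \mathcal{A}_n$. The lines of Table 1, where we do not consider the activity column for now, are events from an event universe over the event attributes sensor, address, and heart rate.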

Events are often considered in the context of other events. We call $E \subseteq \mathcal{E}$ an event set if $E$ does not contain any events with the same event identifier. The events in Table 1 together form an event set. A trace $\sigma$ is a finite sequence formed by the events from an event set $E \subseteq \mathcal{E}$ that respects the time ordering of events, i.e. for all $k, m \in \{1, \dots, |\sigma|\}$, $k < m$, we have: $\sigma(k)_t \le \sigma(m)_t$. We define the universe of traces over event universe $\mathcal{E}$, denoted $\Sigma(\mathcal{E})$, as the set of all possible traces over $\mathcal{E}$. We omit $\mathcal{E}$ in $\Sigma(\mathcal{E})$ and use the shorter notation $\Sigma$ when the event universe is clear from the context.

Often it is useful to partition an event set into smaller sets in which events belong together according to some criterion. We might for example be interested in discovering the typical behavior of households over the course of a day. In order to do so, we can e.g. group together events with the same address and the same day-part of the timestamp, as indicated by the horizontal lines in Table 1. For each of these event sets, we can construct a trace; time stamps define the ordering of events within the trace. For events of a trace having the same time stamps, an arbitrary ordering can be chosen within a trace.

An event partitioning function is a function $ep : \mathcal{E} \to T_{id}$ that defines the partitioning of an arbitrary set of events $E \subseteq \mathcal{E}$ from a given event universe $\mathcal{E}$ into event sets $E_1, \dots, E_j, \dots$ where each $E_j$ is the maximal subset of $E$ such that for any $e_1, e_2 \in E_j$, $ep(e_1) = ep(e_2)$; the value of $ep$ shared by all the elements of $E_j$ defines the value of the trace attribute $T_{id}$. Note that complex, multidimensional trace attributes are also possible, e.g. a combination of the name of the person performing the event activity and the date of the event, so that every trace contains activities of one person during one day. The event sets obtained by applying an event partitioning can be transformed into traces (respecting the time ordering of events).
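The partitioning described above can be illustrated with a minimal Python sketch. It groups a few rows of Table 1 by address and by the day part of the timestamp, and then orders each group by time. The tuple layout and the helper names `parse`, `trace_attribute` and `build_traces` are assumptions made for this illustration; they are not part of the method as published.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event tuples (id, timestamp, address, sensor, heart rate),
# copied from a few rows of Table 1 and deliberately given out of order.
events = [
    (5, "03/11/2015 08:45", "Mountain Rd. 7", "Bedroom motion", 85),
    (1, "03/11/2015 02:45", "Mountain Rd. 7", "Bedroom motion", 74),
    (6, "03/11/2015 09:10", "Mountain Rd. 7", "Living room motion", 79),
    (11, "03/12/2015 09:12", "Mountain Rd. 7", "Living room motion", 76),
    (7, "03/12/2015 01:01", "Mountain Rd. 7", "Bedroom motion", 73),
]


def parse(timestamp):
    """Parse the month/day/year timestamps used in Table 1."""
    return datetime.strptime(timestamp, "%m/%d/%Y %H:%M")


def trace_attribute(event):
    """Event partitioning function: address plus the day part of the timestamp."""
    _, timestamp, address, _, _ = event
    return (address, parse(timestamp).date())


def build_traces(event_set):
    """Group events by trace attribute, then order each group by timestamp."""
    groups = defaultdict(list)
    for event in event_set:
        groups[trace_attribute(event)].append(event)
    return {key: sorted(group, key=lambda e: parse(e[1]))
            for key, group in groups.items()}


traces = build_traces(events)
for key, trace in traces.items():
    print(key, [event[0] for event in trace])
```

Sorting with a stable sort keeps events with identical timestamps in their input order, which is one admissible choice for the arbitrary ordering mentioned above.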

An event log $L$ is a finite set of traces $L \subseteq \Sigma(\mathcal{E})$. $A_L$ denotes the alphabet of event labels that occur in log $L$. The traces of a log are often transformed before doing further analysis: very detailed but not necessarily informative event descriptions are transformed into some informative and repeatable labels. For the labels of the log in Table 1, the heart rate values can be abstracted to low, normal, and high or the label can be redefined to a subset of the event attributes. Next to that, if the event partitioning function maps each event from Table 1 to its address and the day-part of the timestamp, these attributes (indicated in gray) become the trace attribute and can safely be removed from individual events. The new label is then defined as a combination of the sensor and abstracted heart rate values.

After this relabeling step, some traces of the log can become identically labeled (the event ids would still be different). The information about the number of occurrences of a sequence of labels in an event log is highly relevant for process mining, since it allows differentiating between the mainstream behavior of a process (frequently occurring behavioral patterns) and exceptional behavior.

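Let $\Sigma(\mathcal{E}_1)$ and $\Sigma(\mathcal{E}_2)$ be two universes of traces defined over event universes $\mathcal{E}_1, \mathcal{E}_2$. A function $l : \Sigma(\mathcal{E}_1) \to \Sigma(\mathcal{E}_2)$ is a trace relabeling function if for all traces $\sigma_1, \sigma_2 \in \Sigma(\mathcal{E}_1)$: whenever $\sigma_1$ is a prefix of $\sigma_2$, $l(\sigma_1)$ is a prefix of or equal to $l(\sigma_2)$. We lift $l$ to event logs: for $L \subseteq \Sigma(\mathcal{E}_1)$, the relabeling is defined as $l(L) = \{ l(\sigma) \mid \sigma \in L \}$.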

Often, relabeling functions are defined using a narrower approach: first defining an event relabeling function and then lifting that function to traces. In the context of business processes, event relabeling functions are mostly mere projections of events on the values of a single attribute, such as activity name. We consider a more general definition to allow for history-dependent interpretation of events, which is necessary in the context of LifeLogging. The prefix preservation requirement is necessary to allow for logging, compliance checking and other forms of analysis performed at run time.
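As an illustration of the narrower approach, the following Python sketch lifts an event relabeling function to traces and checks prefix preservation on one example. The rule itself (a bedroom movement at or after a fixed wake-up time counts as Getting up) and the threshold `"08:00"` are hypothetical choices made only for this example. A history-dependent rule could additionally inspect the labels emitted so far; as long as it never looks at later events, prefix preservation still holds.

```python
def relabel_event(time_of_day, sensor, wake_time="08:00"):
    """Hypothetical event relabeling: refine Bedroom motion by time of day.

    Zero-padded "HH:MM" strings compare correctly as plain strings.
    """
    if sensor != "Bedroom motion":
        return sensor
    return "Getting up" if time_of_day >= wake_time else "Tossing & turning"


def relabel_trace(trace, wake_time="08:00"):
    """Lift the event relabeling to a trace of (time_of_day, sensor) pairs."""
    return [relabel_event(time_of_day, sensor, wake_time)
            for time_of_day, sensor in trace]


def is_prefix(short, long):
    """True if the list `short` is a prefix of (or equal to) the list `long`."""
    return long[:len(short)] == short


# First day of Table 1, reduced to time of day and sensor.
day_one = [
    ("02:45", "Bedroom motion"),
    ("03:23", "Bedroom motion"),
    ("04:59", "Bedroom motion"),
    ("06:04", "Bedroom motion"),
    ("08:45", "Bedroom motion"),
    ("09:10", "Living room motion"),
]

relabeled = relabel_trace(day_one)
# Relabeling any prefix of the trace yields a prefix of the relabeled trace.
prefix_preserving = all(
    is_prefix(relabel_trace(day_one[:k]), relabeled)
    for k in range(len(day_one) + 1)
)
print(relabeled, prefix_preserving)
```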

Let , , and be trace universes over respectively with being pairwise different. Let and be trace relabeling functions. Relabeling function is a refinement of relabeling function , denoted by , iff ; is then called an abstraction of . We call a refinement of a strict refinement, denoted by , when . We call refinement of an equal length refinement, denoted by ,when .

Let be trace universes over respectively, a trace relabeling function, and be a language over . Trace concretization is a function defined as , for each . Language concretization of is language .

The goal of process discovery is to discover a process model that represents the behavior seen in an event log. A frequently used process modeling notation in the process mining field is the Petri net Reisig1998. Petri nets are directed bipartite graphs consisting of transitions and places, connected by arcs. Transitions represent activities, while places represent the enabling conditions of transitions. Labels are assigned to transitions to indicate the type of activity that they model. A special label $\tau$ is used to represent invisible transitions, which are only used for routing purposes and not recorded in the execution log.

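A labeled Petri net is a tuple $N = (P, T, F, A_M, \ell)$ where $P$ is a finite set of places, $T$ is a finite set of transitions such that $P \cap T = \emptyset$, $F \subseteq (P \times T) \cup (T \times P)$ is a set of directed arcs, called the flow relation, $A_M$ is an alphabet of labels representing activities, with $\tau \notin A_M$ being a label representing invisible events, and $\ell : T \to A_M \cup \{\tau\}$ is a labeling function that assigns a label to each transition. For a node $n \in P \cup T$ we use $\bullet n$ and $n \bullet$ to denote the set of input and output nodes of $n$, defined as $\bullet n = \{ n' \mid (n', n) \in F \}$ and $n \bullet = \{ n' \mid (n, n') \in F \}$. An example of a Petri net can be seen in Figure 1, where circles represent places and squares represent transitions.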

A state of a Petri net is defined by its marking $M \in \mathbb{N}^P$ being a multiset of places. A marking is graphically denoted by putting $M(p)$ tokens on each place $p \in P$. A pair $(N, M)$ is called a marked Petri net. State changes occur through transition firings. A transition $t$ is enabled (can fire) in a given marking $M$ if each input place $p \in \bullet t$ contains at least one token. Once a transition fires, one token is removed from each input place of $t$ and one token is added to each output place of $t$, leading to a new marking $M'$ defined as $M' = M - \bullet t + t \bullet$. A firing of a transition $t$ leading from marking $M$ to marking $M'$ is denoted as $M \xrightarrow{t} M'$. $M_1 \xrightarrow{\sigma} M_2$ indicates that $M_2$ can be reached from $M_1$ through a firing sequence $\sigma \in T^*$. Many process modeling notations have formal executional semantics and define a language of accepting traces $\mathcal{L}$. For Petri net in Figure 2, .
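The firing rule can be made concrete with a small Python sketch. The class `PetriNet`, its method names, and the example net are assumptions made for illustration; the net is a hypothetical one loosely modeled on the behavior in Table 1 and is not claimed to be the net of Figure 1. Markings are represented as multisets using `collections.Counter`.

```python
from collections import Counter


class PetriNet:
    """Minimal Petri net over a flow relation given as (source, target) arcs."""

    def __init__(self, arcs):
        self.arcs = list(arcs)

    def preset(self, node):
        """Input nodes of `node`."""
        return [source for source, target in self.arcs if target == node]

    def postset(self, node):
        """Output nodes of `node`."""
        return [target for source, target in self.arcs if source == node]

    def enabled(self, marking, transition):
        """A transition is enabled if every input place holds a token."""
        return all(marking[place] >= 1 for place in self.preset(transition))

    def fire(self, marking, transition):
        """Return the marking reached by firing `transition`."""
        if not self.enabled(marking, transition):
            raise ValueError(f"{transition} is not enabled")
        new_marking = Counter(marking)
        new_marking.subtract(self.preset(transition))   # consume input tokens
        new_marking.update(self.postset(transition))    # produce output tokens
        return +new_marking                             # drop empty places


# Hypothetical net: repeated tossing, then getting up, then living room motion.
net = PetriNet([
    ("p0", "Tossing & turning"), ("Tossing & turning", "p0"),
    ("p0", "Getting up"), ("Getting up", "p1"),
    ("p1", "Living room motion"), ("Living room motion", "p2"),
])

marking = Counter({"p0": 1})
for transition in ["Tossing & turning", "Getting up", "Living room motion"]:
    marking = net.fire(marking, transition)
print(dict(marking))
```

Keeping the marking immutable from the caller's point of view (each firing returns a new `Counter`) makes it easy to explore alternative firing sequences from the same state.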

3 On the Quality of Label Refinements for Process Mining

Figure 2: Petri nets, in panels (a) and (b), discovered from two event logs obtained from the same event set with different relabeling functions.

Figure 3: Comparing two event relabeling functions. The diagram shows an event set that is grouped by a trace attribute into an event log; two relabeling functions each produce a relabeled event log, and process discovery on each relabeled log produces a process model.

Process discovery algorithms discover a process model based on an event log, where event labels are obtained by applying an event relabeling function to an original log. The main quality metrics for discovered process models are fitness, precision, generalization and simplicity Aalst2011. Fitness represents the share of the behavior seen in the log that is allowed by the process model. Precision aims at narrowing the set of traces that belong to the language of the discovered process model but were not observed in the event log. Generalization aims at preventing overfitting, and simplicity measures the “understandability” and “well-structuredness” of models.

Intuitively, an event relabeling function is better than another one if it improves the quality of the discovered model along these quality dimensions. However, the quality metrics are currently defined in such a way that only results of discovery algorithms applied to the very same log can be compared, while two different relabeling functions produce logs with different event labels. The Petri net in Figure 2 has perfect precision and fitness for the event log with labels as shown in the refined label column of Table 1. At the same time, Petri net has perfect fitness and precision for the event log with labels as in the sensor column of Table 1. However, Petri net is useful for the purpose of sending a reminder message to take medicines after getting up, while Petri net is not. This suggests that Petri net is more precise than , but only with respect to the original log. Thus we have to make the comparison in the context of the original log. Suppose we have a set of events , which is part of some universe of events . We choose a case identifier and build an event log from . Then we choose relabeling functions and with and obtain and (see Figure 3). Applying process discovery to and results in two process models, which respectively accept languages and . These languages cannot be compared directly, since they contain traces consisting of different event labels. Precision metrics look at “redundant” traces in the mined models with respect to the log used as input for the discovery algorithm (see e.g. Munoz2010 ; Rozinat2008 ). Using the inverse functions , , every trace of and can be mapped to a set of traces built from the events from . Taking the union of the sets obtained with , over the traces of the languages, we obtain comparable languages and can conclude whether the relabeling function results in a model that is more precise with respect to the original log.

Fitness and simplicity of the models depend mostly on the performance of the process discovery algorithm, and not on the choice of the relabeling function. Precision defined in terms of events of the original universe of events is however highly dependent on the appropriateness of the relabeling function: choosing a more refined relabeling function can increase the precision by eliminating the behavior that would be allowed in the model discovered with a more abstract relabeling function. Generalization can potentially suffer as the result of a higher precision.

3.1 Label Refinement Quality

The comparison of the languages generated by models is not feasible due to its complexity; for many classes of process models, including Petri nets, the problem of language inclusion is undecidable. Therefore, we need a different, practical approach to deciding on the usefulness of a relabeling function refinement. We start by discussing the usefulness by comparing the discovered models.

Consider event log , relabeling functions such that , and event logs . Let the in Figure 4 be the Petri nets obtained by applying process discovery to respectively. The square inside the transition between places and indicates that it is a subprocess.



Figure 4: is a non-useful refinement and is a useful refinement of .

We can see that refinement does not lead to a meaningful interpretation of as and , since the behavior of the model is not related to the choice between and : transitions labeled with and have the same input and output places. Refinement does not provide new insight and unnecessarily harms the understandability of the Petri net by creating more transitions than needed. On the other hand, results in a gain of precision, as , does not contain and , while does not distinguish between and , which suggests that both types of traces are possible.

4 Evaluation Method for Label Refinements for Process Models

In the previous section we showed that we can compare the usefulness of a label refinement by inspecting the Petri net obtained with process discovery. A naive way to evaluate label refinement would be to apply process discovery to all possible label refinements. The number of possible label refinements to consider can however be large and process discovery is a computationally expensive task. Therefore, this naive approach quickly becomes computationally infeasible. We now present a way to estimate the usefulness of a label refinement based on statistics and log relations.

Algorithm 1 shows the steps of the label refinement evaluation method. The evaluation method consists of an entropy-based component that measures whether a label refinement makes the log statistics more unbalanced, and a statistical test that checks whether the label refinement makes a statistically significant difference to at least one of the log statistics. The following sections describe the log statistics, the entropy-based measure, and the statistical testing respectively.

4.1 Log Statistics

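Event ordering patterns are crucial to most process discovery algorithms. Table 2 provides an overview of well-known log-based ordering relations described in process discovery literature Aalst2004; Dongen2004; Wen2006; Weijters2011 and provides examples. Let $L$ be an event log and let $a, b \in A_L$ be two event labels that occur in $L$. Formal definitions of these log-based ordering statistics are as follows:

  • $\#^{+}_{>}(a, b)$ is the number of occurrences of $a$ in the traces of $L$ that are directly followed by $b$, i.e. in some $\sigma \in L$ we have $\sigma(i)_l = a$ and $\sigma(i+1)_l = b$, where $\sigma(i)_l$ is the label of the $i$-th event of $\sigma$ (direct successor); $\#^{-}_{>}(a, b)$ is the number of occurrences of $a$ which are not directly followed by $b$;

  • $\#^{+}_{\triangle}(a, b)$ and $\#^{-}_{\triangle}(a, b)$ are the numbers of occurrences of $a$ that are, respectively are not, followed by $ba$: for a trace $\sigma \in L$ and $i \in \{1, \dots, |\sigma| - 2\}$, $\sigma(i)_l = a$ and $\sigma(i+1)_l = b$ and $\sigma(i+2)_l = a$ (length-two loops);

  • $\#^{+}_{\gg}(a, b)$ and $\#^{-}_{\gg}(a, b)$ are the numbers of occurrences of $a$ that are, respectively are not, eventually followed by $b$: for a trace $\sigma \in L$ with $i < j$, $\sigma(i)_l = a$ and $\sigma(j)_l = b$ (direct or indirect successor).

In the general sense, let $\#^{+}_{r}(a, b)$ and $\#^{-}_{r}(a, b)$ be the count of the number of $a$’s that do, respectively do not, satisfy relation $r$ in log $L$ with respect to $b$.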
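The three ordering statistics above can be counted with a few lines of Python. The sketch below represents a log as a list of label sequences and uses only the events visible in Table 1 (the elided events are ignored). The function names and the predicate-based design are assumptions made for this illustration.

```python
def directly_followed(trace, i, b):
    """Position i is directly followed by label b."""
    return i + 1 < len(trace) and trace[i + 1] == b


def length_two_loop(trace, i, b):
    """Position i is followed by b and then by the label at position i again."""
    return (i + 2 < len(trace)
            and trace[i + 1] == b
            and trace[i + 2] == trace[i])


def eventually_followed(trace, i, b):
    """Label b occurs somewhere after position i."""
    return b in trace[i + 1:]


def count_statistic(log, a, b, relation):
    """Return (yes, no): occurrences of a that do / do not satisfy the relation."""
    yes = no = 0
    for trace in log:
        for i, label in enumerate(trace):
            if label != a:
                continue
            if relation(trace, i, b):
                yes += 1
            else:
                no += 1
    return yes, no


T, G, LR = "Tossing & turning", "Getting up", "Living room motion"
# Refined labels of the five days shown in Table 1.
log = [
    [T] * 4 + [G, LR],
    [T] * 3 + [G, LR],
    [T] * 5 + [G, LR],
    [T] * 2 + [G, LR],
    [T] * 2 + [G, LR],
]

print(count_statistic(log, G, LR, directly_followed))
print(count_statistic(log, T, LR, directly_followed))
print(count_statistic(log, T, LR, eventually_followed))
```

Passing the relation as a function keeps the counting loop identical for all statistics, so further relations can be added without touching `count_statistic`.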

Ordering relation         | Miners using the relation
Direct successor          | miner of Aalst2004, miner of Wen2006, Multi-phase miner Dongen2004, Heuristics miner Weijters2011
Length-two loop           | miner of Wen2006, Multi-phase miner Dongen2004, Heuristics miner Weijters2011
Direct/indirect successor | miner of Wen2006, Heuristics miner Weijters2011
Table 2: Log-based ordering relations and their use by process discovery algorithms

Table 3: A log statistic in contingency table form

Let be an event log. Let and be two relabeling functions that are to be compared, such that . Let and . Let and have the property , that is, refines activity into distinct activities and . The difference in control flow between and can be expressed as the dissimilarity in log-based ordering statistics between event label and on the one hand, and and on the other hand. Each log-based ordering statistics of and with regard to any other activity can be formulated in the form of a contingency table, as shown in Table 3.

4.2 Information Gain

The binary entropy function, $H_b(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)$, where $0 \le p \le 1$ and $0 \log_2 0$ is taken to be $0$, is a measure of uncertainty. Applied to a log statistic, the binary entropy function represents a degree of nondeterminism. Deterministic, unbalanced log statistics are helpful to process discovery algorithms that operate on log statistics, as they provide low uncertainty to the mining algorithm. Low entropy in the log statistics indicates high predictability of the process, making it easier for process discovery algorithms to return a sensible process model.

Consider the contingency tables in Table 4, based on log statistics obtained from Table 1 between the events labeled Tossing & turning and Getting up and the events labeled Living room motion. On the right hand side of the table, separated by the bar, are the log statistics of the before-split label in the before-split log. All five events with label Getting up directly precede an event with label Living room motion, while none of the sixteen events with label Tossing & turning directly precedes an event with label Living room motion. Furthermore, no event with a refined label directly or eventually follows an event with label Living room motion, and all events with refined labels eventually precede an event with label Living room motion.

Log statistics with a high degree of non-determinism, like the directly precedes statistic of the Bedroom motion events before the split, might confuse a mining algorithm as there is no clear structure here: the Bedroom motion event might directly precede Living room motion, but most of the time it does not. After the split we see a completely deterministic directly precedes statistic, where Tossing & turning never and Getting up always directly precedes Living room motion. This increased determinism is reflected by the entropy of the directly precedes statistic before and after the split. Before the split we have a binary entropy of $H_b(5/21) \approx 0.79$ bit in the directly precedes statistic, compared to $H_b(0/16) = 0$ bit of entropy for Tossing & turning and $H_b(5/5) = 0$ bit of entropy for Getting up. The conditional entropy of the log statistic after the split is the weighted average of the entropy of the labels created in the split, which is $\frac{16}{21} \cdot 0 + \frac{5}{21} \cdot 0 = 0$. The information gain of this label split with regard to the directly precedes Living room motion statistic is equal to the total entropy of the log statistic prior to the split, minus the conditional entropy after the split, thus $0.79 - 0 = 0.79$. Relative information gain Kullback1951 is a metric that provides insight into the ratio of bits of entropy reduced by a refinement, and can be calculated by dividing the information gain by the before-split entropy. The relative information gain of the directly precedes Living room motion statistic is $0.79 / 0.79 = 1$. Figure 2 shows the effect of this label refinement on the resulting Petri net obtained by process discovery.
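The arithmetic of this example can be reproduced with a short Python sketch. It takes the directly precedes Living room motion counts from Table 4 as input; the function names and the (yes, no) pair representation are assumptions made for illustration.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy in bits, with 0 * log2(0) taken to be 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def split_entropies(counts_per_label):
    """Entropy before the split and conditional entropy after the split.

    counts_per_label holds one (yes, no) pair per refined label.
    """
    total_yes = sum(yes for yes, _ in counts_per_label)
    n = sum(yes + no for yes, no in counts_per_label)
    before = binary_entropy(total_yes / n)
    after = sum((yes + no) / n * binary_entropy(yes / (yes + no))
                for yes, no in counts_per_label if yes + no > 0)
    return before, after


# Directly precedes Living room motion: Tossing & turning, then Getting up.
directly_precedes = [(0, 16), (5, 0)]

before, after = split_entropies(directly_precedes)
information_gain = before - after
relative_information_gain = information_gain / before
print(round(before, 4), after, round(relative_information_gain, 4))
```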

So far we have calculated the relative information gain for a single log statistic. A label refinement, however, can have an impact on multiple log statistics at once. We need a measure that integrates the information gain values of all log statistics to express the quality of a label refinement with respect to the determinism of the log statistics. We therefore sum over the entropy of all log statistics before the label split to obtain the total before-split entropy. We sum over the conditional entropies of all log statistics after the label split to obtain the total after-split entropy. Information gain and relative information gain are calculated as before. We let be the function that returns the relative information gain based on the pre-split log and post-split log , where the set of refined label pairs in from which the log statistics are used corresponds to , with the corresponding label in .
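A sketch of this aggregation in Python is given below. It sums the before-split entropies and the after-split conditional entropies over the four contingency tables of Table 4 and divides the total gain by the total before-split entropy. The function name, the input layout, and the choice to return 0 when the total before-split entropy is 0 are assumptions of this sketch, which also assumes that no table is empty.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy in bits, with 0 * log2(0) taken to be 0."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)


def relative_information_gain(statistics):
    """Aggregate relative information gain over several log statistics.

    statistics is a list of contingency tables; each table is a list with one
    (yes, no) pair per refined label.
    """
    total_before = 0.0
    total_after = 0.0
    for table in statistics:
        yes_total = sum(yes for yes, _ in table)
        n = sum(yes + no for yes, no in table)
        total_before += binary_entropy(yes_total / n)
        total_after += sum((yes + no) / n * binary_entropy(yes / (yes + no))
                           for yes, no in table if yes + no > 0)
    if total_before == 0.0:
        return 0.0  # assumption: nothing to gain when there is no uncertainty
    return (total_before - total_after) / total_before


# Table 4, columns Tossing & turning and Getting up, as (yes, no) pairs.
table_4 = [
    [(0, 16), (0, 5)],   # directly follows Living room motion
    [(0, 16), (5, 0)],   # directly precedes Living room motion
    [(0, 16), (0, 5)],   # eventually follows Living room motion
    [(16, 0), (5, 0)],   # eventually precedes Living room motion
]

rig = relative_information_gain(table_4)
print(rig)
```

Summing entropies before dividing means that statistics which were already deterministic before the split contribute nothing to either total, so they do not dilute the result.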

Relation to Living room motion | Holds | Tossing & turning | Getting up | Bedroom motion
Directly follows               | yes   | 0                 | 0          | 0
Directly follows               | no    | 16                | 5          | 21
Directly precedes              | yes   | 0                 | 5          | 5
Directly precedes              | no    | 16                | 0          | 16
Eventually follows             | yes   | 0                 | 0          | 0
Eventually follows             | no    | 16                | 5          | 21
Eventually precedes            | yes   | 16                | 5          | 21
Eventually precedes            | no    | 0                 | 0          | 0
Table 4: Contingency tables for comparing the behavior of the two refined labels

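Input: Event log $L$, relabeling functions $l_1$ and $l_2$ such that $l_2$ is a refinement of $l_1$.
Output: the relative information gain of $l_2$ w.r.t. $l_1$.
Parameters: Set of log-based ordering statistics $S$,
Significance level $\alpha$.

  • all_significant_different = true; $L_1 = l_1(L)$; $L_2 = l_2(L)$;

  • split_set = the set of pairs of labels in $L_2$ that refine the same label of $L_1$;

  • For each pair $(a_1, a_2)$ in split_set:

      • pair_significant_different = false;

      • For each label $b$ in $L_2$:

          • For each statistic $s \in S$:

              • p = p-value of Fisher’s exact test on the contingency table of $s$ for $a_1$ and $a_2$ with regard to $b$;

              • If (p < $\alpha$) pair_significant_different = true;

      • If (!pair_significant_different)

          • all_significant_different = false;

  • If (all_significant_different)

      • return the relative information gain of $L_2$ with respect to $L_1$;

  • Else return 0;

Algorithm 1 Algorithm of the label refinement statistical evaluation method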
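The control flow of Algorithm 1 can be sketched in Python as follows. To keep the example short, the p-values and the relative information gain are assumed to be computed elsewhere and passed in; the function name, the dictionary layout, and the return value of 0 for a refinement without a significant difference are assumptions of this sketch. The p-values in the usage example are two-sided Fisher p-values that we computed by hand for the four tables of Table 4, and the significance level of 0.05 is a hypothetical choice.

```python
def evaluate_refinement(p_values_per_pair, relative_information_gain, alpha):
    """Return the relative information gain if every refined label pair
    differs significantly on at least one log statistic, and 0.0 otherwise.

    p_values_per_pair maps each refined label pair to the p-values of all
    tests performed for that pair (one per statistic and per other label).
    """
    all_significant_different = True
    for pair, p_values in p_values_per_pair.items():
        pair_significant_different = any(p < alpha for p in p_values)
        if not pair_significant_different:
            all_significant_different = False
    if all_significant_different:
        return relative_information_gain
    return 0.0


# One refined pair; tests against Living room motion in the order of Table 4.
p_values = {
    ("Tossing & turning", "Getting up"): [1.0, 1 / 20349, 1.0, 1.0],
}

score = evaluate_refinement(p_values, relative_information_gain=1.0, alpha=0.05)
print(score)
```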

4.3 Statistical Testing

Relative information gain can be high by chance when the refined labels generated by a refinement are infrequent. Testing the differences in log statistics for statistical significance, in addition to calculating relative information gain, lets us distinguish information gain obtained by chance from actual information gain. Fisher’s exact test Fisher1934 is a statistical significance test for the analysis of contingency tables. When applied to the table above, it calculates a p-value for the null hypothesis that events of the two refined labels are equally likely to hold the log relation with regard to the other label. Fisher’s exact test assumes that individual observations are independent and that row and column totals are fixed. Independence of individual observations might be affected by the grouping of events in traces; in this paper we treat independence of individual observations as a working assumption. The test was designed for experiments where both the row and column totals were conditioned. In our setting, the column totals are conditioned by the relabeling function, as the number of events of each label depends on the relabeling. The row totals, however, are not conditioned but observed. Fisher’s exact test is, strictly speaking, not exact when the row or column totals are unconditioned, but will instead be slightly conservative McDonald2009 , meaning that when the null hypothesis is true, the probability of obtaining a p-value less than or equal to the significance level is less than the significance level.

Fisher’s exact test is computationally expensive for large numbers of observations. For large sample sizes, either the χ² test of independence or the G-test of independence can be used; both are inaccurate for small sample sizes. A popular guideline is not to use the χ² test of independence or the G-test for sample sizes below one thousand McDonald2009 . The computational complexity of the evaluation procedure is . Many process discovery algorithms are exponential in the number of labels Aalst2012 . Based on this we can conclude that statistical evaluation of label refinements is computationally less expensive than checking label refinement usefulness through process discovery.
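To make the test concrete, the following sketch computes a two-sided Fisher’s exact test p-value for a 2x2 contingency table using only the Python standard library. The function name `fisher_exact_two_sided` and the cell naming are assumptions made for illustration; in the setting described above, the columns would correspond to the two refined labels and the rows to whether the log relation holds.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are at most as likely as the
    # observed one; the small factor guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
```

For the hypothetical table `[[3, 1], [1, 3]]`, the call `fisher_exact_two_sided(3, 1, 1, 3)` returns 34/70, approximately 0.486. Because every admissible table is enumerated, the cost grows with the margins, which is why the text recommends approximate tests for large samples.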

4.4 Correcting for Multiple Testing

The computational complexity indicates the number of hypothesis tests performed. When a large set of potential label refinements is evaluated, the evaluation method described is susceptible to the multiple testing problem: the larger the set of hypotheses tested, the higher the probability of incorrectly rejecting the null hypothesis in at least one of the tests. Applying a Bonferroni correction Dunn1959 ; Dunn1961 to the hypothesis tests performed in the statistical evaluation method keeps the familywise error rate at or below the chosen significance level.
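The Bonferroni correction itself is a one-line computation: each of the m tests is evaluated at level α/m. The sketch below illustrates this; the function names `bonferroni_threshold` and `reject_null` are assumptions made for illustration.

```python
from typing import List, Sequence


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test significance level that bounds the familywise error rate by alpha."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return alpha / num_tests


def reject_null(p_values: Sequence[float], alpha: float) -> List[bool]:
    """Flag each hypothesis whose p-value passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]
```

For example, with four hypothesis tests and an illustrative α of 0.05 (an assumed value), each test is evaluated at 0.05 / 4 = 0.0125.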

4.5 Example Case

Consider the event log in Table 1 and imagine a scenario where a home care worker knows from experience that the elderly person always sets his alarm clock at : AM. Based on such expert knowledge we can define a label refinement such that all bedroom movements after : AM are considered Getting up events, while all other bedroom movements are considered Tossing & turning events. The rightmost column shows the refined labels obtained through this expert relabeling function. To evaluate the usefulness of this label refinement from a process model point of view, we apply the statistical evaluation method described in Section 4. We set the significance level threshold to the frequently used value of .

Log statistic P-value
Directly follows
Directly precedes
Eventually follows
Eventually precedes
Table 5: Results of the statistical tests for the evaluation of label refinement usefulness

Table 5 shows the outcome of the statistical tests performed as part of the label refinement usefulness evaluation. Four hypothesis tests have been performed; after Bonferroni correction, each hypothesis is tested at significance level . The directly follows statistic of Tossing & turning and Getting up with Living room motion differs statistically significantly at this significance level. The label refinement constructed with expert knowledge is therefore found to be useful by the statistical evaluation.

5 Real Life Evaluation

We apply our label refinement evaluation method to a set of candidate label refinements on the van Kasteren smart home environment data set Kasteren2008 in order to illustrate the effects of label splits in the context of process mining of real life processes. The van Kasteren data set consists of 1285 events divided over fourteen different sensors. Events are segmented into days, from midnight to midnight, to define the cases in the event log. The candidate set of label refinements consists of splitting each of the fourteen event types into two event types based on their time of day, such that events where the time since the start of the day is smaller than the median for that event type are separated from events where it is equal to or larger than the median. Figure 5 shows the dependency graph obtained with the Heuristics Miner Weijters2011 . A dependency graph depicts causal relations between activities that meet a certainty threshold. A dependency graph can be directly converted into a Petri net Weijters2011 ; however, for the sake of readability we include the dependency graphs instead of the Petri nets. The precision Munoz2010 of the Petri net corresponding to Figure 5 is 0.56 on a scale from 0 to 1.
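The median-based candidate refinement described above can be sketched as follows. This is an illustration only: the `Event` tuple layout and the function names are assumptions, and the `_1` / `_2` suffixes follow the naming used for the refined labels in this section.

```python
from datetime import datetime
from statistics import median
from typing import List, Tuple

# Hypothetical minimal event record: (label, timestamp).
Event = Tuple[str, datetime]


def seconds_since_midnight(ts: datetime) -> int:
    """Time since the start of the day, in seconds."""
    return ts.hour * 3600 + ts.minute * 60 + ts.second


def median_time_split(events: List[Event], label: str) -> List[Event]:
    """Split `label` into `label_1` (before the median time of day)
    and `label_2` (at or after the median time of day)."""
    times = [seconds_since_midnight(ts) for lab, ts in events if lab == label]
    if not times:
        return list(events)  # nothing to refine
    cutoff = median(times)
    refined = []
    for lab, ts in events:
        if lab == label:
            suffix = "_1" if seconds_since_midnight(ts) < cutoff else "_2"
            lab = lab + suffix
        refined.append((lab, ts))
    return refined
```

Applying `median_time_split` once per event type yields one candidate refinement per sensor, which matches the size of the candidate set evaluated here.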

Figure 5: Heuristics net showing original van Kasteren data set
Figure 6: Heuristics net showing the label refinement on hall-bathroom door on the van Kasteren data set
Figure 7: Heuristics net showing the label refinement on cups cupboard on the van Kasteren data set

Out of the fourteen candidate label refinements, two are selected by our approach. The first label refinement found is the split of Hall-bathroom door into Hall-bathroom door_1 and Hall-bathroom door_2, for events with a timestamp below, respectively at or above, the median time of day of Hall-bathroom door events. The resulting labels of this refinement are statistically significantly different in terms of their eventually follows relation with Front door (p-value: ) and their eventually follows relation with Plates cupboard (p-value: ) and Microwave . The relative information gain on the whole event log caused by this label refinement is 3.47%. Figure 6 shows a Heuristics Net mined with the Heuristics Miner Weijters2011 on the van Kasteren log with the refined Hall-bathroom door label. The model discovered on the log with this label refinement (Figure 6) has a precision of 0.69, up from 0.53 without the refinement. The increased precision shows that the label refinement helps restrict the share of behavior allowed by the model that is not covered by the event log.

The second label refinement found is the split of Cups cupboard into Cups cupboard_1 and Cups cupboard_2. The resulting labels of this refinement are statistically significantly different in terms of their eventually precedes relation with Groceries cupboard (p-value: ) and their eventually follows relation with Fridge (p-value: ). The relative information gain on the whole event log caused by this label refinement is 0.53%. Figure 7 shows a Heuristics Net mined with the Heuristics Miner on the van Kasteren log with the refined Cups cupboard label, of which the precision is 0.61, up from 0.53 without the refinement. The label refinement with the higher information gain also yields the larger improvement in precision, which agrees with the intuition that more deterministic log statistics help the miner discover structured, non-flower-like models.

6 Conclusion & Future Work

We have provided a theoretical and conceptual notion of when label refinements and abstractions are useful from a process discovery point of view. Based on this notion of usefulness, we have presented a framework based on statistics and information theory to evaluate the usefulness of a label refinement or abstraction. In addition, we have shown the applicability of this statistical framework through a real life smart home case, where our method selected two label refinements out of a larger candidate set, both of which increased the precision of the resulting process model. Methods for automatic inference of useful label refinements from event attributes are still to be explored. Such methods may generate a set of candidate label refinements, after which the statistical evaluation method described in this paper can be used to select the most promising ones.


References

  • [1] J. Brank, M. Grobelnik, and D. Mladenic. A survey of ontology evaluation techniques. In Proceedings of the conference on data mining and data warehouses, pages 166–170, 2005.
  • [2] L. Chen, J. Hoey, C. Nugent, D. Cook, and Z. Yu. Sensor-based activity recognition. IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, 42(6):790–808, 2012.
  • [3] O. J. Dunn. Estimation of the medians for dependent variables. The Annals of Mathematical Statistics, 30(1):192–197, 1959.
  • [4] O. J. Dunn. Multiple comparisons among means. Journal of the American Statistical Association, 56(293):52–64, 1961.
  • [5] R. A. Fisher. Statistical methods for research workers. Number 5. Genesis Publishing Pvt Ltd, 1934.
  • [6] S. Kullback and R. A. Leibler. On information and sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, 1951.
  • [7] J. Li, D. Liu, and B. Yang. Process mining: Extending α-algorithm to mine duplicate tasks in process logs. In Advances in Web and Network Technologies, and Information Management, volume 4537 of LNCS, pages 396–407. Springer Berlin Heidelberg, 2007.
  • [8] A. Maedche. Ontology learning for the semantic web, volume 665. Springer Science & Business Media, 2012.
  • [9] J. H. McDonald. Handbook of biological statistics, volume 2. Sparky House Publishing Baltimore, MD, 2009.
  • [10] J. Muñoz Gama and J. Carmona. A fresh look at precision in process conformance. In Business Process Management, volume 6336 of LNCS, pages 211–226. Springer Berlin Heidelberg, 2010.
  • [11] J. R. Quinlan. C4.5: programs for machine learning. Elsevier, 2014.
  • [12] W. Reisig and G. Rozenberg. Lectures on Petri nets I: basic models: advances in Petri nets, volume 1491. Springer Science & Business Media, 1998.
  • [13] A. Rozinat and W. M. P. van der Aalst. Conformance checking of processes based on monitoring real behavior. Information Systems, 33(1):64–95, 2008.
  • [14] T. Sztyler, J. Völker, J. Carmona, O. Meier, and H. Stuckenschmidt. Discovery of personal processes from labeled sensor data–an application of process mining to personalized health care. In Proceedings of the International Workshop on Algorithms & Theories for the Analysis of Event Data, ATAED, pages 22–23, 2015.
  • [15] W. M. P. van der Aalst. Process mining: discovery, conformance and enhancement of business processes. Springer Science & Business Media, 2011.
  • [16] W. M. P. van der Aalst. Distributed process discovery and conformance checking. In Fundamental Approaches to Software Engineering, LNCS, pages 1–25. Springer, 2012.
  • [17] W. M. P. van der Aalst, A. J. M. M. Weijters, and L. Maruster. Workflow mining: Discovering process models from event logs. IEEE Transactions on Knowledge and Data Engineering, 16(9):1128–1142, 2004.
  • [18] B. F. van Dongen and W. M. P. van der Aalst. Multi-phase process mining: Building instance graphs. In Conceptual Modeling–ER 2004, volume 3288 of LNCS, pages 362–376. Springer, 2004.
  • [19] T. van Kasteren, A. Noulas, G. Englebienne, and B. Kröse. Accurate activity recognition in a home setting. In Proceedings of the 10th International Conference on Ubiquitous Computing, pages 1–9. ACM, 2008.
  • [20] S. K. Vanden Broucke. Advances in Process Mining: Artificial Negative Events and Other Techniques. PhD thesis, KU Leuven, 2014.
  • [21] A. J. M. M. Weijters and J. T. S. Ribeiro. Flexible heuristics miner (fhm). In Proceedings of the 2011 IEEE Symposium on Computational Intelligence and Data Mining, pages 310–317. IEEE, 2011.
  • [22] L. Wen, J. Wang, and J. Sun. Detecting implicit dependencies between tasks from event logs. Frontiers of WWW Research and Development-APWeb 2006, pages 591–603, 2006.