BottleMod: Modeling Data Flows and Tasks for Fast Bottleneck Analysis
In recent years, scientific workflows have become increasingly popular. In scientific workflows, tasks are typically treated as black boxes. Dealing with their complex interrelations to identify optimization potential and bottlenecks is therefore inherently hard. The progress of a scientific workflow depends on several factors, including the available input data, the available computational power, and the I/O and network bandwidth. Here, we tackle the problem of predicting workflow progress with very low overhead. To this end, we look at suitable formalizations for the key parameters and their interactions that are flexible enough to describe the input data consumption, the computational effort, and the output production of the workflow's tasks. At the same time, they allow for computationally simple and fast performance predictions, including a bottleneck analysis over the workflow runtime. A piecewise-defined bottleneck function is derived from the discrete intersections of the task models' limiting functions. This makes it possible to estimate the potential performance gains from overcoming the bottlenecks, and it can serve as a basis for optimized resource allocation and workflow execution.
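To illustrate how a piecewise-defined bottleneck function can follow from the intersections of limiting functions, here is a minimal sketch. It is not the BottleMod implementation: it assumes, purely for illustration, that each resource imposes a linear upper bound on task progress, `limit(t) = offset + rate * t`, and that the task's progress is the minimum of these bounds. The function name `bottleneck_segments`, the resource names, and all numbers are hypothetical.

```python
from itertools import combinations


def bottleneck_segments(limits, t_end):
    """Return [(start, end, resource)] naming the limiting resource over [0, t_end].

    limits maps a resource name to (offset, rate), an assumed linear bound
    limit(t) = offset + rate * t on the progress achievable by time t.
    """
    # Candidate breakpoints: the interval ends plus every pairwise
    # intersection of two limiting functions that falls inside the interval.
    points = {0.0, float(t_end)}
    for (a1, b1), (a2, b2) in combinations(limits.values(), 2):
        if b1 != b2:  # parallel lines never intersect
            t = (a2 - a1) / (b1 - b2)
            if 0.0 < t < t_end:
                points.add(t)

    times = sorted(points)
    segments = []
    for start, end in zip(times, times[1:]):
        # Between two consecutive breakpoints the ordering of the bounds is
        # fixed, so evaluating at the midpoint identifies the bottleneck.
        mid = (start + end) / 2.0
        name = min(limits, key=lambda n: limits[n][0] + limits[n][1] * mid)
        if segments and segments[-1][2] == name:
            # Same bottleneck as the previous piece: extend that piece.
            segments[-1] = (segments[-1][0], end, name)
        else:
            segments.append((start, end, name))
    return segments


# Hypothetical bounds: input data arrives steadily, CPU has a head start
# but a lower rate, and I/O caps progress at a constant level.
example_limits = {"input": (0.0, 2.0), "cpu": (4.0, 1.0), "io": (10.0, 0.0)}
for segment in bottleneck_segments(example_limits, 10.0):
    print(segment)
```

In this toy setting, the task is limited by input data until t = 4, by CPU until t = 6, and by I/O afterwards. Raising a bound only helps during the interval in which that resource is the bottleneck, which is the kind of gain estimate the abstract refers to.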