Optimizing Multiple Multi-Way Stream Joins
We address the joint optimization of multiple stream joins in a scale-out architecture by tailoring prior work on multi-way stream joins to predicate-driven data partitioning schemes. We present an integer linear programming (ILP) formulation for selecting the partitioning and tuple routing with minimal probe load, and describe how routing and operator placement can be rewired dynamically in response to changing data characteristics and the arrival or expiration of queries. The presented algorithms and optimization schemes are implemented in CLASH, a data stream processor developed in our group that translates queries into deployable Apache Storm topologies after optimization. Experiments conducted over real-world data demonstrate the potential of multi-query optimization of multi-way stream joins and the effectiveness and feasibility of the ILP-based optimization.