A Comprehensive Survey on Parallelization and Elasticity in Stream Processing
Stream Processing (SP) has evolved into the leading paradigm for processing and gaining value from the high volume of streaming data produced, for example, in the domain of the Internet of Things. An SP system is a middleware that deploys a network of operators between data sources, such as sensors, and the consuming applications. SP systems typically face intense and highly dynamic data streams. Parallelization and elasticity enable SP systems to process these streams with continuously high quality of service. The current research landscape provides a broad spectrum of methods for parallelization and elasticity in SP. Each method makes specific assumptions and focuses on particular aspects of the problem. However, the literature lacks a comprehensive overview and categorization of the state of the art in SP parallelization and elasticity, which is necessary to consolidate the state of the research and to plan future research directions on that basis. Therefore, in this survey, we study the literature and develop a classification of current methods for both parallelization and elasticity in SP systems.