A quality model for evaluating and choosing a stream processing framework architecture

01/25/2019
by Youness Dendane, et al.

Today, we have to deal with large volumes of data (Big Data), and we need to decide which architectural framework to use to analyze data coming from different areas. Processing these data becomes problematic, and even more so when the data are continuous. In the traditional approach, to process data you first receive it, then store it, and then query it. This is what we call batch processing. It works well for large amounts of data, but it reaches its limits when fast (or real-time) results are required, as with financial trades, sensors, user session activity, etc. The solution to this problem is stream processing. In the stream processing approach, data arrive record by record and are processed directly rather than being stored first. Results are therefore expected immediately, with a latency that may vary in real time. In this paper, we propose a quality assessment model to evaluate and choose stream processing frameworks. We briefly describe different architectural frameworks that address stream processing, such as Kafka, Spark Streaming and Flink. Using our quality model, we present a decision tree that helps engineers choose a framework according to the quality aspects. Finally, we evaluate our model with a case study of Twitter and Netflix streaming.
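The contrast between batch and stream processing described above can be illustrated with a minimal Python sketch. It is not taken from the paper and does not use Kafka, Spark Streaming or Flink; the function names `batch_average` and `stream_average` and the sample `readings` are hypothetical, chosen only to show the difference between "store, then query" and "process each record as it arrives".

```python
from typing import Iterable, Iterator, List


def batch_average(records: Iterable[float]) -> float:
    """Batch style: receive and store every record, then query the stored data."""
    stored: List[float] = list(records)  # store step: keep all records
    return sum(stored) / len(stored)     # query step: runs once, at the end


def stream_average(records: Iterable[float]) -> Iterator[float]:
    """Stream style: update the result per record without storing the records."""
    count = 0
    total = 0.0
    for value in records:
        count += 1
        total += value
        yield total / count  # an up-to-date result is available after each record


# Hypothetical sensor readings, used only for illustration.
readings = [10.0, 20.0, 30.0, 40.0]

batch_result = batch_average(readings)
stream_results = list(stream_average(readings))

print(batch_result)    # single result after all data has been stored
print(stream_results)  # one result per arriving record
```

In the batch version no result exists until every record has been stored, whereas the streaming version keeps only a running count and total and emits a result after each record, which is the property that matters when low latency is required.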

research
06/18/2018

AlertMix: A Big Data platform for multi-source streaming data

The demand for stream processing is increasing at an unprecedented rate....
research
04/28/2016

Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study

While cluster computing frameworks are continuously evolving to provide ...
research
12/27/2016

Distributed Real-Time Sentiment Analysis for Big Data Social Streams

Big data trend has enforced the data-centric systems to have continuous ...
research
09/28/2017

Intelligent Perioperative System: Towards Real-time Big Data Analytics in Surgery Risk Assessment

Surgery risk assessment is an effective tool for physicians to manage th...
research
04/12/2018

BigSR: an empirical study of real-time expressive RDF stream reasoning on modern Big Data platforms

The trade-off between language expressiveness and system scalability (E&...
research
08/09/2021

Towards a Generic Multimodal Architecture for Batch and Streaming Big Data Integration

Big Data are rapidly produced from various heterogeneous data sources. T...
research
07/20/2018

Apache Spark Streaming and HarmonicIO: A Performance and Architecture Comparison

Studies have demonstrated that Apache Spark, Flink and related framework...
