Property-based testing for Spark Streaming

12/20/2018
by Adrian Riesco, et al.

Stream processing has reached the mainstream in recent years: a new generation of open source distributed stream processing systems, designed to scale horizontally on commodity hardware, has brought the capability to process high-volume, high-velocity data streams to companies of all sizes. In this work we propose a combination of temporal logic and property-based testing (PBT) to deal with the challenges of testing programs that employ this programming model. We formalize our approach in a discrete-time temporal logic for finite words, with some additions that improve the expressiveness of properties, including timeouts for temporal operators and a binding operator for letters. In particular, we focus on testing Spark Streaming programs written with the Spark API for the functional language Scala, using the PBT library ScalaCheck. To that end, we add temporal logic operators to a set of new ScalaCheck generators and properties, as part of our testing library sscheck. Under consideration in Theory and Practice of Logic Programming (TPLP).
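To make the idea of timed temporal operators over finite words concrete, here is a minimal, language-neutral sketch in plain Python. It is not the sscheck or ScalaCheck API: every name in it (always, eventually, gen_word, for_all_words, keep_even) is hypothetical and chosen for illustration only. It assumes that a word is a finite list of letters, that each letter models one micro-batch as a list of integers, and that a timeout simply bounds how many letters an operator inspects. It also uses a simplified two-valued reading of the operators; the logic in the paper may treat words shorter than the timeout differently.

```python
import random

# Assumption: a "word" is a finite list of letters, and each letter stands for
# one micro-batch, modelled here as a plain list of integer records.

def always(pred, timeout):
    """Timed 'always': pred must hold for each of the first `timeout` letters."""
    def check(word):
        return all(pred(batch) for batch in word[:timeout])
    return check

def eventually(pred, timeout):
    """Timed 'eventually': pred must hold for some letter among the first `timeout`."""
    def check(word):
        return any(pred(batch) for batch in word[:timeout])
    return check

def gen_word(rng, length, max_batch=5):
    """Hypothetical generator: a random word of `length` batches of small integers."""
    return [
        [rng.randint(0, 100) for _ in range(rng.randint(0, max_batch))]
        for _ in range(length)
    ]

def for_all_words(prop, transform, trials=100, length=10, seed=0):
    """Apply `transform` batch by batch to random input words and check `prop`
    on the resulting output word. Returns the first failing input word, or
    None if every trial passes."""
    rng = random.Random(seed)
    for _ in range(trials):
        word = gen_word(rng, length)
        output = [transform(batch) for batch in word]
        if not prop(output):
            return word
    return None

def keep_even(batch):
    """Stand-in for a streaming transformation under test: filter each batch."""
    return [x for x in batch if x % 2 == 0]

# Property: for the first 10 batches, every output batch contains only even numbers.
all_even = always(lambda batch: all(x % 2 == 0 for x in batch), timeout=10)
counterexample = for_all_words(all_even, keep_even)
```

Here counterexample is None because the filter satisfies the property on every generated word. A real test for Spark Streaming would generate DStream batches and run the actual transformation instead of the stand-in keep_even.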

