A Variability-Aware Design Approach to the Data Analysis Modeling Process

The massive volume of data now available has given rise to many forms of data analysis process that aim to explore this data and uncover valuable insights. Methodologies that guide the development of big data science projects, including CRISP-DM and SEMMA, are widely used in industry and academia. The data analysis modeling phase, which involves deciding on the most appropriate models to adopt, is at the core of these projects. From a software engineering perspective, however, designing and automating the activities performed in this phase is challenging. In this paper, we propose an approach to the data analysis modeling process that involves (i) the assessment of the variability inherent in the CRISP-DM data analysis modeling phase and the provision of feature models that represent this variability; (ii) the definition of a framework structural design that captures the identified variability; and (iii) the evaluation of the developed framework design in terms of the possibilities for process automation. The proposed approach advances the state of the art by offering a variability-aware design solution that can enhance system flexibility, potentially leading to novel software frameworks that significantly improve the level of automation in the data analysis modeling process.
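To make the idea of a feature model concrete, the following is a minimal sketch in Python, not the representation used in the paper. It assumes a tree of features with mandatory or optional children and "xor" (exactly one) or "or" (at least one) groups, and it checks whether a selection of features is a valid configuration. All names (`Feature`, `is_valid_configuration`, `MODEL`) and the example features themselves are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Feature:
    """A node in a feature tree (illustrative, not the paper's notation)."""
    name: str
    mandatory: bool = True   # only consulted when the parent group is "and"
    group: str = "and"       # how children combine: "and", "xor", or "or"
    children: List["Feature"] = field(default_factory=list)


def _check(feature: Feature, selected: Set[str], seen: Set[str]) -> bool:
    """Validate the subtree of a feature that is known to be selected."""
    seen.add(feature.name)
    chosen = [c for c in feature.children if c.name in selected]
    if feature.children:
        if feature.group == "xor" and len(chosen) != 1:
            return False  # exactly one alternative must be picked
        if feature.group == "or" and len(chosen) < 1:
            return False  # at least one alternative must be picked
        if feature.group == "and":
            for child in feature.children:
                if child.mandatory and child.name not in selected:
                    return False  # a mandatory child is missing
    return all(_check(c, selected, seen) for c in chosen)


def is_valid_configuration(root: Feature, selected: Set[str]) -> bool:
    """True if the selection satisfies the tree constraints and contains
    no feature that is unknown or sits under an unselected parent."""
    if root.name not in selected:
        return False
    seen: Set[str] = set()
    return _check(root, selected, seen) and seen == set(selected)


# Hypothetical variability in a modeling phase (assumed for illustration).
MODEL = Feature("Modeling", children=[
    Feature("TechniqueSelection", group="xor", children=[
        Feature("Classification"),
        Feature("Regression"),
        Feature("Clustering"),
    ]),
    Feature("TestDesign", group="xor", children=[
        Feature("HoldOut"),
        Feature("CrossValidation"),
    ]),
    Feature("ParameterTuning", mandatory=False),
])
```

Under these assumptions, a selection such as `{"Modeling", "TechniqueSelection", "Classification", "TestDesign", "CrossValidation"}` is accepted, while a selection that picks two techniques or omits `TestDesign` is rejected. A framework design that captures variability could use a check of this kind to decide which components to instantiate automatically.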

