Software Module Clustering: An In-Depth Literature Analysis

12/02/2020
by Qusay I. Sarhan, et al.

Software module clustering is an unsupervised learning method that groups software entities (e.g., classes, modules, or files) with similar features. The resulting clusters can be used to study, analyze, and understand the structure and behavior of those entities. Achieving optimal clustering results is challenging, and researchers have addressed many aspects of software module clustering over the past decade. It is therefore essential to consolidate the research evidence published in this area. In this study, 143 research papers on software module clustering, drawn from well-known literature databases, were reviewed to extract useful data. The extracted data were then used to answer several research questions regarding state-of-the-art clustering approaches, applications of clustering in software engineering, clustering processes, clustering algorithms, and evaluation methods. Several research gaps and challenges in software module clustering are also discussed to provide a useful reference for researchers in this field.
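
To make the idea of grouping software entities by similar features concrete, the following is a minimal sketch rather than a method taken from the reviewed papers. It assumes each module is described by the set of names it depends on, measures similarity with the Jaccard index, and merges clusters greedily (single-linkage agglomerative clustering) until no pair is similar enough. The module names, the `DEPENDENCIES` data, the helper functions, and the `threshold` value are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical modules mapped to the (made-up) names they depend on.
DEPENDENCIES = {
    "ui.window": {"gfx.draw", "gfx.font", "events"},
    "ui.button": {"gfx.draw", "gfx.font", "events"},
    "db.query": {"sql.parser", "net.socket"},
    "db.cache": {"sql.parser", "net.socket", "hashing"},
}


def jaccard(a, b):
    """Jaccard similarity of two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_similarity(c1, c2, features):
    """Single linkage: similarity of the most similar pair across clusters."""
    return max(jaccard(features[m1], features[m2]) for m1 in c1 for m2 in c2)


def cluster_modules(features, threshold=0.5):
    """Greedily merge the two most similar clusters until none reach threshold."""
    clusters = [frozenset([name]) for name in sorted(features)]
    while len(clusters) > 1:
        best_pair = max(
            combinations(clusters, 2),
            key=lambda pair: cluster_similarity(pair[0], pair[1], features),
        )
        if cluster_similarity(best_pair[0], best_pair[1], features) < threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters = [c for c in clusters if c not in best_pair]
        clusters.append(best_pair[0] | best_pair[1])
    # Return plain sorted lists so the result is deterministic and easy to read.
    return sorted(sorted(c) for c in clusters)


clusters = cluster_modules(DEPENDENCIES)
print(clusters)
```

With the assumed data, the two `ui.*` modules share all their dependencies and the two `db.*` modules share most of theirs, so the sketch groups them into two clusters. In practice the choice of features, similarity measure, and stopping rule all vary, which is the kind of design space the abstract says the review covers.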


