Crowdsmelling: The use of collective knowledge in code smells detection

12/23/2020
by José Pereira dos Reis, et al.

Code smells are seen as a major source of technical debt and, as such, should be detected and removed. However, researchers argue that the subjectiveness of the code smell detection process is a major hindrance to mitigating the problem of smell-infected code. We proposed the crowdsmelling approach, based on supervised machine learning techniques, in which the wisdom of the crowd (of software developers) is used to collectively calibrate code smell detection algorithms, thereby lessening the subjectivity issue. This paper presents the results of a validation experiment for the crowdsmelling approach. Over three consecutive years of a Software Engineering course, a "crowd" of around a hundred teams in total, with an average of three members each, classified the presence of three code smells (Long Method, God Class, and Feature Envy) in Java source code. These classifications were the basis of the oracles used for training six machine learning algorithms. Over one hundred models were generated and evaluated to determine which machine learning algorithms performed best in detecting each of these code smells. Good performance was obtained for God Class detection (ROC=0.896 for Naive Bayes) and Long Method detection (ROC=0.870 for AdaBoostM1), but performance was much lower for Feature Envy (ROC=0.570 for Random Forest). The results suggest that crowdsmelling is a feasible approach for detecting code smells, but further validation experiments are required to cover more code smells and to increase external validity.
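To make the pipeline in the abstract concrete, here is a minimal Python sketch of two of its steps: collapsing crowd classifications into an oracle, and scoring a detector against that oracle with the area under the ROC curve. The abstract does not say how team classifications were combined, so majority voting is an assumption of this sketch, and all method names, votes, and detector scores below are hypothetical illustration data, not results from the paper.

```python
from collections import Counter


def majority_oracle(votes_by_item):
    """Collapse per-item crowd votes (True = smell present) into one label.

    Assumption of this sketch: simple majority voting, with ties
    resolved as "no smell".
    """
    oracle = {}
    for item, votes in votes_by_item.items():
        counts = Counter(votes)
        oracle[item] = counts[True] > counts[False]
    return oracle


def roc_auc(labels, scores):
    """Area under the ROC curve, computed pairwise.

    Returns the fraction of (positive, negative) pairs in which the
    positive item received the higher score; ties count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical crowd votes on whether each method is a Long Method.
crowd_votes = {
    "Order.process": [True, True, False],
    "Parser.parse": [False, False, True],
    "Report.render": [True, True, True],
    "User.init": [False, False, False],
}

# Hypothetical scores from some trained detector (higher = more smelly).
detector_scores = {
    "Order.process": 0.8,
    "Parser.parse": 0.6,
    "Report.render": 0.4,
    "User.init": 0.1,
}

oracle = majority_oracle(crowd_votes)
items = list(crowd_votes)
auc = roc_auc([oracle[i] for i in items], [detector_scores[i] for i in items])
print(f"ROC AUC against the crowd oracle: {auc:.3f}")
```

In this toy data the detector ranks three of the four positive/negative pairs correctly. A real replication would train the classifiers on code metrics and use an established implementation of the ROC computation rather than this hand-rolled one.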


research
02/08/2019

Code Smell Detection using Multilabel Classification Approach

Code smells are characteristics of the software that indicates a code or...
research
10/18/2021

A Survey on Machine Learning Techniques for Source Code Analysis

Context: The advancements in machine learning techniques have encouraged...
research
05/03/2020

A Machine Learning Based Framework for Code Clone Validation

A code clone is a pair of code fragments, within or between software sys...
research
12/03/2020

Feature-Based Software Design Pattern Detection

Software design patterns are standard solutions to common problems in so...
research
12/16/2020

Code smells detection and visualization: A systematic literature review

Context: Code smells (CS) tend to compromise software quality and also d...
research
03/23/2021

PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Code

The application of machine learning algorithms to source code has grown ...
research
12/24/2018

Feature Maps: A Comprehensible Software Representation for Design Pattern Detection

Design patterns are elegant and well-tested solutions to recurrent softw...
