Towards Automatic Identification of Violation Symptoms of Architecture Erosion

06/14/2023
by Ruiyin Li, et al.

Architecture erosion has a detrimental effect on maintenance and evolution, as the implementation drifts away from the intended architecture. To prevent this, development teams need to recognize the symptoms of erosion early enough, particularly violations of the intended architecture. One way to achieve this is through the automatic identification of architecture violations from textual artifacts, particularly code reviews. In this paper, we developed 15 machine learning-based and 4 deep learning-based classifiers with three pre-trained word embeddings to identify violation symptoms of architecture erosion from developer discussions in code reviews. Specifically, we looked at code review comments from four large open-source projects from the OpenStack (Nova and Neutron) and Qt (Qt Base and Qt Creator) communities. We then conducted a survey to acquire feedback from the involved participants who discussed architecture violations in code reviews, in order to validate the usefulness of our trained classifiers. The results show that the SVM classifier based on the word2vec pre-trained word embedding performs best, with an F1-score of 0.779. In most cases, classifiers with the fastText pre-trained word embedding model achieve relatively good performance. Furthermore, classifiers that use 200-dimensional pre-trained word embedding models outperform those that use 100- and 300-dimensional models. In addition, an ensemble classifier based on the majority voting strategy outperforms the individual classifiers. Finally, an online survey of the involved developers reveals that the violation symptoms identified by our approaches have practical value and can provide early warnings of impending architecture erosion.
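To make the majority voting strategy and the F1-score mentioned above concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary labels (1 = comment contains a violation symptom, 0 = it does not). The function names, the tie-breaking rule, and the toy predictions are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine the label lists of several classifiers into one label per comment.

    predictions[i][j] is the label classifier i assigned to comment j.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_comments = len(predictions[0])
    if any(len(p) != n_comments for p in predictions):
        raise ValueError("all classifiers must label the same comments")

    combined = []
    for votes in zip(*predictions):  # votes for one comment across classifiers
        counts = Counter(votes)
        top = max(counts.values())
        # Assumption: ties are broken in favour of the smallest label.
        combined.append(min(label for label, c in counts.items() if c == top))
    return combined


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy data (invented): three classifiers labelling four review comments.
clf_a = [1, 0, 1, 0]
clf_b = [1, 1, 0, 0]
clf_c = [0, 1, 1, 0]
ensemble = majority_vote([clf_a, clf_b, clf_c])
score = f1_score([1, 1, 0, 0], ensemble)
```

With an odd number of binary classifiers no ties occur, so the tie-breaking rule only matters for even-sized ensembles or multi-class labels.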
