Feature Maps: A Comprehensible Software Representation for Design Pattern Detection

12/24/2018
by Hannes Thaller, et al.

Design patterns are elegant and well-tested solutions to recurrent software development problems. They are the result of software developers dealing with problems that frequently occur, solving them in the same or a slightly adapted way. A pattern's semantics provide the intent, motivation, and applicability, describing what it does, why it is needed, and where it is useful. Consequently, design patterns encode a wealth of information. Developers weave this information into their systems whenever they use design patterns to solve problems. This work presents Feature Maps, a flexible human- and machine-comprehensible software representation based on micro-structures. Our algorithm, the Feature-Role Normalization, compresses the high-dimensional, inhomogeneous vector space of micro-structures into a feature map. We apply these concepts to the problem of detecting instances of design patterns in source code. We evaluate our methodology on four design patterns and a wide range of balanced and imbalanced labeled training data, and we compare classical machine learning (Random Forests) with modern deep learning approaches (Convolutional Neural Networks). Feature maps yield robust classifiers even under challenging settings of strongly imbalanced data distributions, without sacrificing human comprehensibility. The results suggest that feature maps are an excellent addition to the software analysis toolbox that can reveal useful information hidden in the source code.
