Exploring Software Reusability Metrics with Q&A Forum Data

05/18/2020
by Matthew T. Patrick, et al.

Question and answer (Q&A) forums contain valuable information regarding software reuse, but they can be challenging to analyse because their content is unstructured free text. Here we introduce a new approach (LANLAN), using word embeddings and machine learning, to harness information available in StackOverflow. Specifically, we consider two different kinds of user communication describing difficulties encountered in software reuse: 'problem reports' point to potential defects, while 'support requests' ask for clarification on software usage. Word embeddings were trained on 1.6 billion tokens from StackOverflow and applied to identify which Q&A forum messages (from two large open source projects: Eclipse and Bioconductor) correspond to problem reports or support requests. LANLAN achieved an area under the receiver operating characteristic curve (AUROC) of over 0.9; it can be used to explore the relationship between software reusability metrics and difficulties encountered by users, as well as to predict the number of difficulties users will face in the future. Q&A forum data can help improve understanding of software reuse and may be harnessed as an additional resource to evaluate software reusability metrics.
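
As a reading aid for the AUROC figure quoted in the abstract: the AUROC of a binary classifier equals the probability that a randomly chosen positive example (say, a problem report) receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison. It is not the authors' implementation; the function name `auroc` and the example labels and scores are assumptions invented purely for illustration.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for the positive class (e.g. a problem report), 0 otherwise.
    scores: classifier confidence that each message is positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")

    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores, not data from the paper.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auroc = auroc(example_labels, example_scores)
```

With these made-up values, eight of the nine positive/negative pairs are ordered correctly, so `example_auroc` is 8/9. The pairwise form is quadratic in the number of messages; a rank-based computation would suit larger data sets better.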

