Looking for related discussions on GitHub Discussions

06/23/2022
by Márcia Lima, et al.

Software teams are increasingly adopting tools and communication channels to support collaborative software development and coordinate tasks. Among these resources, Programming Community-based Question Answering (PCQA) forums have become widely used by developers, who rely on them to get and share technical information. To support the development and management of Open Source Software (OSS) projects, GitHub announced GitHub Discussions, a native forum that facilitates collaborative discussions between users and members of communities hosted on the platform. Because GitHub Discussions resembles PCQA forums, it faces similar challenges, including the occurrence of related discussions (duplicate or near-duplicate posts). Duplicate posts have the same content and may be exact copies, whereas near-duplicates share similar topics and information. Both can introduce noise to the platform and compromise project knowledge sharing. In this paper, we address the problem of detecting related posts in GitHub Discussions. To do so, we propose RD-Detector, an approach based on a pre-trained Sentence-BERT model. We evaluated RD-Detector using data from different OSS communities. OSS maintainers and Software Engineering (SE) researchers manually evaluated the RD-Detector results, which achieved 75 [...] out practical applications of the approach, such as merging the discussions' threads and turning discussions into comments on one another. OSS maintainers can use RD-Detector to reduce the labor-intensive work of manually detecting related discussions and answering the same question multiple times.
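The abstract does not spell out how RD-Detector turns Sentence-BERT output into related-discussion candidates, so the following is only a minimal sketch of one common way to use sentence embeddings for this kind of task: score every pair of posts by cosine similarity and keep the pairs above a cutoff. The names `cosine` and `related_pairs`, the threshold of 0.8, and the toy three-dimensional vectors are all assumptions made for illustration; in practice the vectors would come from a sentence-embedding model, and the cutoff would need tuning per community.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def related_pairs(embeddings, threshold=0.8):
    """Return (id_a, id_b, score) for each pair scoring at or above threshold.

    `embeddings` maps a discussion id to its embedding vector.
    The threshold value is a hypothetical default, not taken from the paper.
    """
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    # Highest-scoring candidates first, so reviewers see the likeliest matches.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy vectors standing in for real sentence embeddings (assumption).
toy = {
    "discussion-1": [1.0, 0.0, 0.0],
    "discussion-2": [0.9, 0.1, 0.0],
    "discussion-3": [0.0, 0.0, 1.0],
}
candidates = related_pairs(toy)
```

With the toy data, only the first two discussions point in nearly the same direction, so they form the single candidate pair. Comparing all pairs is quadratic in the number of posts, which is acceptable for a sketch but would likely need an approximate nearest-neighbor index on a large repository.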
