Automated Summarization of Stack Overflow Posts

05/26/2023
by Bonan Kou, et al.

Software developers often resort to Stack Overflow (SO) to fill their programming needs. Given the abundance of relevant posts, navigating them and comparing different solutions is tedious and time-consuming. Recent work has proposed automatically summarizing SO posts into concise text to make them easier to navigate. However, these techniques rely only on information retrieval methods or heuristics for text summarization, which are insufficient to handle the ambiguity and sophistication of natural language. This paper presents a deep-learning-based framework called ASSORT for SO post summarization. ASSORT includes two complementary learning methods, ASSORT_S and ASSORT_IS, to address the lack of labeled training data for SO post summarization. ASSORT_S is designed to directly train a novel ensemble learning model with BERT embeddings and domain-specific features to account for the unique characteristics of SO posts. By contrast, ASSORT_IS is designed to reuse pre-trained models while addressing the domain shift challenge when no training data is present (i.e., zero-shot learning). Both ASSORT_S and ASSORT_IS outperform six existing techniques by at least 13 respectively in terms of the F1 score. Furthermore, a human study shows that participants significantly preferred summaries generated by ASSORT_S and ASSORT_IS over the best baseline, while the preference difference between ASSORT_S and ASSORT_IS was small.
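The abstract reports results as F1 scores but does not describe the evaluation protocol. As a rough illustration only, the sketch below assumes an extractive setting in which a summarizer selects sentence indices from a post and is scored against a human-labeled set of summary sentences. The function name `sentence_f1` and the example indices are hypothetical and are not taken from the paper.

```python
def sentence_f1(predicted, gold):
    """Return (precision, recall, F1) over sets of selected sentence indices.

    Assumption: a summary is represented as the set of sentence indices
    chosen from a post. This is an illustrative setup, not the paper's
    documented evaluation procedure.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # No overlap (or an empty set): all three metrics are zero.
        return 0.0, 0.0, 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical post: the model picks sentences 0, 2, 5; annotators picked 0, 1, 2.
    precision, recall, f1 = sentence_f1(predicted=[0, 2, 5], gold=[0, 1, 2])
    print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a summarizer cannot score well by selecting every sentence (high recall, low precision) or only one safe sentence (high precision, low recall).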

