I Introduction
Regular software releases promise improved product quality and higher customer satisfaction [Semiautomatic]. Release notes inform users about essential project changes, e.g., bug fixes, feature enhancements and source code changes, from the previous version to the latest version [arena1]. Practitioners (e.g., project managers, developers and clients) use release notes in various software development phases, e.g., requirements engineering, programming, debugging and testing [Semiautomatic, softwaredocumentation]. Several empirical studies have analyzed release note contents [rnempirical, releasenote, arena1]. For example, Moreno et al. [arena1] identify 17 types of changes that can be included in release notes, and Bi et al. [rnempirical] classify those changes into eight categories. Furthermore, Abebe et al. [releasenote] discover six additional types of information in release notes (described in Section II). Despite these studies, practitioners still face difficulties in producing and using release notes because of unclear contents and poor structural presentation. The information documented in release notes is currently scattered, poorly organized and vaguely described [rnempirical]. Consequently, in many cases release notes are of limited help to their users. Identifying relevant software artifacts and linking them with release notes can help to resolve this issue [rnempirical].
The release management process consists of several activities, from initiating a release to closing it [End-UserPerspectivesinRel]. Different target users are involved in the release process, and they look for different types of information in release notes depending on their project roles [Semiautomatic]. Bi et al. [rnempirical] classify users into two categories: (i) Release Note Producer and (ii) Release Note User. Moreover, the authors report significant discrepancies between how release note producers and users perceive release notes. For example, release note producers focus on high-level changes and design decisions of development. In contrast, release note users expect detailed descriptions of the features that are new compared to the previous release. Difficulty of use is therefore another problem with release notes. Because it is challenging to know what the target users require from release notes, conducting a survey study with practitioners can help understand the needs of specific audiences.
Our study has two motives: (1) to investigate the relevant software artifacts that can help to classify and structure the information in release notes; and (2) to understand the target users’ requirements to tailor the release note contents.
- Exploratory Study: we extract and analyze 3,347 release notes of 21 GitHub projects and then separate release notes’ contents (or sentences). We identify different artifacts related to the contents and classify the artifacts into three categories. The study design and result analysis are described in Section III-B and Section LABEL:label:result1 respectively.
- Survey Study: we gather practitioners’ opinions on release notes in practice and receive responses from 32 participants. We classify these participants into two categories: internal and external team members. The study design and result analysis are described in Section LABEL:study2 and Section LABEL:label:result2 respectively.
The key contributions of our study are:
- We extract data from GitHub and develop a dataset of release note contents.
- We identify essential software artifacts from the dataset that help to produce well-structured release notes and to classify their contents.
- We analyze and summarize the participants’ responses, which can aid in automatically tailoring release notes for different stakeholders.
II Background
II-A Release Note Contents
Different communities produce release notes according to their own guidelines, and no common standard exists for writing them. Several empirical studies have investigated and categorized the information documented in release notes (see Table I). For example, Moreno et al. [arena1] manually inspected 990 release notes and classified their content into 17 categories, focusing on information at a finer level of granularity. Similarly, Bi et al. [rnempirical] classified the documented information of release notes into eight main categories, a comparatively higher-level classification.
On the other hand, Abebe et al. [releasenote] identify six different types of information that are included in release notes, i.e., titles, system overview, resource requirements, installation, addressed issues and caveats, whose contents are relatively high-level. The authors mainly focus on software end-users’ perceptions of release notes. Klepper et al. [Semiautomatic] identify some additional information in release notes, for example, technical information and testing instructions. Our study explores the different software artifacts relevant to the release note contents.
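Study | Contents Category | Percentage |
---|---|---|
Moreno et al. [arena1] | Fixed Bugs | 90% |
Moreno et al. [arena1] | New Features | 46% |
Moreno et al. [arena1] | New Code Components | 43% |
Moreno et al. [arena1] | Modified Code Components | 40% |
Moreno et al. [arena1] | Modified Features | 26% |
Moreno et al. [arena1] | Refactoring Operations | 21% |
Moreno et al. [arena1] | Changes to Documentation | 20% |
Moreno et al. [arena1] | Upgraded Library Dep. | 16% |
Moreno et al. [arena1] | Deprecated Code Components | 10% |
Moreno et al. [arena1] | Deleted Code Components | 9% |
Moreno et al. [arena1] | Changes to Config. Files | 8% |
Moreno et al. [arena1] | Changes to Code Components Visibility | 7% |
Moreno et al. [arena1] | Changes to Test Suites | 7% |
Moreno et al. [arena1] | Known Issues | 6% |
Moreno et al. [arena1] | Replaced Code Components | 5% |
Moreno et al. [arena1] | Architectural Changes | 3% |
Moreno et al. [arena1] | Changes to Licenses | 2% |
Bi et al. [rnempirical] | Issues Fixed | 79.3% |
Bi et al. [rnempirical] | New features | 55.1% |
Bi et al. [rnempirical] | System internal changes | 25.1% |
Bi et al. [rnempirical] | Non-functional requirements | 10.3% |
Bi et al. [rnempirical] | Documentation update | 9.5% |
Bi et al. [rnempirical] | Configuration | 2.8% |
Bi et al. [rnempirical] | Required further actions | 2.1% |
Bi et al. [rnempirical] | Refactoring and reuse | 1.9% |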
II-B Practitioners of Release Notes
Practitioners in different roles are involved in a software release process [Semiautomatic]. Bi et al. [rnempirical] classify the stakeholders into two groups: release note producers and users. Architects and team managers are mainly responsible for producing release notes. Developers write release notes to reflect internal code changes, whereas testers and operators use release notes to perform their tasks. How different stakeholders (e.g., developers and testers) use release notes in each release phase is described below:
- In the pre-alpha phase, project managers and clients use the release notes to discuss the project progress, new functionalities and significant issues.
- The release notes of an alpha version are used by both developers and testers. During this phase, developers first debug all the critical bugs found in the pre-alpha phase. Then, the testers use the release notes to assess the functionality of the application and compare the expected values with the actual output values.
- Clients and end-users use the release notes of the beta version to test the system in the real environment.
- Team managers release a stable version of the system and highlight activities in the release notes of the RC version.
- Before deploying the final version of the product, the team members write the release notes for the integrators (who use a library in their code) and end-users. Therefore, the release notes of the final version need to be concise and easily understandable for the target audience.
However, ineffective and unnecessary information documented in release notes makes them harder for the target audiences to use [rnempirical]. Therefore, our study investigates which information in release notes is valuable from the target users’ perspective.
III Study Design
This section describes our research questions and study processes. Figure 1 presents an overview of this study.
III-A Research Questions
We address the following research questions:
RQ1: What software artifacts are important for preparing release notes? To answer this research question, we conduct an exploratory study on 3,347 release notes from 21 GitHub project repositories to understand the release note contents. In the majority of cases, we identify three crucial software artifacts from GitHub, i.e., commits, issues and pull-requests, and one artifact extracted from external sources, i.e., common vulnerabilities and exposures (CVE) issues. These valuable sources can help to prepare quality release notes. Moreover, we find other key artifacts that can assist in maintaining well-structured release notes.
RQ2: What types of release note contents may vary depending on the software development role of practitioners? Everyone involved in a software development project can be a practitioner of release notes, and the contents of the release notes can be modified depending on the users’ particular information needs [Semiautomatic]. To understand the target users’ needs, we prepare an online survey and gather users’ opinions about release notes. This survey may help produce high-quality release notes and tailor them to users’ needs.
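As an illustration of the artifact types named under RQ1, the following minimal sketch labels a link by its URL shape. It is not the procedure used in this study: the pattern table `PATTERNS` and the function `artifact_type` are hypothetical names, and the sketch assumes that commits, issues and pull-requests appear as standard GitHub web URLs and that CVE issues can be recognized by a CVE identifier in the link.

```python
import re

# Hypothetical pattern table: one (label, regex) pair per artifact type.
# Assumes standard GitHub web URL shapes and CVE identifiers of the form CVE-YYYY-NNNN.
PATTERNS = [
    ("commit", re.compile(r"github\.com/[^/]+/[^/]+/commit/[0-9a-f]{7,40}")),
    ("issue", re.compile(r"github\.com/[^/]+/[^/]+/issues/\d+")),
    ("pull-request", re.compile(r"github\.com/[^/]+/[^/]+/pull/\d+")),
    ("cve", re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)),
]


def artifact_type(url: str) -> str:
    """Return the first artifact label whose pattern occurs in the URL, else 'other'."""
    for label, pattern in PATTERNS:
        if pattern.search(url):
            return label
    return "other"
```

Links that match none of the patterns fall into an `other` bucket, which would still need manual inspection.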

III-B Study 1: Exploratory Study
One key goal of our study is to identify the relevant artifacts of release notes (i.e., RQ1). Therefore, we target open-source projects and collect release notes from GitHub-hosted projects.
III-B1 Data Collection
First, we collect release notes of open-source projects from GitHub. In this step, we use GitHub’s search option and sort projects by number of stars for the top three languages by active repositories [githut], i.e., JavaScript, Java and Python. To eliminate trivial projects, we define the following criteria for selecting a project from GitHub:
- created: at least three years ago, and still active
- forks and stars: at least 1,000 forks and 8,000 stars
- contributors: at least 30 committers
- total commits: at least 2,000
- release notes: at least 30
- total resolved issues: at least 500
- total pull-requests: at least 500
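The criteria above can be sketched as a single predicate over per-repository metadata. This is a minimal illustration, not the selection script used in this study: the record type `RepoStats`, its field names and the function `meets_criteria` are hypothetical, and every threshold is assumed to be a lower bound.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RepoStats:
    """Hypothetical per-repository metadata; field names are illustrative only."""
    name: str
    created: date
    is_active: bool
    forks: int
    stars: int
    contributors: int
    commits: int
    releases: int
    resolved_issues: int
    pull_requests: int


def meets_criteria(repo: RepoStats, today: date) -> bool:
    """Return True if the repository passes every threshold (assumed to be minimums)."""
    age_years = (today - repo.created).days / 365.25
    return (
        age_years >= 3
        and repo.is_active
        and repo.forks >= 1_000
        and repo.stars >= 8_000
        and repo.contributors >= 30
        and repo.commits >= 2_000
        and repo.releases >= 30
        and repo.resolved_issues >= 500
        and repo.pull_requests >= 500
    )
```

How "active" is determined is left to the caller here, since the criterion does not define it further.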
Then, we select a total of 21 projects, i.e., 8 JavaScript, 7 Java and 6 Python projects, based on the number of stars. We exclude non-engineering projects, e.g., iluwatar/java-design-patterns and kdn251/interviews. Table II presents detailed information about the selected repositories. After project selection, we extract the release notes using a data extraction tool. The data is available on GitHub [dataavailable].
SL. | Repository | Domain | # Releases |
---|---|---|---|
1 | vuejs/vue | Web libraries and frameworks | 210 |
2 | facebook/react | Web libraries and frameworks | 96 |
3 | twbs/bootstrap | Web libraries and frameworks | 73 |
4 | axios/axios | Web libraries and frameworks | 32 |
5 | nodejs/node | System software | 252 |
6 | mrdoob/three.js | Web libraries and frameworks | 125 |
7 | mui-org/material-ui | Web libraries and frameworks | 322 |
8 | chartjs/Chart.js | Web libraries and frameworks | 81 |
9 | spring-projects/spring-boot | Software tools | 112 |
10 | elastic/elasticsearch | Web libraries and frameworks | 60 |
11 | ReactiveX/RxJava | System software | 225 |
12 | google/guava | Non-web libraries and frameworks | 34 |
13 | PhilJay/MPAndroidChart | Non-web libraries and frameworks | 44 |
14 | redisson/redisson | Non-web libraries and frameworks | 103 |
15 | jenkinsci/jenkins | System software | 128 |
16 | tensorflow/tensorflow | Software tools | 152 |
17 | tiangolo/fastapi | Non-web libraries and frameworks | 113 |
18 | getsentry/sentry | Non-web libraries and frameworks | 38 |
19 | pandas-dev/pandas | Non-web libraries and frameworks | 81 |
20 | apache/airflow | Software tools | 41 |
21 | home-assistant/core | Application software | 847 |
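The extraction tool itself is not described here. As one possible approach, the sketch below pages through GitHub's REST endpoint for listing releases (`GET /repos/{owner}/{repo}/releases`) and keeps the `tag_name`, `name` and `body` fields of each release. The helper names `releases_url`, `parse_releases` and `fetch_releases` are hypothetical, and the sketch assumes unauthenticated access, which is rate-limited.

```python
import json
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"


def releases_url(repo: str, page: int = 1, per_page: int = 100) -> str:
    """Build the list-releases URL for an 'owner/name' repository."""
    return f"{API_ROOT}/repos/{repo}/releases?per_page={per_page}&page={page}"


def parse_releases(payload: str) -> list[dict]:
    """Keep only the tag, title and note text of each release in a JSON response."""
    rows = []
    for release in json.loads(payload):
        rows.append({
            "tag": release.get("tag_name"),
            "title": release.get("name"),
            "body": release.get("body") or "",  # a missing note becomes an empty string
        })
    return rows


def fetch_releases(repo: str) -> list[dict]:
    """Page through all releases of one repository until an empty page is returned."""
    rows, page = [], 1
    while True:
        request = Request(releases_url(repo, page),
                          headers={"Accept": "application/vnd.github+json"})
        with urlopen(request) as response:
            batch = parse_releases(response.read().decode("utf-8"))
        if not batch:
            return rows
        rows.extend(batch)
        page += 1
```

For example, `fetch_releases("axios/axios")` would return one dictionary per release of that repository.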

III-B2 Data Analysis
First, we filter out the empty release notes from the extracted data. Second, we remove information that is not needed to analyze the release note contents, e.g., contributors’ names. Third, we split each release note into headings and sentences, i.e., contents. For example, Fig. 2 shows a release note in which Bug Fixes and Features are headings and the bulleted items are contents. Then, we extract the URLs from each sentence and store them in a separate column of the dataset for further analysis.
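The filtering, splitting and URL extraction steps above can be sketched as follows. This is an illustrative approximation, not the analysis script used in this study: it assumes release notes are written in Markdown, with `#`-style headings and `-`, `*` or `+` bullets, and the names `is_empty` and `split_release_note` are hypothetical. Removing contributors' names is not covered.

```python
import re

# Assumed Markdown conventions: '#'-style headings and '-', '*' or '+' bullets.
HEADING = re.compile(r"^\s{0,3}#{1,6}\s+(.*?)\s*#*\s*$")
BULLET = re.compile(r"^\s*[-*+]\s+(.*\S)\s*$")
URL = re.compile(r"https?://[^\s)\]>]+")


def is_empty(body: str) -> bool:
    """A release note counts as empty if it has no non-whitespace text."""
    return not body or not body.strip()


def split_release_note(body: str) -> list[dict]:
    """Return one row per bulleted item with its current heading and extracted URLs."""
    rows = []
    heading = None
    for line in body.splitlines():
        match = HEADING.match(line)
        if match:
            heading = match.group(1)
            continue
        match = BULLET.match(line)
        if match:
            text = match.group(1)
            rows.append({
                "heading": heading,
                "content": text,
                "urls": URL.findall(text),  # kept as a separate column
            })
    return rows
```

Each returned row corresponds to one content sentence, so the rows can be loaded directly into a tabular dataset with a dedicated URL column.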