X-COBOL: A Dataset of COBOL Repositories

06/08/2023
by Mir Sameed Ali, et al.

Despite being proposed as early as 1959, COBOL (Common Business-Oriented Language) remains integral to the majority of operations in several financial, banking, and governmental organizations. To support the inevitable modernization and maintenance of legacy systems written in COBOL, organizations, researchers, and developers need to understand the nature and source code of COBOL programs. However, to the best of our knowledge, no existing dataset provides data on COBOL software projects, which motivates the need for one. Thus, to aid empirical research on comprehending COBOL in open-source repositories, we constructed a dataset of 84 COBOL repositories mined from GitHub, containing rich metadata on the development cycle of the projects. We envision that researchers can use our dataset to study the evolution and code properties of COBOL projects and to develop tools that support their development. Our dataset also provides the 1255 COBOL files present in the mined repositories. The dataset and artifacts are available at https://doi.org/10.5281/zenodo.7968845.


