The SmartSHARK Repository Mining Data

02/23/2021
by Alexander Trautsch, et al.

The SmartSHARK repository mining data is a collection of rich and detailed information about the evolution of software projects. The data is unique in its diversity and contains detailed information about each change, issue tracking data, continuous integration data, as well as pull request and code review data. Moreover, the data contains not only raw data scraped from repositories, but also annotations in the form of labels determined through a combination of manual analysis and heuristics, as well as links between the different parts of the data set. The SmartSHARK data set provides a rich source of data that enables us to explore research questions that require data from different sources and/or longitudinal data.

research
01/06/2020

The SmartSHARK Ecosystem for Software Repository Mining

Software repository mining is the foundation for many empirical software...
research
08/11/2020

GraphRepo: Fast Exploration in Software Repository Mining

Mining and storage of data from software repositories is typically done ...
research
05/03/2022

Tooling for Time- and Space-efficient git Repository Mining

Software projects under version control grow with each commit, accumulat...
research
03/25/2019

git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories

Data from software repositories have become an important foundation for ...
research
08/08/2020

More Effective Software Repository Mining

Background: Data mining and analyzing of public Git software repositorie...
research
12/09/2015

ShapeNet: An Information-Rich 3D Model Repository

We present ShapeNet: a richly-annotated, large-scale repository of shape...
research
09/14/2013

Ultrametric Component Analysis with Application to Analysis of Text and of Emotion

We review the theory and practice of determining what parts of a data se...
