On the Impact of Programming Languages on Code Quality

01/29/2019
by Emery D. Berger, et al.

This paper is a reproduction of work by Ray et al., which claimed to have uncovered a statistically significant association between eleven programming languages and software defects in projects hosted on GitHub. First, we conduct an experimental repetition; the repetition is only partially successful, but it does validate one of the key claims of the original work about the association of ten programming languages with defects. Next, we conduct a complete, independent reanalysis of the data and statistical modeling steps of the original study. We uncover a number of flaws that undermine the conclusions of the original study: only four languages are found to have a statistically significant association with defects, and even for those the effect size is exceedingly small. We conclude with some additional sources of bias that should be investigated in follow-up work and a few best-practice recommendations for similar efforts.

research
11/27/2019

FSE/CACM Rebuttal^2: Correcting A Large-Scale Study of Programming Languages and Code Quality in GitHub

Ray, Devanbu and Filkov issued a rebuttal of our TOPLAS paper "On the Im...
research
11/18/2019

Rebuttal to Berger et al., TOPLAS 2019

Berger et al., published in TOPLAS 2019, is a critique of our 2014 FSE c...
research
01/29/2021

Applying Bayesian Analysis Guidelines to Empirical Software Engineering Data: The Case of Programming Languages and Code Quality

Statistical analysis is the tool of choice to turn data into information...
research
04/28/2020

Learned Garbage Collection

Several programming languages use garbage collectors (GCs) to automatica...
research
06/02/2020

Analyzing programming languages by community characteristics on Github and StackOverflow

The choice of programming language is a very important decision as it no...
research
01/18/2023

Towards Causal Analysis of Empirical Software Engineering Data: The Impact of Programming Languages on Coding Competitions

There is abundant observational data in the software engineering domain,...
research
10/17/2020

PPL Bench: Evaluation Framework For Probabilistic Programming Languages

We introduce PPL Bench, a new benchmark for evaluating Probabilistic Pro...
