MLinter: Learning Coding Practices from Examples-Dream or Reality?

01/24/2023
by Corentin Latappy, et al.

Coding practices are increasingly used by software companies. Their use promotes consistency, readability, and maintainability, which contribute to software quality. Coding practices were initially enforced by general-purpose linters, but companies now tend to design and adopt their own company-specific practices. However, these company-specific practices are often not automated, making it challenging to ensure they are shared and used by developers. Converting these practices into linter rules is a complex task that requires extensive static analysis and language engineering expertise. In this paper, we seek to answer the following question: can coding practices be learned automatically from examples manually tagged by developers? We conduct a feasibility study using CodeBERT, a state-of-the-art machine learning approach, to learn linter rules. Our results show that, although the resulting classifiers reach high precision and recall scores when evaluated on balanced synthetic datasets, their application on real-world, unbalanced codebases, while maintaining excellent recall, suffers from a severe drop in precision that hinders their usability.
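The gap between balanced and unbalanced evaluation described above follows from simple arithmetic on class prevalence. The sketch below is illustrative only: the recall, false-positive rate, and prevalence values are hypothetical and are not figures reported in the paper. It shows how a classifier with fixed recall and false-positive rate can look precise on a balanced dataset and still produce mostly false alarms when violations are rare.

```python
def precision(recall: float, false_positive_rate: float, prevalence: float) -> float:
    """Expected precision of a binary classifier at a given class prevalence.

    prevalence is the fraction of samples that truly violate the practice.
    """
    true_positives = recall * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical classifier characteristics (assumptions, not results from the paper).
RECALL = 0.95
FALSE_POSITIVE_RATE = 0.05

# Balanced synthetic dataset: half of the samples violate the practice.
balanced = precision(RECALL, FALSE_POSITIVE_RATE, prevalence=0.5)

# Real-world codebase: assume only 1 in 1000 code fragments violates the practice.
unbalanced = precision(RECALL, FALSE_POSITIVE_RATE, prevalence=0.001)

print(f"precision on balanced data:   {balanced:.3f}")
print(f"precision on unbalanced data: {unbalanced:.3f}")
```

Under these assumed values, recall is unchanged in both settings, but almost every warning raised on the unbalanced codebase is a false positive, which is the kind of usability problem the abstract points to.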


research
09/12/2021

Report From The Trenches: A Case Study In Modernizing Software Development Practices

One factor of success in software development companies is their ability...
research
06/12/2019

Better Code, Better Sharing: On the Need of Analyzing Jupyter Notebooks

By bringing together code, text, and examples, Jupyter notebooks have be...
research
10/08/2022

The importance of good coding practices for data scientists

Many data science students and practitioners are reluctant to adopt good...
research
06/12/2023

Are Software Updates Useless Against Advanced Persistent Threats?

A dilemma worth Shakespeare's Hamlet is increasingly haunting companies ...
research
09/06/2018

Improving Development Practices through Experimentation: an Industrial TDD Case

Test-Driven Development (TDD), an agile development approach that enforc...
research
12/20/2022

A Portal for High-Precision Atomic Data and Computation: Design and Best Practices

The Atom portal, udel.edu/atom, provides the scientific community with e...
research
12/18/2020

An Empirical Investigation of Command-Line Customization

The interactive command line, also known as the shell, is a prominent me...
