What are the Machine Learning best practices reported by practitioners on Stack Exchange?

by Anamaria Mojica-Hanke, et al.

Machine Learning (ML) is being used in multiple disciplines due to its powerful capability to infer relationships within data. In particular, Software Engineering (SE) is one of those disciplines in which ML has been used for multiple tasks, such as software categorization, bug prediction, and testing. Alongside these applications, some studies have been conducted to detect and understand possible pitfalls and issues when using ML. However, to the best of our knowledge, only a few studies have focused on presenting ML best practices or guidelines for the application of ML in different domains. In addition, the practices presented in previous literature (i) are domain-specific (e.g., concrete practices in biomechanics), (ii) are few in number, or (iii) lack rigorous validation and are presented in gray literature. In this paper, we present a study listing 127 ML best practices, obtained by systematically mining 242 posts from 14 different Stack Exchange (STE) websites and validated by four independent ML experts. The practices are grouped into categories related to the different stages of the implementation process of an ML-enabled system; for each practice, we include explanations and examples. For all the practices, the provided examples focus on SE tasks. We expect that this list could help practitioners, in particular newcomers to this new area at the intersection of software engineering and machine learning, to better understand the practices and to use ML in a more informed way.



Towards machine learning guided by best practices

Nowadays, machine learning (ML) is being used in software systems with m...

On Using Information Retrieval to Recommend Machine Learning Good Practices for Software Engineers

Machine learning (ML) is nowadays widely used for different purposes and...

Practices for Engineering Trustworthy Machine Learning Applications

Following the recent surge in adoption of machine learning (ML), the neg...

Using AntiPatterns to avoid MLOps Mistakes

We describe lessons learned from developing and deploying machine learni...

Continuous Integration of Machine Learning Models with ease.ml/ci: Towards a Rigorous Yet Practical Treatment

Continuous integration is an indispensable step of modern software engin...

Challenges in creative generative models for music: a divergence maximization perspective

The development of generative Machine Learning (ML) models in creative p...

Quality issues in Machine Learning Software Systems

Context: An increasing demand is observed in various domains to employ M...
