On Testing Machine Learning Programs

by Houssem Ben Braiek, et al.
Corporation de l'ecole Polytechnique de Montreal

Machine learning (ML) models are now widely adopted in many safety-critical systems, thanks to recent breakthroughs in deep learning and reinforcement learning. Many people interact with ML-based systems every day, e.g., the voice recognition systems used by virtual personal assistants such as Amazon Alexa or Google Home. As the field of ML continues to grow, we are likely to witness transformative advances in a wide range of areas, from finance and energy to health and transportation. Given the growing importance of ML-based systems in our daily lives, ensuring their reliability is becoming critically important. Recently, software researchers have started adapting concepts from the software testing domain (e.g., code coverage, mutation testing, or property-based testing) to help ML engineers detect and correct faults in ML programs. This paper reviews existing testing practices for ML programs. First, we identify and explain the challenges that should be addressed when testing ML programs. Next, we report existing solutions found in the literature for testing ML programs. Finally, we identify gaps in the literature related to the testing of ML programs and make recommendations for future research directions for the scientific community. We hope that this comprehensive review of software testing practices will help ML engineers identify the right approach to improve the reliability of their ML-based systems. We also hope that the research community will act on our proposed research directions to advance the state of the art of testing for ML programs.


