Testing of Machine Learning Models with Limited Samples: An Industrial Vacuum Pumping Application

08/08/2022
by Ayan Chatterjee, et al.

There is often a scarcity of training data for machine learning (ML) classification and regression models in industrial production, especially for time-consuming or sparsely run manufacturing processes. Most of the limited ground-truth data is used for training, leaving only a handful of samples for testing. That number of test samples is too small to properly evaluate the robustness of the ML models under test. Furthermore, the output of these ML models may be inaccurate, or the models may fail outright, if the input data differ from what is expected. This is the case for ML models used in the Electroslag Remelting (ESR) process in the refined steel industry to predict the pressure in a vacuum chamber. A vacuum pumping event that occurs once per workday yields only a few hundred samples per year of pumping for training and testing. Given this shortage of training and test samples, this paper first presents a method to generate a fresh set of augmented samples based on vacuum pumping principles. Based on the generated augmented samples, three test scenarios and one test oracle are presented to assess the robustness of an ML model used for production on an industrial scale. Experiments are conducted with real industrial production data obtained from the steel company Uddeholms AB. The evaluations under the proposed testing strategy indicate that the Ensemble and Neural Network models are the most robust when trained on augmented data. The evaluation also demonstrates the proposed method's effectiveness in checking and improving the robustness of ML algorithms in such situations. The work advances the state of the art in software robustness testing for similar settings. Finally, the paper presents an MLOps implementation of the proposed approach for real-time ML model prediction and action on the edge node and automated continuous delivery of ML software from the cloud.
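
The abstract does not spell out the augmentation method, the three test scenarios, or the test oracle, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's implementation. It assumes the textbook ideal pump-down law p(t) = p0 * exp(-S * t / V) as the "vacuum pumping principle", a relative-tolerance check as the oracle, and hypothetical names and parameter values throughout (`pump_down_pressure`, `augment_samples`, `oracle`, `pass_rate`, and the numeric defaults).

```python
import math
import random


def pump_down_pressure(t, p0, speed, volume):
    """Ideal pump-down law (assumption): p(t) = p0 * exp(-speed * t / volume)."""
    return p0 * math.exp(-speed * t / volume)


def augment_samples(n, p0=1000.0, volume=10.0,
                    speed_range=(0.5, 1.5), t_max=60.0, seed=0):
    """Generate n synthetic ((time, pumping_speed), pressure) samples.

    All default values are illustrative placeholders, not plant data.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        speed = rng.uniform(*speed_range)
        t = rng.uniform(0.0, t_max)
        samples.append(((t, speed), pump_down_pressure(t, p0, speed, volume)))
    return samples


def oracle(predicted, expected, rel_tol=0.1):
    """Hypothetical test oracle: pass if the relative error is within rel_tol."""
    return abs(predicted - expected) <= rel_tol * abs(expected)


def pass_rate(model, samples, rel_tol=0.1):
    """Fraction of augmented samples on which the model satisfies the oracle."""
    passed = sum(oracle(model(x), y, rel_tol) for x, y in samples)
    return passed / len(samples)


# Stand-in "models" for illustration only; a trained regressor would go here.
def exact_model(x):
    # Uses both inputs, so it reproduces the generating law exactly.
    return pump_down_pressure(x[0], 1000.0, x[1], 10.0)


def naive_model(x):
    # Ignores the pumping speed and always assumes a nominal value of 1.0.
    return pump_down_pressure(x[0], 1000.0, 1.0, 10.0)


samples = augment_samples(50)
exact_rate = pass_rate(exact_model, samples)
naive_rate = pass_rate(naive_model, samples)
```

In this sketch, a model that ignores the varying pumping speed passes the oracle on fewer augmented samples than one that accounts for it, which is the kind of robustness gap a handful of held-out real samples might not reveal.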


research
08/15/2022

A Drift Handling Approach for Self-Adaptive ML Software in Scalable Industrial Processes

Most industrial processes in real-world manufacturing applications are c...
research
11/23/2022

Quality Assurance in MLOps Setting: An Industrial Perspective

Today, machine learning (ML) is widely used in industry to provide the c...
research
01/15/2018

Improving Orbit Prediction Accuracy through Supervised Machine Learning

Due to the lack of information such as the space environment condition a...
research
07/13/2023

A Scenario-Based Functional Testing Approach to Improving DNN Performance

This paper proposes a scenario-based functional testing approach for enh...
research
05/17/2023

Neuro-Symbolic AI for Compliance Checking of Electrical Control Panels

Artificial Intelligence plays a main role in supporting and improving sm...
research
04/04/2022

Highly efficient reliability analysis of anisotropic heterogeneous slopes: Machine Learning aided Monte Carlo method

Machine Learning (ML) algorithms are increasingly used as surrogate mode...
research
09/20/2023

Machine Learning Data Suitability and Performance Testing Using Fault Injection Testing Framework

Creating resilient machine learning (ML) systems has become necessary to...
