PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models

by Yinjun Wu, et al.

The ubiquitous use of machine learning algorithms brings new challenges to traditional database problems such as incremental view update. Much effort is being put into better understanding and debugging machine learning models, as well as into identifying and repairing errors in training datasets. Our focus is on how to assist these activities when the model has to be retrained, either after problematic training samples are removed during data cleaning or after different subsets of the training data are selected for interpretability. This paper presents an efficient provenance-based approach, PrIU, and its optimized version, PrIU-opt, for incrementally updating model parameters without sacrificing prediction accuracy. We prove the correctness and convergence of the incrementally updated model parameters, and validate them experimentally. Experimental results show that PrIU-opt can achieve speed-ups of up to two orders of magnitude compared to simply retraining the model from scratch, while still producing highly similar models.
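To make the idea of updating a regression model after deleting training samples concrete, here is a minimal sketch. It is not the PrIU or PrIU-opt algorithm, whose details are not given in the abstract. It assumes a closed-form ridge regression and caches the sufficient statistics X^T X and X^T y, so that removing samples only requires subtracting their contributions rather than revisiting the full dataset. The function names `fit_with_cache` and `remove_samples`, the regularization value, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def fit_with_cache(X, y, reg=1e-3):
    """Fit ridge regression; return weights and cached sufficient statistics."""
    gram = X.T @ X      # d x d matrix, sum of x_i x_i^T over all samples
    moment = X.T @ y    # d vector, sum of y_i x_i over all samples
    w = np.linalg.solve(gram + reg * np.eye(X.shape[1]), moment)
    return w, gram, moment

def remove_samples(gram, moment, X_removed, y_removed, reg=1e-3):
    """Update the model after deleting samples, without touching the rest of the data."""
    gram_new = gram - X_removed.T @ X_removed
    moment_new = moment - X_removed.T @ y_removed
    w_new = np.linalg.solve(gram_new + reg * np.eye(gram.shape[0]), moment_new)
    return w_new, gram_new, moment_new

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=1000)

# Train once and keep the cached statistics.
w_full, gram, moment = fit_with_cache(X, y)

# Delete the first 10 samples incrementally.
removed = np.arange(10)
w_inc, _, _ = remove_samples(gram, moment, X[removed], y[removed])

# Reference: retrain from scratch on the remaining samples.
keep = np.setdiff1d(np.arange(X.shape[0]), removed)
w_retrain, _, _ = fit_with_cache(X[keep], y[keep])

print(np.allclose(w_inc, w_retrain))
```

In this simplified setting the incremental update costs time proportional to the number of removed samples plus one small linear solve, while retraining touches every remaining sample. Models trained iteratively, such as those fit by gradient descent, do not have such a closed form, which is the harder case the paper addresses.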




Learning a Behavior Model of Hybrid Systems Through Combining Model-Based Testing and Machine Learning (Full Version)

Models play an essential role in the design process of cyber-physical sy...

Subset Sampling For Progressive Neural Network Learning

Progressive Neural Network Learning is a class of algorithms that increm...

A fast online cascaded regression algorithm for face alignment

Traditional face alignment based on machine learning usually tracks the ...

SSSE: Efficiently Erasing Samples from Trained Machine Learning Models

The availability of large amounts of user-provided data has been key to ...

DART: Data Addition and Removal Trees

How can we update data for a machine learning model after it has already...

Incremental Calibration of Architectural Performance Models with Parametric Dependencies

Architecture-based Performance Prediction (AbPP) allows evaluation of th...

Processing Analytical Workloads Incrementally

Analysis of large data collections using popular machine learning and st...