Model Stability with Continuous Data Updates

01/14/2022
by Huiting Liu, et al.

In this paper, we study the "stability" of machine learning (ML) models within larger, complex NLP systems whose training data is continuously updated. We propose a methodology for assessing model stability (which we refer to as jitter) under various experimental conditions. Through experiments on four text classification tasks and two sequence labeling tasks, we find that model design choices, including network architecture and input representation, have a critical impact on stability. In classification tasks, non-RNN-based models are observed to be more stable than RNN-based ones, while in sequence labeling tasks the encoder-decoder model is less stable. Moreover, input representations based on pre-trained fastText embeddings contribute more to stability than other choices. We also show that two learning strategies, ensemble models and incremental training, have a significant influence on stability. We recommend that ML model designers account for trade-offs between accuracy and jitter when making modeling choices.
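The abstract does not give the formula used for jitter, so the following is only an illustrative sketch of how such a measurement could be set up. It assumes jitter is taken as the average pairwise disagreement between the predictions of several models, each retrained after a data update and scored on the same fixed test set. The function names `pairwise_disagreement` and `jitter` are hypothetical and do not come from the paper.

```python
from itertools import combinations
from typing import Hashable, Sequence


def pairwise_disagreement(preds_a: Sequence[Hashable],
                          preds_b: Sequence[Hashable]) -> float:
    """Fraction of test examples on which two models predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both models must be scored on the same test set")
    if len(preds_a) == 0:
        raise ValueError("the test set must not be empty")
    differing = sum(a != b for a, b in zip(preds_a, preds_b))
    return differing / len(preds_a)


def jitter(runs: Sequence[Sequence[Hashable]]) -> float:
    """Mean pairwise disagreement over all pairs of retrained models.

    Assumption: this averaging scheme is an illustration only; the paper's
    own jitter definition may differ.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("at least two runs are needed to measure jitter")
    return sum(pairwise_disagreement(a, b) for a, b in pairs) / len(pairs)


# Predictions of three hypothetical retrained classifiers on four test examples.
runs = [
    ["pos", "neg", "pos", "neg"],
    ["pos", "neg", "neg", "neg"],
    ["pos", "pos", "pos", "neg"],
]
score = jitter(runs)
print(f"jitter = {score:.3f}")
```

Under this assumed definition, a score of 0 means every retrained model makes identical predictions, and larger values mean more prediction churn between updates. Reporting it alongside accuracy is one way to make the accuracy and jitter trade-off visible when comparing architectures.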

research
05/09/2023

An Exploration of Encoder-Decoder Approaches to Multi-Label Classification for Legal and Biomedical Text

Standard methods for multi-label text classification largely rely on enc...
research
09/15/2022

A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems

The traditional Machine Learning (ML) methodology requires to fragment t...
research
02/13/2023

Bag of Tricks for In-Distribution Calibration of Pretrained Transformers

While pre-trained language models (PLMs) have become a de-facto standard...
research
05/03/2021

Learning by Design: Structuring and Documenting the Human Choices in Machine Learning Development

The influence of machine learning (ML) is quickly spreading, and a numbe...
research
01/17/2023

Which Model Shall I Choose? Cost/Quality Trade-offs for Text Classification Tasks

Industry practitioners always face the problem of choosing the appropria...
research
08/28/2023

Matbench Discovery – An evaluation framework for machine learning crystal stability prediction

Matbench Discovery simulates the deployment of machine learning (ML) ene...
research
09/16/2021

MOFSimplify: Machine Learning Models with Extracted Stability Data of Three Thousand Metal-Organic Frameworks

We report a workflow and the output of a natural language processing (NL...
