Model Stability with Continuous Data Updates

01/14/2022
by Huiting Liu, et al.

In this paper, we study the "stability" of machine learning (ML) models within larger, complex NLP systems whose training data is continuously updated. We propose a methodology for assessing model stability (which we refer to as jitter) under various experimental conditions. Through experiments on four text classification tasks and two sequence labeling tasks, we find that model design choices, including network architecture and input representation, have a critical impact on stability. In classification tasks, non-RNN-based models are observed to be more stable than RNN-based ones, while in sequence labeling tasks the encoder-decoder model is less stable. Moreover, input representations based on pre-trained fastText embeddings yield more stable models than the other choices. We also show that two learning strategies, ensemble models and incremental training, have a significant influence on stability. We recommend that ML model designers account for the trade-off between accuracy and jitter when making modeling choices.
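The abstract does not spell out how jitter is computed, so the following is only a minimal sketch under one common assumption: that jitter between two trained models is the fraction of test examples on which their predictions differ, averaged over all pairs of models trained on different data snapshots. The function names `pairwise_disagreement` and `mean_jitter` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def pairwise_disagreement(preds_a, preds_b):
    """Fraction of examples on which two models predict different labels.

    Assumption: this stands in for the paper's jitter between two models;
    the abstract itself does not give the exact definition.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        return 0.0
    differing = sum(a != b for a, b in zip(preds_a, preds_b))
    return differing / len(preds_a)


def mean_jitter(runs):
    """Average pairwise disagreement over every pair of runs.

    `runs` is a list of prediction lists, one per retrained model,
    all produced on the same fixed test set.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_disagreement(a, b) for a, b in pairs) / len(pairs)


# Illustrative (made-up) predictions from three retrained models.
runs = [
    ["pos", "neg", "pos", "neg"],
    ["pos", "neg", "neg", "neg"],
    ["pos", "pos", "neg", "neg"],
]
jitter = mean_jitter(runs)
```

A measure of this kind is independent of accuracy: two models can reach the same accuracy while disagreeing on many individual examples, which is why the abstract treats accuracy and jitter as separate axes of a trade-off.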


09/15/2022

A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems

The traditional Machine Learning (ML) methodology requires to fragment t...
10/21/2022

Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks

Pre-trained language models (LMs) obtain state-of-the-art performance wh...
05/03/2021

Learning by Design: Structuring and Documenting the Human Choices in Machine Learning Development

The influence of machine learning (ML) is quickly spreading, and a numbe...
01/17/2023

Which Model Shall I Choose? Cost/Quality Trade-offs for Text Classification Tasks

Industry practitioners always face the problem of choosing the appropria...
03/07/2018

Transfer Automatic Machine Learning

Building effective neural networks requires many design choices. These i...
09/16/2021

MOFSimplify: Machine Learning Models with Extracted Stability Data of Three Thousand Metal-Organic Frameworks

We report a workflow and the output of a natural language processing (NL...
06/01/2020

Concept Matching for Low-Resource Classification

We propose a model to tackle classification tasks in the presence of ver...