Model Provenance via Model DNA

08/04/2023
by Xin Mu, et al.

Understanding the life cycle of a machine learning (ML) model (where it comes from, how it is trained, and how it is used) is an intriguing area of research. This paper focuses on a novel problem within this field, Model Provenance (MP), which concerns the relationship between a target model and its pre-training model: the aim is to determine whether a given source model serves as the provenance of a target model. The problem has significant implications for the security and intellectual property of machine learning models, yet it has received little attention in the literature. To fill this gap, we introduce the concept of Model DNA, which represents the unique characteristics of a machine learning model. We use a data-driven and model-driven representation learning method to encode the model's training data and input-output information as a compact and comprehensive representation (i.e., DNA) of the model. Using this model DNA, we develop an efficient framework for model provenance identification, which determines whether a source model is a pre-training model of a target model. We evaluate the approach on both computer vision and natural language processing tasks, using various models, datasets, and scenarios, to demonstrate its effectiveness in accurately identifying model provenance.
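To make the idea of comparing models through a compact representation more concrete, the following is a minimal sketch and not the paper's method. It assumes that a crude stand-in for a model's DNA can be built from its outputs on a shared set of probe inputs; the paper instead learns the representation from training data and input-output information. The names `make_linear_model`, `model_dna`, and `cosine_similarity`, the toy linear models, and the noise scale used to mimic fine-tuning are all hypothetical choices made for illustration.

```python
import math
import random


def make_linear_model(weights):
    """Return a toy 'model': a linear map followed by a softmax."""
    def model(x):
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]
    return model


def model_dna(model, probes):
    """Hypothetical stand-in for a model DNA: outputs on shared probes, flattened."""
    return [p for x in probes for p in model(x)]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def random_weights(rng, n_classes, n_features):
    """Draw an independent random weight matrix."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for _ in range(n_classes)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_classes, n_features = 5, 8
probes = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(200)]

source_w = random_weights(rng, n_classes, n_features)
# Assumption: a small perturbation of the source weights mimics fine-tuning.
target_w = [[w + rng.gauss(0.0, 0.05) for w in row] for row in source_w]
# An unrelated model is drawn independently of the source.
unrelated_w = random_weights(rng, n_classes, n_features)

source_dna = model_dna(make_linear_model(source_w), probes)
target_dna = model_dna(make_linear_model(target_w), probes)
unrelated_dna = model_dna(make_linear_model(unrelated_w), probes)

sim_related = cosine_similarity(source_dna, target_dna)
sim_unrelated = cosine_similarity(source_dna, unrelated_dna)
print("source vs. derived target:", round(sim_related, 3))
print("source vs. unrelated model:", round(sim_unrelated, 3))
```

In this toy setting the derived target stays closer to the source than the independently drawn model does. A practical system would replace the raw output vector with a learned representation and the similarity comparison with a trained provenance classifier, as the abstract describes.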


