What does it take to catch a Chinchilla? Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring

03/20/2023
by Yonadav Shavit, et al.

As advanced machine learning systems' capabilities begin to play a significant role in geopolitics and societal order, it may become imperative that (1) governments be able to enforce rules on the development of advanced ML systems within their borders, and (2) countries be able to verify each other's compliance with potential future international agreements on advanced ML development. This work analyzes one mechanism to achieve this, by monitoring the computing hardware used for large-scale NN training. The framework's primary goal is to provide governments with high confidence that no actor uses large quantities of specialized ML chips to execute a training run in violation of agreed rules. At the same time, the system does not curtail the use of consumer computing devices, and maintains the privacy and confidentiality of ML practitioners' models, data, and hyperparameters. The system consists of interventions at three stages: (1) using on-chip firmware to occasionally save snapshots of the neural network weights stored in device memory, in a form that an inspector could later retrieve; (2) saving sufficient information about each training run to prove to inspectors the details of the training run that resulted in the snapshotted weights; and (3) monitoring the chip supply chain to ensure that no actor can avoid discovery by amassing a large quantity of un-tracked chips. The proposed design decomposes the ML training rule verification problem into a series of narrow technical challenges, including a new variant of the Proof-of-Learning problem [Jia et al. '21].
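To make stages (1) and (2) concrete, the toy sketch below models the general idea under simplifying assumptions: the "firmware" is stood in for by a hash-based commitment over serialized weights taken at random steps, and the operator keeps a transcript of run metadata alongside those commitments for a later inspector to check. Every name here (commit_to_weights, TrainingTranscript, the snapshot schedule) is hypothetical and not taken from the paper; the actual on-chip mechanism and proof protocol are far more involved.

```python
# Toy illustration only: hash commitments to weight snapshots plus an operator-side
# training transcript. This is a sketch of the concept, not the paper's design.
import hashlib
import json
import os
import random
from dataclasses import dataclass, field


def commit_to_weights(weights: bytes, nonce: bytes) -> str:
    """Hash-based commitment to a serialized weight buffer (stand-in for on-chip firmware)."""
    return hashlib.sha256(nonce + weights).hexdigest()


@dataclass
class TrainingTranscript:
    """Operator-kept record: enough metadata to later substantiate how snapshotted weights arose."""
    run_id: str
    hyperparameters: dict
    snapshots: list = field(default_factory=list)  # list of {"step": int, "commitment": str}

    def log_snapshot(self, step: int, weights: bytes, nonce: bytes) -> None:
        self.snapshots.append({"step": step, "commitment": commit_to_weights(weights, nonce)})


def training_loop(transcript: TrainingTranscript, total_steps: int, snapshot_prob: float = 0.05) -> bytes:
    """Placeholder training loop; random snapshotting mimics 'occasional' firmware snapshots."""
    weights = bytes(64)  # placeholder for serialized model weights
    for step in range(total_steps):
        # Placeholder "update": in a real run this would be a gradient step on real data.
        weights = hashlib.sha256(weights + step.to_bytes(4, "big")).digest()
        if random.random() < snapshot_prob:
            transcript.log_snapshot(step, weights, nonce=os.urandom(16))
    return weights


if __name__ == "__main__":
    t = TrainingTranscript(run_id="demo-run", hyperparameters={"lr": 3e-4, "batch_size": 1024})
    training_loop(t, total_steps=1000)
    print(json.dumps(t.snapshots[:3], indent=2))
```

In this sketch an inspector who later retrieves the snapshotted weights and nonces could recompute the commitments and compare them against the transcript; proving that the logged hyperparameters and data actually produced those weights is the harder Proof-of-Learning-style problem the paper highlights.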
