Dropbear: Machine Learning Marketplaces made Trustworthy with Byzantine Model Agreement

05/31/2022
by Alex Shamis, et al.

Marketplaces for machine learning (ML) models are emerging as a way for organizations to monetize models. They allow model owners to retain control over hosted models by using cloud resources to execute ML inference requests for a fee, preserving model confidentiality. Clients that rely on hosted models require trustworthy inference results, even when models are managed by third parties. While the resilience and robustness of inference results can be improved by combining multiple independent models, such support is unavailable in today's marketplaces. We describe Dropbear, the first ML model marketplace that provides clients with strong integrity guarantees by combining results from multiple models in a trustworthy fashion. Dropbear replicates inference computation across a model group, which consists of multiple cloud-based GPU nodes belonging to different model owners. Clients receive inference certificates that prove agreement using a Byzantine consensus protocol, even under model heterogeneity and concurrent model updates. To improve performance, Dropbear batches inference and consensus operations separately: it first performs the inference computation across a model group, before ordering requests and model updates. Despite its strong integrity guarantees, Dropbear's performance matches that of state-of-the-art ML inference systems: deployed across 3 cloud sites, it handles 800 requests/s with ImageNet models.
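The idea of agreeing on an inference result across a model group can be illustrated with a small sketch. This is not Dropbear's actual protocol, which the abstract does not specify in detail. It assumes a group of at least 3f + 1 replicas that each return a single class label, and it accepts a label only when at least 2f + 1 replicas report it, a standard Byzantine quorum size. The names `Certificate` and `agree` are hypothetical, and the "certificate" here is only a list of agreeing replica IDs, with no cryptographic signatures.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Certificate:
    """Hypothetical stand-in for an inference certificate (no signatures)."""
    request_id: str
    label: str
    signers: Tuple[str, ...]  # IDs of replicas that reported `label`


def agree(request_id: str, replies: Dict[str, str], f: int) -> Optional[Certificate]:
    """Return a certificate if some label is reported by at least 2f + 1 replicas.

    `replies` maps a replica ID to the label that replica produced.
    Assumption: the group has at least 3f + 1 replicas, of which at most
    f are faulty; with fewer replies no decision is made.
    """
    if f < 0 or len(replies) < 3 * f + 1:
        return None
    quorum = 2 * f + 1
    label, votes = Counter(replies.values()).most_common(1)[0]
    if votes < quorum:
        return None
    signers = tuple(sorted(r for r, l in replies.items() if l == label))
    return Certificate(request_id, label, signers)


if __name__ == "__main__":
    replies = {"a": "cat", "b": "cat", "c": "cat", "d": "dog"}
    print(agree("r1", replies, f=1))
```

With four replicas and f = 1, three matching labels form a quorum, while a two-to-two split yields no certificate. Exact label matching is a simplification: heterogeneous models may legitimately disagree, which is one of the cases the abstract says Dropbear handles.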


