DLHub: Model and Data Serving for Science

11/27/2018
by Ryan Chard, et al.

While the Machine Learning (ML) landscape is evolving rapidly, there has been a relative lag in the development of the "learning systems" needed to enable broad adoption. Furthermore, few such systems are designed to support the specialized requirements of scientific ML. Here we present the Data and Learning Hub for science (DLHub), a multi-tenant system that provides both model repository and serving capabilities with a focus on science applications. DLHub addresses two significant shortcomings in current systems. First, its self-service model repository allows users to share, publish, verify, reproduce, and reuse models, and addresses concerns related to model reproducibility by packaging and distributing models together with all of their constituent components. Second, it implements scalable, low-latency serving capabilities that can leverage parallel and distributed computing resources to democratize access to published models through a simple web interface. Unlike other model serving frameworks, DLHub can store and serve any Python 3-compatible model or processing function, as well as multi-function pipelines. We show that, relative to other model serving systems including TensorFlow Serving, SageMaker, and Clipper, DLHub provides greater capabilities, comparable performance when memoization and batching are not used, and significantly better performance when those two techniques can be employed. We also describe early uses of DLHub for scientific applications.
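
The abstract attributes much of the performance gain to memoization and batching. As a rough illustration of what those two techniques mean for a serving layer, the sketch below wraps an arbitrary Python batch function, caches results per input, and sends all uncached inputs to the model in a single call. It is not DLHub's API or implementation: the class `MemoizedBatchServer`, its `predict` method, the `model_calls` counter, and the toy `square_batch` model are all hypothetical names introduced here, and the sketch assumes inputs are hashable and the model is deterministic.

```python
from typing import Any, Callable, Dict, Hashable, List, Sequence


class MemoizedBatchServer:
    """Toy in-process stand-in for a serving endpoint (hypothetical, not DLHub)."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], List[Any]]) -> None:
        self._batch_fn = batch_fn              # model that maps a list of inputs to a list of outputs
        self._cache: Dict[Hashable, Any] = {}  # memoization table: input -> output
        self.model_calls = 0                   # how many times the model was actually invoked

    def predict(self, inputs: Sequence[Hashable]) -> List[Any]:
        # Memoization: keep only distinct inputs that have no cached result yet.
        pending: List[Hashable] = []
        seen = set()
        for x in inputs:
            if x not in self._cache and x not in seen:
                seen.add(x)
                pending.append(x)

        # Batching: one model invocation covers every uncached input.
        if pending:
            self.model_calls += 1
            for x, y in zip(pending, self._batch_fn(pending)):
                self._cache[x] = y

        # Answer in the caller's original order, duplicates included.
        return [self._cache[x] for x in inputs]


def square_batch(xs: List[int]) -> List[int]:
    """Stand-in 'model': squares each input in the batch."""
    return [x * x for x in xs]


server = MemoizedBatchServer(square_batch)
first = server.predict([1, 2, 2, 3])   # one batched model call for inputs 1, 2, 3
second = server.predict([2, 3])        # served entirely from the cache
```

In this sketch the second request never reaches the model, which is the effect memoization is meant to have on repeated queries; a real serving system would also need cache eviction and would only memoize functions known to be deterministic.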

research
07/11/2020

Optimizing Prediction Serving on Low-Latency Serverless Dataflow

Prediction serving systems are designed to provide large volumes of low-...
research
12/17/2017

TensorFlow-Serving: Flexible, High-Performance ML Serving

We describe TensorFlow-Serving, a system to serve machine learning model...
research
10/20/2018

MMLSpark: Unifying Machine Learning Ecosystems at Massive Scales

We introduce Microsoft Machine Learning for Apache Spark (MMLSpark), an ...
research
03/04/2021

Serverless Model Serving for Data Science

Machine learning (ML) is an important part of modern data science applic...
research
02/29/2020

FlexServe: Deployment of PyTorch Models as Flexible REST Endpoints

The integration of artificial intelligence capabilities into modern soft...
research
05/07/2020

funcX: A Federated Function Serving Fabric for Science

Exploding data volumes and velocities, new computational methods and pla...
research
10/14/2018

PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

Machine Learning models are often composed of pipelines of transformatio...
