ModelHub: Towards Unified Data and Lifecycle Management for Deep Learning

11/18/2016
by Hui Miao, et al.

Deep learning has improved state-of-the-art results in many important fields and has been the subject of much research in recent years, leading to the development of several systems for facilitating deep learning. Current systems, however, mainly focus on the model building and training phases, while the issues of data management, model sharing, and lifecycle management are largely ignored. The deep learning modeling lifecycle generates a rich set of data artifacts, such as learned parameters and training logs, and comprises several frequently conducted tasks, such as understanding model behavior and trying out new models. Dealing with such artifacts and tasks is cumbersome and largely left to the users. This paper describes our vision and implementation of a data and lifecycle management system for deep learning. First, we generalize model exploration and model enumeration queries from tasks commonly conducted by deep learning modelers, and propose a high-level domain-specific language (DSL), inspired by SQL, to raise the abstraction level and accelerate the modeling process. Second, to manage the data artifacts, especially the large number of checkpointed float parameters, we design a novel model versioning system (dlv) and a read-optimized parameter archival storage system (PAS) that minimizes storage footprint and accelerates query workloads without losing accuracy. PAS archives versioned models using deltas in a multi-resolution fashion by separately storing the less significant bits, and features a novel progressive query (inference) evaluation algorithm. Third, we show that archiving versioned models using deltas poses a new dataset versioning problem, and we develop efficient algorithms for solving it. We conduct extensive experiments over several real datasets from the computer vision domain to show the efficiency of the proposed techniques.
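The idea of storing the less significant bits of float parameters separately can be illustrated with a minimal sketch. This is not the PAS implementation: the 2-byte/2-byte split, the function names `split_float32` and `merge_float32`, and the sample weights are all assumptions made for illustration, and the delta encoding between model versions is left out entirely.

```python
import struct

def split_float32(values):
    """Split each value, encoded as float32, into 2 high-order and 2 low-order bytes.

    Hypothetical helper: the 16/16-bit split is an assumption for this sketch.
    """
    high, low = bytearray(), bytearray()
    for v in values:
        b = struct.pack(">f", v)  # big-endian: sign and exponent bits come first
        high += b[:2]             # sign, 8 exponent bits, top 7 mantissa bits
        low += b[2:]              # remaining 16 mantissa bits
    return bytes(high), bytes(low)

def merge_float32(high, low=None):
    """Rebuild floats from the segments; a missing low segment is zero-filled."""
    n = len(high) // 2
    if low is None:
        low = bytes(2 * n)        # coarse read: only the high-order bytes are known
    return [
        struct.unpack(">f", high[2 * i:2 * i + 2] + low[2 * i:2 * i + 2])[0]
        for i in range(n)
    ]

weights = [0.15625, -1.5, 3.1415927]      # made-up parameter values
high, low = split_float32(weights)

approx = merge_float32(high)              # low-resolution view from half the bytes
exact = merge_float32(high, low)          # full float32 precision restored
```

Reading only the `high` segment gives a coarse view of each parameter (here, the third weight reads as 3.140625), while reading both segments restores the float32 value exactly. A progressive evaluation scheme in this spirit would start from the coarse view and fetch the low-order segment only when the coarse result is not decisive.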
