Using Python for Model Inference in Deep Learning

by Zachary DeVito, et al.

Python has become the de facto language for training deep neural networks, coupling a large suite of scientific computing libraries with efficient libraries for tensor computation such as PyTorch or TensorFlow. However, when models are used for inference, they are typically extracted from Python as TensorFlow graphs or TorchScript programs in order to meet performance and packaging constraints. The extraction process can be time-consuming, impeding fast prototyping. We show how it is possible to meet these performance and packaging constraints while performing inference in Python. In particular, we present a way of using multiple Python interpreters within a single process to achieve scalable inference and describe a new container format for models that contains both native Python code and data. This approach simplifies the model deployment story by eliminating the model extraction step, and makes it easier to integrate existing performance-enhancing Python libraries. We evaluate our design on a suite of popular PyTorch models on GitHub, showing how they can be packaged in our inference format, and comparing their performance to TorchScript. For larger models, our packaged Python models perform the same as TorchScript, and for smaller models where there is some Python overhead, our multi-interpreter approach ensures inference is still scalable.
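
The idea of a container that holds both Python code and data can be illustrated with a minimal sketch using only the standard library. This is not the container format described in the paper, whose layout the abstract does not specify; it assumes a plain zip archive with one source file and one pickled state file. The names `TinyLinear`, `save_package`, `load_package`, `model_code.py` and `state.pkl` are all hypothetical.

```python
import io
import pickle
import types
import zipfile

# Hypothetical model source; a toy stand-in for a real model's Python module.
MODEL_SOURCE = '''
class TinyLinear:
    """Toy model: y = sum(w_i * x_i) + b."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def __call__(self, xs):
        return sum(w * x for w, x in zip(self.weights, xs)) + self.bias
'''


def save_package(buffer, source, state):
    """Write the model's source code and its data into one zip archive."""
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("model_code.py", source)            # the Python code
        archive.writestr("state.pkl", pickle.dumps(state))   # the data


def load_package(buffer):
    """Rebuild the model from the archive alone, without importing from disk."""
    with zipfile.ZipFile(buffer, "r") as archive:
        source = archive.read("model_code.py").decode("utf-8")
        state = pickle.loads(archive.read("state.pkl"))
    # Execute the packaged source inside a fresh, private module namespace.
    module = types.ModuleType("packaged_model")
    exec(source, module.__dict__)
    return module.TinyLinear(state["weights"], state["bias"])


# Round trip through an in-memory archive.
buffer = io.BytesIO()
save_package(buffer, MODEL_SOURCE, {"weights": [1.0, 2.0, 3.0], "bias": 0.5})
buffer.seek(0)
model = load_package(buffer)
result = model([1.0, 1.0, 1.0])
print(result)
```

Because the archive carries the code that defines the model, the loader needs no separate extraction step and no copy of the original source tree. As with any pickle-based or `exec`-based loader, such an archive should only be opened when it comes from a trusted source.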




