Scaling TensorFlow to 300 million predictions per second

09/20/2021
by Jan Hartman et al.

We present the process of transitioning machine learning models to the TensorFlow framework at large scale in an online advertising ecosystem. In this talk, we address the key challenges we faced and describe how we tackled them, notably implementing the models in TF and serving them efficiently at low latency using various optimization techniques.

