On the Throughput Optimization in Large-Scale Batch-Processing Systems

09/20/2020 ∙ by Sounak Kar, et al.

We analyze a data-processing system in which n clients produce jobs that are processed in batches by m parallel servers; the system throughput depends critically on the batch size and on a corresponding sub-additive speedup function. In practice, throughput optimization relies on numerical searches for the optimal batch size, a process that can take up to multiple days in existing commercial systems. In this paper, we model the system as a closed queueing network; a standard Markovian analysis yields the optimal throughput in ω(n^4) time. Our main contribution is a mean-field model of the system for the regime where the system size is large. We show that the mean-field model has a unique, globally attractive stationary point, which can be found in closed form and which characterizes the asymptotic throughput of the system as a function of the batch size. Using this expression, we find the asymptotically optimal throughput in O(1) time. Numerical experiments with settings from a large commercial system show that this asymptotic optimum is accurate in practical finite regimes.
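To make the idea of a numerical search over batch sizes concrete, here is a minimal Python sketch of an exhaustive search. It is not the paper's method or model: the helper `best_batch_size` and the curve `toy_throughput` are hypothetical names introduced only for illustration, and the curve is a made-up unimodal function standing in for a measured or simulated throughput.

```python
from typing import Callable, Tuple


def best_batch_size(
    throughput: Callable[[int], float], max_batch: int
) -> Tuple[int, float]:
    """Evaluate throughput(k) for k = 1..max_batch and return the best (k, value).

    Hypothetical helper: a plain exhaustive search, one evaluation per batch size.
    """
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    best_k = max(range(1, max_batch + 1), key=throughput)
    return best_k, throughput(best_k)


def toy_throughput(k: int) -> float:
    """Made-up unimodal curve, NOT the throughput expression from the paper.

    It rises for small k and falls for large k, peaking at k = 10.
    """
    return k / (1.0 + k * k / 100.0)


if __name__ == "__main__":
    k_star, value = best_batch_size(toy_throughput, max_batch=50)
    print(f"best batch size: {k_star}, toy throughput: {value:.3f}")
```

The search costs one throughput evaluation per candidate batch size. If each evaluation were an expensive measurement or simulation, that cost would dominate, which is the kind of overhead a closed-form expression for the throughput avoids.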
