
Two stage cluster for resource optimization with Apache Mesos

by Gourav Rattihalli, et al.

Because resource estimation for jobs is difficult, users often overestimate their requirements. Both commercial clouds and academic campus clusters suffer from low resource utilization and long wait times because the resource estimates that users provide for jobs are inaccurate. We present an approach to statistically estimate the actual resource requirement of a job on a little cluster before it runs on a big cluster. The initial estimation on the little cluster shows how many resources a job actually requires. This initial estimate allows us to allocate resources accurately for the pending jobs in the queue and thereby improve throughput and resource utilization. In our experiments, we determined resource utilization estimates with an average accuracy of 90% for memory and 94% [...] average of 22% [...] on Apache Aurora and Apache Mesos.
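
The abstract does not say which statistic the estimation step uses. As a minimal sketch only, assuming the profiling run on the little cluster yields periodic CPU and memory usage samples, a request for the big cluster could be sized as the sample mean plus `k` standard deviations. The names `estimate_requirement`, `ResourceEstimate`, and `k`, and all sample values, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Sequence


@dataclass
class ResourceEstimate:
    cpus: float
    mem_mb: float


def _upper_bound(samples: Sequence[float], k: float) -> float:
    """Return mean + k * sample standard deviation (assumed sizing rule)."""
    spread = stdev(samples) if len(samples) > 1 else 0.0
    return mean(samples) + k * spread


def estimate_requirement(
    cpu_samples: Sequence[float],
    mem_samples_mb: Sequence[float],
    k: float = 2.0,
) -> ResourceEstimate:
    """Estimate a job's request from usage observed during a short profiling run.

    Hypothetical rule: the paper's actual estimator may differ.
    """
    return ResourceEstimate(
        cpus=_upper_bound(cpu_samples, k),
        mem_mb=_upper_bound(mem_samples_mb, k),
    )


# Illustrative samples from an imagined profiling run on the little cluster.
observed_cpu = [1.0, 1.0, 1.0]            # cores in use at each sample
observed_mem_mb = [100.0, 200.0, 300.0]   # resident memory at each sample

estimate = estimate_requirement(observed_cpu, observed_mem_mb)

# A user-supplied request (also made up) to compare against the estimate.
user_request = ResourceEstimate(cpus=4.0, mem_mb=2048.0)
cpu_saved = user_request.cpus - estimate.cpus
mem_saved_mb = user_request.mem_mb - estimate.mem_mb
```

Submitting the job to the big cluster with `estimate` instead of `user_request` would leave `cpu_saved` cores and `mem_saved_mb` megabytes free for other queued jobs, which is the effect on utilization that the abstract describes. A larger `k` trades some of that saving for a lower risk of under-allocating.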



