An Experimental and Comparative Benchmark Study Examining Resource Utilization in Managed Hadoop Context

12/19/2021
by Uluer Emre Ozdil, et al.

Transitioning cloud-based Hadoop from IaaS to PaaS, which are commercially conceptualized as pay-as-you-go or pay-per-use, often reduces the associated system costs. However, managed Hadoop systems present black-box behavior to end-users, who cannot see the inner performance dynamics and therefore cannot be sure of the benefits of leveraging them. In this study, we aimed to understand the managed Hadoop context in terms of resource utilization. We used three experimental Hadoop-on-PaaS proposals as they come out-of-the-box and ran the Hadoop-specific workloads of the HiBench Benchmark Suite. During the benchmark executions, we collected system resource utilization data on the worker nodes. The results indicated that identical property specifications across cloud services guarantee neither similar performance outputs between services nor consistent results within a single service. We assume that the managed systems' architectures and pre-configurations play a significant role in performance.
