Runtime QoS service for application-driven adaptation in network computing
A distributed application executing on a Network of Workstations (NOW) must be aware of resource state so that it can adapt and keep satisfying its Quality of Service (QoS) demands throughout its lifespan. We implemented a QoS service that enables application-driven adaptation for performance and fault tolerance at runtime. The service is backed by lightweight middleware that monitors the state and load of all application entities (e.g., machines, tasks, and logical network links), and it exposes its functionality to application tasks through an anonymous, simple-to-use QoS API. We present a Manager-Worker application that uses our fault tolerance QoS API to adapt to Worker faults and thereby avoid application deadlock at runtime. We also show how a dynamic application-level scheduler can use the QoS API to find efficient schedules. Finally, we quantified the overhead of the QoS middleware in various scenarios to demonstrate that it has only a minor impact on the performance of the application it is servicing.
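To make the Manager-Worker adaptation concrete, the following is a minimal single-process sketch, not the service's actual API. Every name in it (QosMonitor, is_alive, least_loaded, run_manager, crash_after) is hypothetical and introduced only for illustration, and Worker faults are simulated rather than detected over a network. It shows the idea described above: the Manager consults the QoS layer for Worker state and load, and requeues a task whose Worker has failed instead of blocking on a result that will never arrive.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QosMonitor:
    """Hypothetical stand-in for the QoS service (assumed interface)."""
    load: dict = field(default_factory=dict)    # worker name -> load in [0, 1]
    failed: set = field(default_factory=set)    # workers reported as faulty

    def is_alive(self, worker):
        return worker not in self.failed

    def least_loaded(self):
        # Pick the live worker with the smallest reported load, or None.
        alive = [w for w in self.load if self.is_alive(w)]
        return min(alive, key=lambda w: self.load[w]) if alive else None


def run_manager(tasks, qos, crash_after=None):
    """Assign tasks to the least-loaded live worker, requeueing on faults.

    crash_after maps a worker to the number of tasks it completes before
    it is simulated to crash (an assumption made only for this sketch).
    """
    crash_after = crash_after or {}
    completed = {w: 0 for w in qos.load}
    pending = deque(tasks)
    results = {}
    while pending:
        task = pending.popleft()
        worker = qos.least_loaded()
        if worker is None:
            raise RuntimeError("no live workers left")
        if worker in crash_after and completed[worker] >= crash_after[worker]:
            # Simulated fault: mark the worker failed and requeue the task
            # so the Manager never waits on a dead Worker (deadlock avoidance).
            qos.failed.add(worker)
            pending.appendleft(task)
            continue
        results[task] = worker
        completed[worker] += 1
    return results
```

In this sketch the load values are static; a real monitor would refresh them at runtime, which is what would let a dynamic scheduler react to changing load as well as to faults.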