Sequential Learning-based IaaS Composition
We propose a novel IaaS composition framework that selects an optimal set of consumer requests according to the provider's qualitative preferences on long-term service provisions. Decision variables are included in the temporal conditional preference networks (TempCP-net) to represent qualitative preferences for both short-term and long-term consumers. The global preference ranking of a set of requests is computed using a k-d tree indexing based temporal similarity measure approach. We propose an extended three-dimensional Q-learning approach to maximize the global preference ranking. We design the on-policy based sequential selection learning approach that applies the length of request to accept or reject requests in a composition. The proposed on-policy based learning method reuses historical experiences or policies of sequential optimization using an agglomerative clustering approach. Experimental results prove the feasibility of the proposed framework.
READ FULL TEXT