Accelerated Reinforcement Learning Algorithms with Nonparametric Function Approximation for Opportunistic Spectrum Access

06/14/2017
by Theodoros Tsiligkaridis, et al.
We study throughput maximization by predicting spectrum opportunities with reinforcement learning. Our kernel-based reinforcement learning approach is coupled with a sparsification technique that captures the environment states efficiently, keeps dimensionality under control, and selects the best channel access action for the current state. This approach allows learning and planning over the intrinsic state-action space and extends well to large state and action spaces. For stationary Markov environments, we derive the optimal channel access policy and its limiting throughput, and propose a fast online algorithm that achieves the optimal throughput. We then show that the maximum-likelihood channel prediction and access algorithm is suboptimal in general, and derive conditions under which the two algorithms are equivalent. For reactive Markov environments, we derive kernel variants of Q-learning and R-learning, and propose an accelerated R-learning algorithm that converges faster. Finally, we test our algorithms against a generic reactive network. Simulation results validate the theory and show performance gains over current state-of-the-art techniques.
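
For readers unfamiliar with R-learning, the following is a minimal sketch of one update step in its standard tabular, average-reward form. It is not the kernel variant or the accelerated algorithm derived in the paper, which this abstract does not specify. The function name `r_learning_step`, the step sizes `alpha` and `beta`, and the two-state, two-action table are all hypothetical choices made for illustration.

```python
def r_learning_step(Q, rho, s, a, r, s_next, alpha=0.5, beta=0.1):
    """Apply one tabular R-learning update (illustrative, not the paper's method).

    Q      : list of lists, Q[state][action]; updated in place
    rho    : current estimate of the average reward per step
    s, a   : state and action just taken (e.g. which channel was accessed)
    r      : observed reward (e.g. 1.0 for a successful transmission)
    s_next : state observed after the action
    Returns the updated average-reward estimate.
    """
    # Average-reward temporal-difference error: reward relative to rho.
    delta = r - rho + max(Q[s_next]) - Q[s][a]
    Q[s][a] += alpha * delta

    # Update rho only when the action taken is greedy after the update,
    # so exploratory actions do not bias the average-reward estimate.
    if Q[s][a] == max(Q[s]):
        rho += beta * (r - rho + max(Q[s_next]) - max(Q[s]))
    return rho


# Hypothetical example: 2 states, 2 actions, everything initialised to zero.
Q = [[0.0, 0.0], [0.0, 0.0]]
rho = r_learning_step(Q, 0.0, s=0, a=1, r=1.0, s_next=1)
```

A kernel-based variant would replace the table `Q` with a function approximator over a sparsified dictionary of visited states, which is the role the abstract assigns to the sparsification technique.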
