
Discretized Approximations for POMDP with Average Cost
In this paper, we propose a new lower approximation scheme for POMDP with discounted and average cost criteria. The approximating functions are determined by their values at a finite number of belief points, and can be computed efficiently using value iteration algorithms for finite-state MDP. While several lower approximation schemes have been proposed earlier for discounted problems, ours seems to be the first of its kind for average cost problems. We focus primarily on the average cost case, and we show that the corresponding approximation can be computed efficiently using multichain algorithms for finite-state MDP. We give a preliminary analysis showing that, regardless of the existence of the optimal average cost J in the POMDP, the approximation obtained is a lower bound of the liminf optimal average cost function, and can also be used to calculate an upper bound on the limsup optimal average cost function, as well as bounds on the cost of executing the stationary policy associated with the approximation. We show the convergence of the cost approximation when the optimal average cost is constant and the optimal differential cost is continuous.
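To make the idea of "functions determined by their values at a finite number of belief points" concrete, here is a minimal sketch for the simpler discounted case only. It is not the scheme from the paper: it assumes a fixed uniform grid with linear interpolation on a two-state POMDP, and the function name `grid_value_iteration`, the array layouts, and the toy model numbers are all hypothetical choices made for illustration. Because the optimal cost function of a cost-minimizing POMDP is concave in the belief, interpolating between grid points would be expected to yield a lower approximation, but the sketch does not verify that claim.

```python
import numpy as np

def grid_value_iteration(P, Z, g, gamma=0.9, n_grid=21, tol=1e-8, max_iter=10_000):
    """Discounted value iteration on a belief grid (illustrative sketch).

    Assumed layouts (hypothetical, two hidden states only):
      P[a, s, s2] : transition probability s -> s2 under action a
      Z[a, s2, o] : probability of observation o in next state s2
      g[s, a]     : one-stage cost
    The belief is summarized by p = Pr(state 1).
    """
    grid = np.linspace(0.0, 1.0, n_grid)
    J = np.zeros(n_grid)  # approximate cost values at the grid beliefs
    n_actions, n_obs = P.shape[0], Z.shape[2]
    for _ in range(max_iter):
        J_new = np.empty(n_grid)
        for i, p in enumerate(grid):
            b = np.array([1.0 - p, p])
            q = np.empty(n_actions)
            for a in range(n_actions):
                pred = b @ P[a]            # predicted next-state distribution
                total = b @ g[:, a]        # expected one-stage cost
                for o in range(n_obs):
                    joint = pred * Z[a][:, o]
                    prob_o = joint.sum()   # Pr(o | b, a)
                    if prob_o > 0.0:
                        p_next = joint[1] / prob_o  # Bayes-updated belief
                        # Evaluate J off-grid by linear interpolation.
                        total += gamma * prob_o * np.interp(p_next, grid, J)
                q[a] = total
            J_new[i] = q.min()
        if np.max(np.abs(J_new - J)) < tol:
            return grid, J_new
        J = J_new
    return grid, J

# Toy model with made-up numbers: 2 states, 2 actions, 2 observations.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.8, 0.2], [0.3, 0.7]]])
g = np.array([[0.0, 1.0],
              [2.0, 1.0]])

grid, J = grid_value_iteration(P, Z, g)
```

The returned array `J` holds the approximate discounted cost at each grid belief; values between grid points are read off with `np.interp(p, grid, J)`. The average cost case emphasized in the abstract would need a multichain algorithm in place of this discounted iteration.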