Deceptive Kernel Function on Observations of Discrete POMDP

08/12/2020
by Zhili Zhang, et al.

This paper studies deception applied to an agent in a partially observable Markov decision process (POMDP). We introduce a deceptive kernel function (the kernel) applied to the agent's observations in a discrete POMDP. For three characteristic algorithms the agent may use, value iteration, value function approximation, and POMCP, we analyze how the agent's belief is misled by the falsified observations the kernel outputs, and we anticipate the likely threat to the agent's reward and, potentially, other aspects of its performance. We validate these expectations and explore further detrimental effects of the deception through experiments on two POMDP problems. The results show that a kernel applied to the agent's observations can distort the agent's belief and substantially lower its resulting rewards; moreover, certain implementations of the kernel can induce other abnormal behaviors in the agent.
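The abstract does not spell out how a kernel on observations distorts belief, so the sketch below illustrates the general mechanism on a toy problem. It is our own minimal example, not the paper's method: the function names, the deterministic observation-swapping kernel, and the two-state POMDP numbers are all assumptions. It shows that remapping the observation before the standard Bayesian belief update flips the direction in which belief mass moves.

```python
import numpy as np

def belief_update(belief, action, obs, T, Z):
    """Standard discrete POMDP belief update.

    belief: (|S|,) prior over states
    T:      (|A|, |S|, |S|) transition probabilities T[a, s, s']
    Z:      (|A|, |S'|, |O|) observation probabilities Z[a, s', o]
    """
    predicted = belief @ T[action]           # P(s' | b, a)
    updated = predicted * Z[action][:, obs]  # weight by observation likelihood
    return updated / updated.sum()           # normalize to a distribution

def deceptive_kernel(obs, swap):
    """Illustrative (hypothetical) kernel: deterministically remap observations.

    `swap` maps a true observation index to a falsified one;
    unmapped observations pass through unchanged.
    """
    return swap.get(obs, obs)

# Toy 2-state, 1-action, 2-observation POMDP (numbers assumed for illustration).
T = np.array([[[1.0, 0.0], [0.0, 1.0]]])      # the single action keeps the state fixed
Z = np.array([[[0.85, 0.15], [0.15, 0.85]]])  # noisy but informative observations

belief = np.array([0.5, 0.5])                 # uniform prior
true_obs = 0                                  # evidence pointing to state 0
faked = deceptive_kernel(true_obs, swap={0: 1, 1: 0})

honest = belief_update(belief, 0, true_obs, T, Z)
deceived = belief_update(belief, 0, faked, T, Z)
print("honest belief:  ", honest)    # mass shifts toward state 0
print("deceived belief:", deceived)  # mass shifts toward state 1
```

Since value iteration, value function approximation, and POMCP all act on the belief (or on simulated histories consistent with it), a belief steered in the wrong direction this way would plausibly translate into the lowered rewards the paper reports.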


