Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes

06/27/2019
by Linan Huang, et al.

The honeynet is a promising active cyber defense mechanism. It reveals the fundamental Indicators of Compromise (IoC) by luring attackers into conducting adversarial behaviors in a controlled and monitored environment. Active interaction at the honeynet brings a high reward but also introduces high implementation costs and risks of adversarial honeynet exploitation. In this work, we apply the infinite-horizon Semi-Markov Decision Process (SMDP) to characterize the stochastic transitions and sojourn times of attackers in the honeynet and to quantify the reward-risk trade-off. In particular, we produce adaptive long-term engagement policies that are shown to be risk-averse, cost-effective, and time-efficient. Numerical results demonstrate that our adaptive interaction policies can quickly attract attackers to the target honeypot and engage them long enough to obtain valuable threat information, while keeping the penetration probability low. The results also show that the expected utility is robust against attackers across a wide range of persistence and intelligence levels. Finally, we apply reinforcement learning to the SMDP to address the curse of modeling. With a prudent choice of the learning rate and exploration policy, we achieve quick and robust convergence to the optimal policy and value.
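
To make the reinforcement-learning step concrete, the following is a minimal sketch of one textbook Q-learning update for a discounted SMDP, in which the reward accrues at a constant rate during a random sojourn time. It is not taken from the paper: the discounted formulation, the state names ("honeypot"), the action names ("eject", "engage"), and the values of the discount rate and learning rate are all illustrative assumptions.

```python
import math
from collections import defaultdict


def smdp_q_update(q, state, action, next_state, actions, sojourn,
                  reward_rate, lump_reward=0.0, discount=0.1, lr=0.5):
    """Apply one Q-learning step for a discounted SMDP and return the new Q-value.

    Assumed (hypothetical) model: the defender collects `lump_reward` on
    taking `action`, then `reward_rate` per unit time for `sojourn` time
    units, after which the attacker moves to `next_state`.
    """
    # Discount factor accumulated over the sojourn time.
    decay = math.exp(-discount * sojourn)
    # Lump reward plus the discounted integral of the reward rate.
    accrued = lump_reward + reward_rate * (1.0 - decay) / discount
    # Greedy value of the next state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    target = accrued + decay * best_next
    # Move the current estimate toward the sampled target.
    q[(state, action)] += lr * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: the attacker stays in the honeypot for 2 time units.
q = defaultdict(float)
actions = ("eject", "engage")
value = smdp_q_update(q, "honeypot", "engage", "honeypot", actions,
                      sojourn=2.0, reward_rate=1.0)
```

Because the update needs only sampled transitions and sojourn times, it does not require the transition probabilities or sojourn distributions to be known in advance, which is the sense in which learning sidesteps the modeling burden.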

