Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense

07/20/2020
by Sailik Sengupta, et al.

The field of cybersecurity has mostly been a cat-and-mouse game, with the discovery of new attacks leading the way. To take away an attacker's advantage of reconnaissance, researchers have proposed proactive defense methods such as Moving Target Defense (MTD). To find good movement strategies, researchers have modeled MTD as leader-follower games between the defender and a cyber-adversary. We argue that existing models are inadequate in sequential settings when there is incomplete information about a rational adversary, and that they yield sub-optimal movement strategies. Further, existing approaches to learning defense policies in sequential settings for cybersecurity either see little use because of scalability issues arising from incomplete information, or ignore the strategic nature of the adversary and simplify the scenario so that single-agent reinforcement learning techniques apply. To address these concerns, we propose (1) a unifying game-theoretic model, called Bayesian Stackelberg Markov Games (BSMGs), that can model uncertainty over attacker types and the nuances of an MTD system and (2) a Bayesian Strong Stackelberg Q-learning (BSS-Q) approach that can, via interaction, learn the optimal movement policy for BSMGs within a reasonable time. We situate BSMGs in the landscape of incomplete-information Markov games and characterize the notion of Strong Stackelberg Equilibrium (SSE) in them. We show that our learning approach converges to an SSE of a BSMG and then highlight that the learned movement policy (1) improves on the state of the art in MTD for web-application security and (2) converges to an optimal policy in MTD domains with incomplete information about adversaries, even when prior information about rewards and transitions is absent.
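To make the equilibrium notion concrete, the following is a minimal sketch of a Strong Stackelberg commitment in a single-stage game with uncertainty over attacker types. It is not the authors' BSS-Q algorithm, whose update rule the abstract does not spell out. The function names, the restriction to two defender actions, the grid search over mixed strategies, and the payoff numbers are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Payoff matrix: rows are defender (leader) actions, columns are attacker actions.
Matrix = Sequence[Sequence[float]]


def expected(payoff: Matrix, x: Sequence[float], a: int) -> float:
    """Expected payoff when the defender mixes with x and the attacker plays a."""
    return sum(x[d] * payoff[d][a] for d in range(len(x)))


def follower_response(x: Sequence[float], def_payoff: Matrix,
                      att_payoff: Matrix, tol: float = 1e-9) -> int:
    """Attacker best response to x; ties are broken in the defender's favor,
    which is the 'strong' part of a Strong Stackelberg Equilibrium."""
    n_att = len(att_payoff[0])
    att_vals = [expected(att_payoff, x, a) for a in range(n_att)]
    best = max(att_vals)
    ties = [a for a in range(n_att) if att_vals[a] >= best - tol]
    return max(ties, key=lambda a: expected(def_payoff, x, a))


def bayesian_sse_grid(prior: Sequence[float], def_payoffs: List[Matrix],
                      att_payoffs: List[Matrix],
                      steps: int = 100) -> Tuple[Tuple[float, float], float]:
    """Approximate the defender's best mixed commitment over two actions.

    Hypothetical helper: searches a grid of mixed strategies (p, 1 - p) and
    weights each attacker type's best response by its prior probability.
    """
    best_x, best_val = (0.0, 1.0), float("-inf")
    for i in range(steps + 1):
        p = i / steps
        x = (p, 1.0 - p)
        val = 0.0
        for theta, weight in enumerate(prior):
            a = follower_response(x, def_payoffs[theta], att_payoffs[theta])
            val += weight * expected(def_payoffs[theta], x, a)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Illustrative payoffs only (not taken from the paper).
defender = [[2.0, 4.0], [1.0, 3.0]]
attacker_type_1 = [[1.0, 0.0], [0.0, 2.0]]
attacker_type_2 = [[1.0, 0.0], [1.0, 0.0]]

x_star, value = bayesian_sse_grid(
    prior=[0.5, 0.5],
    def_payoffs=[defender, defender],
    att_payoffs=[attacker_type_1, attacker_type_2],
    steps=3,
)
```

The grid search is a simplification chosen for readability. It only approximates the optimal commitment and does not scale beyond a few defender actions, and a sequential model such as a BSMG would additionally have to account for state transitions and future rewards.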

research
09/06/2018

Adaptive Strategic Cyber Defense for Advanced Persistent Threats in Critical Infrastructure Networks

Advanced Persistent Threats (APTs) have created new security challenges ...
research
04/03/2023

Learning About Simulated Adversaries from Human Defenders using Interactive Cyber-Defense Games

Given the increase in cybercrime, cybersecurity analysts (i.e. Defenders...
research
11/01/2018

Adaptive MTD Security using Markov Game Modeling

Large scale cloud networks consist of distributed networking and computi...
research
11/18/2022

Provable Defense against Backdoor Policies in Reinforcement Learning

We propose a provable defense mechanism against backdoor policies in rei...
research
07/01/2019

Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense

The increasing instances of advanced attacks call for a new defense para...
research
11/23/2021

Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the Age of AI-NIDS

Cyber attacks are increasing in volume, frequency, and complexity. In re...
research
09/27/2021

Learning Attacker's Bounded Rationality Model in Security Games

The paper proposes a novel neuroevolutionary method (NESG) for calculati...
