Evolving Reinforcement Learning Environment to Minimize Learner's Achievable Reward: An Application on Hardening Active Directory Systems

04/08/2023
by Diksha Goel, et al.

We study a Stackelberg game between one attacker and one defender in a configurable environment. The defender picks a specific environment configuration. The attacker observes the configuration and attacks via Reinforcement Learning (RL), training against the observed environment. The defender's goal is to find the environment configuration that minimizes the attacker's achievable reward. We apply Evolutionary Diversity Optimization (EDO) to generate a diverse population of environments for training. Environments on which the attacker clearly achieves high reward are killed off and replaced by new offspring, so training time is not wasted on them. Diversity not only improves training quality but also fits our RL scenario well: RL agents improve gradually, so an environment that looks slightly worse early in training may turn out to be better later. We demonstrate the effectiveness of our approach on a specific application, Active Directory (AD). AD is the default security management system for Windows domain networks. An AD environment describes an attack graph, in which nodes represent computers, accounts, and other resources, and edges represent accesses. The attacker aims to find the best attack path to reach the highest-privilege node. The defender can change the graph by removing a limited number of edges (revoking accesses). Our approach generates better defensive plans than the existing approach and scales better.
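To make the defender's evolutionary loop concrete, below is a minimal Python sketch of evolving edge-removal defenses against a toy AD-style attack graph. Everything in it is an illustrative assumption rather than the paper's implementation: the graph, the edge budget, the mutation and diversity measures, and in particular the attacker, which is replaced by a simple reachability check where the real system trains an RL agent against each candidate environment. The loop mirrors the idea described above: candidate environments on which the attacker still scores highly are discarded each generation and replaced by mutated offspring, with diversity used to break ties among survivors.

import random

# Hypothetical toy attack graph: node -> list of successor nodes.
GRAPH = {
    "entry1": ["acct1", "acct2"],
    "entry2": ["acct2"],
    "acct1": ["server1"],
    "acct2": ["server1", "server2"],
    "server1": ["DA"],          # DA = Domain Admin, the attacker's target
    "server2": ["DA"],
    "DA": [],
}
ENTRY_NODES = ["entry1", "entry2"]
TARGET = "DA"
BUDGET = 2          # max number of edges the defender may remove
POP_SIZE = 8
GENERATIONS = 50

ALL_EDGES = [(u, v) for u, succs in GRAPH.items() for v in succs]

def attacker_reward(blocked):
    """Placeholder for the RL attacker: reward 1 if DA is still reachable
    from any entry node after the blocked edges are removed, else 0.
    The real system would train an RL policy per environment instead."""
    reachable = set(ENTRY_NODES)
    frontier = list(ENTRY_NODES)
    while frontier:
        u = frontier.pop()
        for v in GRAPH[u]:
            if (u, v) not in blocked and v not in reachable:
                reachable.add(v)
                frontier.append(v)
    return 1.0 if TARGET in reachable else 0.0

def diversity(defense, population):
    """Assumed stand-in for the EDO diversity measure: defenses whose
    removed edges are rare in the population score higher (less negative)."""
    counts = {}
    for other in population:
        for e in other:
            counts[e] = counts.get(e, 0) + 1
    return -sum(counts.get(e, 0) for e in defense)

def mutate(defense):
    """Offspring: swap one removed edge for a randomly chosen edge."""
    child = set(defense)
    child.discard(random.choice(list(child)))
    child.add(random.choice(ALL_EDGES))
    return frozenset(list(child)[:BUDGET])

population = [frozenset(random.sample(ALL_EDGES, BUDGET)) for _ in range(POP_SIZE)]
for gen in range(GENERATIONS):
    # Kill off environments where the attacker clearly gets high reward;
    # among survivors, prefer diverse defenses, then refill with offspring.
    scored = sorted(population,
                    key=lambda d: (attacker_reward(d), -diversity(d, population)))
    survivors = scored[:POP_SIZE // 2]
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(POP_SIZE - len(survivors))]

best = min(population, key=attacker_reward)
print("edges to remove:", sorted(best), "attacker reward:", attacker_reward(best))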
