Playing Adaptively Against Stealthy Opponents: A Reinforcement Learning Strategy for the FlipIt Security Game

06/27/2019

A rise in Advanced Persistent Threats (APTs) has introduced a need for robustness against long-running, stealthy attacks which circumvent existing cryptographic security guarantees. FlipIt is a security game that models the attacker-defender interactions in advanced scenarios such as APTs. Previous work has extensively analyzed non-adaptive strategies in FlipIt, but adaptive strategies arise naturally in practical interactions as players receive feedback during the game. We model the FlipIt game as a Markov Decision Process and use reinforcement learning algorithms to design adaptive strategies. We prove theoretical results on the convergence of our new strategy against an opponent playing with a Periodic strategy. We confirm our analysis experimentally by extensive evaluation of the strategy against specific opponents. Our strategies converge to the optimal adaptive strategy for Periodic and Exponential opponents. Finally, we introduce a generalized Q-Learning strategy with composite states that outperforms a Greedy-based strategy for several distributions, including Periodic and Uniform, without prior knowledge of the opponent's strategy.
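
As a rough illustration of the MDP view described in the abstract, the sketch below trains tabular Q-learning against a Periodic opponent in a discrete-time version of FlipIt. It is not the paper's implementation. The discretization into ticks, the per-tick reward (1 for holding the resource, minus a flip cost), the rule that the opponent moves first within a tick, the state cap, all hyperparameters, and the function names `train_flipit_q` and `greedy_policy` are assumptions made for this example. The state is the time since the opponent's last flip as last observed by the agent, and the agent only refreshes that observation when it flips.

```python
import numpy as np


def train_flipit_q(delta=10, cost=2.0, steps=100_000, alpha=0.1,
                   gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning against a Periodic opponent (illustrative sketch).

    Assumptions (not from the paper): discrete ticks, the opponent flips
    at every multiple of `delta`, the agent starts in control, and the
    opponent moves first within a tick.
    """
    rng = np.random.default_rng(seed)
    max_state = 3 * delta                 # cap on the observed elapsed time
    q = np.zeros((max_state + 1, 2))      # actions: 0 = wait, 1 = flip
    known_last = 0                        # opponent's last flip, as last seen
    agent_owns = True
    state = 0
    for t in range(steps):
        # Opponent's periodic flip takes the resource.
        if t > 0 and t % delta == 0:
            agent_owns = False
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = int(rng.integers(2))
        else:
            action = int(np.argmax(q[state]))
        if action == 1:
            agent_owns = True
            # Flipping reveals the opponent's most recent flip time.
            known_last = (t // delta) * delta
        reward = (1.0 if agent_owns else 0.0) - cost * action
        next_state = min(t + 1 - known_last, max_state)
        # Standard Q-learning update.
        target = reward + gamma * q[next_state].max()
        q[state, action] += alpha * (target - q[state, action])
        state = next_state
    return q


def greedy_policy(q, delta):
    """Greedy action for each state 1..delta (0 = wait, 1 = flip)."""
    return [int(np.argmax(q[s])) for s in range(1, delta + 1)]
```

Under these assumptions, the learned greedy policy would be expected to wait while the agent still holds the resource and to flip in the tick where the opponent's period elapses. Calling `greedy_policy(train_flipit_q(), 10)` lists the chosen action for states 1 through 10.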
