
A risk-security tradeoff in graphical coordination games

A system relying on the collective behavior of decision-makers can be vulnerable to a variety of adversarial attacks. How well can a system operator protect performance in the face of these risks? We frame this question in the context of graphical coordination games, where the agents in a network choose between two conventions and derive benefits from coordinating with neighbors, and where system performance is measured in terms of the agents' welfare. In this paper, we assess an operator's ability to mitigate two types of adversarial attacks: 1) broad attacks, where the adversary incentivizes all agents in the network, and 2) focused attacks, where the adversary can force a selected subset of the agents to commit to a prescribed convention. As a mitigation strategy, the system operator can implement a class of distributed algorithms that govern the agents' decision-making processes. Our main contribution characterizes the operator's fundamental tradeoff between security against worst-case broad attacks and vulnerability to focused attacks. We show that this tradeoff improves significantly when the operator selects a decision-making process at random. Our work highlights the design challenges a system operator faces in maintaining the resilience of networked distributed systems.


1 Introduction

Networked distributed systems typically operate without centralized planning or control, relying instead on local interactions and communication among their constituent agents. These systems arise in a variety of engineering applications, such as teams of mobile robots and sensor networks [1, 2, 3]. They are also prevalent in social dynamics [4, 5] and biological populations [6].

The transition from a centralized to a distributed architecture may leave a system vulnerable to a variety of adversarial attacks. An adversary may be able to manipulate the decision-making processes of the agents, and such dynamical perturbations can potentially lead to unwanted outcomes. For example, in social networks, individual opinions can be shaped by external information sources, resulting in a polarized society [7, 8]. When feasible, a system operator takes measures to mitigate adversarial influences. The literature on cyber-physical system security studies many aspects of this interplay. For instance, optimal controllers are designed to mitigate denial-of-service, estimation, and deception attacks [9, 10, 11, 12, 13].

This paper investigates security measures that a system operator can take against adversarial influences when the underlying system is a graphical coordination game [5, 14], where agents in a network decide between two choices, $x$ or $y$. One may think of these choices as two competing products, e.g. iPhone vs. Android, two conflicting social norms, or two opposing political parties. Each agent derives a positive benefit from interactions with coordinating neighbors, and zero benefit from mis-coordinating ones. The system's efficiency is defined as the ratio of the total benefits of all agents to the maximal attainable benefits over all configurations of choices.

The goal of the system operator is to design a local decision-making rule for each agent in the system so that the emergent collective behavior optimizes system efficiency. One algorithm that achieves this goal is known as log-linear learning [15, 16, 17]. More formally, the agents follow a “perturbed” best reply dynamics where the agents’ local objectives are precisely equal to their local welfare. We seek to address the question of whether this particular algorithm is robust to adversarial influences. That is, does this algorithm preserve system efficiency when the agents’ decision-making processes are manipulated by an adversary? If not, can the operator alter the agents’ local objectives to mitigate such attacks?

We consider two adversarial attack models: broad and focused attacks. In broad attacks, the adversary incentivizes every agent in the network (hence, broad) toward a convention, influencing their decision-making processes. This could depict distributing political ads with the intention of polarizing voters. In focused attacks, the adversary targets a specific set of agents in the network, forcing them to commit to $x$ or $y$. These targeted, or fixed, agents consequently do not update their choices over time but still influence the decisions of others. For instance, they could represent loyal consumers of a brand or product, or staunch supporters of a political party. Fixed agents and their effects on system performance have been extensively studied in the context of opinion dynamics and optimization algorithms [18, 19, 13].

The first contribution of this paper is a characterization of worst-case risk metrics from both adversarial attacks as a function of the operator’s algorithm design parameter (Section 3). We define risk in this paper as the system’s distance to optimal efficiency. By worst-case here we mean the maximum risk among all connected network topologies subject to any admissible adversarial attack. Hence, our analysis identifies the network topologies on which worst-case risks are attained (Section 5). We extend this analysis to randomized operator strategies (Sections 4, 6).

The second contribution of this paper answers the question "if the operator succeeds in protecting the system from one type of attack, how vulnerable does it leave the system to the other?" We identify a fundamental tradeoff between security against broad attacks and risks from focused attacks. We then show that randomized operator strategies significantly improve the set of attainable risk levels and their associated tradeoffs (Section 4).

By characterizing this interplay, we contribute to previous work that studied the impact of adversarial influence in graphical coordination games [20, 21, 22]. These works analyze the worst-case damage that can be inflicted by varying degrees of adversarial sophistication and intelligence in the absence of a system operator. However, those results were derived only for specific graph structures, namely ring graphs, whereas our analysis considers adversarial influence on arbitrary graph topologies.

2 Preliminaries

2.1 Graphical coordination games

A graphical coordination game is played between a set of agents $N = \{1, \dots, n\}$ over a connected undirected network $G = (N, E)$ with node set $N$ and edge set $E$. Agent $i$'s set of neighbors is written as $\mathcal{N}_i$. Each agent selects a choice $a_i$ from its action set $\mathcal{A}_i = \{x, y\}$. The choices of all the agents constitute an action profile $a = (a_1, \dots, a_n)$, and we denote the set of all action profiles as $\mathcal{A}$. The local interaction between two agents is based on a matrix game, described by the payoff matrix

$\begin{array}{c|cc} & x & y \\ \hline x & 1+\alpha,\ 1+\alpha & 0,\ 0 \\ y & 0,\ 0 & 1,\ 1 \end{array} \qquad (1)$

where $\alpha > 0$ is the system payoff gain. It indicates that $x$ is an inherently superior product over $y$ when users coordinate. Here, agents would rather coordinate than not, but prefer to coordinate on $x$. Agent $i$'s benefit is the sum of payoffs derived from playing the game (1) with each of its network neighbors:

$U_i(a) = \sum_{j \in \mathcal{N}_i} u(a_i, a_j) \qquad (2)$

where $u(a_i, a_j)$ denotes agent $i$'s payoff in the matrix game (1).

A measure of system welfare defined over $\mathcal{A}$ is

$W(a) = \sum_{i \in N} U_i(a) \qquad (3)$

which is simply the sum of all agent benefits. The system efficiency for action profile $a$ is defined as

$\text{eff}(a) = \frac{W(a)}{\max_{a' \in \mathcal{A}} W(a')} \qquad (4)$

For $\alpha > 0$, the all-$x$ profile $\vec{x}$ maximizes welfare. This does not necessarily hold for arbitrary action spaces.
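To make the setup concrete, here is a minimal Python sketch of the benefit (2), welfare (3), and efficiency (4) computations under the payoff convention of (1), i.e. coordinating on $x$ pays $1+\alpha$ to each endpoint, coordinating on $y$ pays 1, and miscoordination pays 0. All function names and the example instance are our own illustration.

```python
import itertools

def edge_payoff(ai, aj, alpha):
    """One endpoint's payoff from an edge, per the matrix (1)."""
    if ai != aj:
        return 0.0                       # miscoordination
    return 1.0 + alpha if ai == "x" else 1.0

def welfare(a, edges, alpha):
    """W(a) in (3): the sum of both endpoints' payoffs over all edges."""
    return sum(edge_payoff(a[i], a[j], alpha) + edge_payoff(a[j], a[i], alpha)
               for i, j in edges)

def efficiency(a, edges, alpha, n):
    """eff(a) in (4): welfare relative to the best profile over all of A."""
    best = max(welfare(dict(enumerate(p)), edges, alpha)
               for p in itertools.product("xy", repeat=n))
    return welfare(a, edges, alpha) / best

# Three-node line with gain alpha = 0.5: one miscoordinating link.
edges = [(0, 1), (1, 2)]
a = {0: "x", 1: "x", 2: "y"}
print(efficiency(a, edges, 0.5, n=3))    # 0.5, since only edge (0, 1) pays
```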

2.2 Log-linear learning algorithm

Log-linear learning is a distributed stochastic algorithm governing how players' decisions evolve over time [16, 14, 15]. It may be applied to any instance of a game in which each player has a well-defined local utility function over a set of action profiles with an underlying interaction graph $G$. That is, agent $i$'s local utility is a function of its own action and the actions of its neighbors in $G$.

Agents update their decisions over discrete time steps $t = 0, 1, 2, \dots$. Assume the initial profile $a(0)$ is arbitrarily determined. For each step $t \ge 1$, one agent $i$ is selected uniformly at random from the population. It updates its action to $a_i \in \{x, y\}$ with probability

$\Pr(a_i(t) = a_i) = \dfrac{e^{\beta U_i(a_i, a_{-i}(t-1))}}{e^{\beta U_i(x, a_{-i}(t-1))} + e^{\beta U_i(y, a_{-i}(t-1))}} \qquad (5)$

where $\beta > 0$ is the rationality parameter. All other agents repeat their previous actions: $a_j(t) = a_j(t-1)$ for $j \neq i$. For large values of $\beta$, agent $i$ selects a best response to the previous actions of others with high probability, and for values of $\beta$ near zero, it randomizes among its actions uniformly at random. This induces an irreducible Markov chain over the action space $\mathcal{A}$, with a unique stationary distribution $\pi_\beta$. The stochastically stable states (SSS) are the action profiles contained in the support of the stationary distribution in the high rationality limit: they satisfy $\lim_{\beta \to \infty} \pi_\beta(a) > 0$. Such a limiting distribution exists and is unique [23, 15, 24]. We write the set of stochastically stable states as

$\text{SSS} = \{a \in \mathcal{A} : \lim_{\beta \to \infty} \pi_\beta(a) > 0\} \qquad (6)$

For graphical coordination games, the log-linear learning algorithm specified by the action sets $\mathcal{A}_i$ and utilities $U_i$ selects the welfare-maximizing profile $\vec{x}$ as the stochastically stable state irrespective of the graph topology. This can be shown using standard potential game arguments [16] (we provide these details in Section 5). That is, $\text{SSS} = \{\vec{x}\}$ for all $G \in \mathcal{G}_n$, where $\mathcal{G}_n$ is the set of all connected undirected graphs on $n$ nodes.
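The update rule (5) is easy to simulate. The following self-contained sketch runs log-linear learning on a three-node line, assuming the agents' utilities are the benefit functions (2); the parameter values and names are illustrative, not taken from the paper.

```python
import math, random

def log_linear_step(a, neighbors, alpha, beta):
    """One step of (5): a uniformly chosen agent soft-maximizes its utility."""
    pay = lambda s, t: (1 + alpha if s == "x" else 1.0) if s == t else 0.0
    i = random.choice(list(a))
    # Utility of each candidate action against the neighbors' current play.
    utils = {c: sum(pay(c, a[j]) for j in neighbors[i]) for c in "xy"}
    weights = [math.exp(beta * utils[c]) for c in "xy"]
    a[i] = random.choices("xy", weights)[0]
    return a

# Three-node line; with high rationality beta, all-x emerges as stable.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
a = {0: "y", 1: "y", 2: "y"}
for _ in range(3000):
    log_linear_step(a, neighbors, alpha=0.5, beta=5.0)
print(a)    # typically {0: 'x', 1: 'x', 2: 'x'}
```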

However, if an adversary is able to manipulate the agents' local decision-making rules, this statement may no longer hold true. A system operator may be able to alter the agents' local utility functions with the goal of mitigating the loss of system efficiency in the presence of adversarial influences. In particular, we consider the class of local utility functions parameterized by a perceived gain, which we write as $\tilde{\alpha} \ge 0$. Specifically, each utility takes the same form as the benefit function (2), where $\alpha$ is replaced with the perceived gain $\tilde{\alpha}$ under the operator's control. We next introduce models of adversarial attacks in graphical coordination games. We then evaluate the performance of this class of distributed algorithms in the face of adversarial attacks.

Figure 1: (Left) An example three-node line network under a broad adversarial attack. The imposter nodes are depicted as the labelled smaller circles and the agents in the network as the bigger circles. The color of each circle indicates the node's action ($x$ or $y$). In this example, maximum welfare is achieved when all three agents play $x$. The adversary's target set attaches an $x$-imposter to node 1 and $y$-imposters to nodes 2 and 3. For low operator gains $\tilde{\alpha}$, a profile with miscoordination is the welfare-minimizing SSS, giving a positive risk. For sufficiently high $\tilde{\alpha}$, the welfare-minimizing SSS is the all-$x$ profile, which gives optimal efficiency, i.e. a risk of 0. (Right) An example of a four-node star network under a focused attack, where a subset of three nodes is targeted to be fixed (squares). Only the center node is unfixed. In this example, maximum welfare is achieved when the center coordinates with the majority of its fixed neighbors; the alternative action gives suboptimal welfare for sufficiently small $\alpha$. For low operator gains $\tilde{\alpha}$, the center node plays the welfare-maximizing action in the SSS. This yields optimal efficiency, i.e. the risk is 0. For high $\tilde{\alpha}$, the center node plays $x$, giving a positive risk. The methods to calculate stochastically stable states under both types of attacks follow standard potential game arguments and are detailed in Section 5.

3 Models of adversarial influence

In this section, we outline two models of adversarial attacks in graphical coordination games: broad and focused attacks. The system operator specifies the local utility functions that govern the log-linear learning algorithm by selecting the perceived payoff gain $\tilde{\alpha}$. Our goal is to assess the performance of this range of algorithms on two corresponding worst-case risk metrics, which we define and characterize. We then identify fundamental tradeoff relations between these two risk metrics.

3.1 Broad attacks and worst-case risk metric

We consider a scenario where the system is subject to broad adversarial attacks. For each agent in the network, the adversary attaches a single imposter node that acts as a neighbor that always plays $x$ or always plays $y$. These nodes are not members of the network but affect the decision-making of agents that are. Let $T_x$ ($T_y$) be the set of agents targeted with an $x$ ($y$) imposter node. We call $T = (T_x, T_y)$ the target set. Any target set satisfies $T_x \cap T_y = \emptyset$ and $T_x \cup T_y = N$. We call $\mathcal{T}(G)$ the set of all possible target sets on the graph $G$. Given $T$, the agents' perceived utilities are

$\tilde{U}_i(a) = \sum_{j \in \mathcal{N}_i} u_{\tilde{\alpha}}(a_i, a_j) + \begin{cases} u_{\tilde{\alpha}}(a_i, x) & \text{if } i \in T_x \\ u_{\tilde{\alpha}}(a_i, y) & \text{if } i \in T_y \end{cases} \qquad (7)$

where $u_{\tilde{\alpha}}$ denotes the payoff (1) with $\alpha$ replaced by $\tilde{\alpha}$.
Figure 2: (a) The worst-case risk from broad attacks (11) is a piecewise constant function defined over countably infinite half-open intervals. The graphs and their corresponding target sets which attain each level of worst-case broad risk are illustrated. Here, the labels indicate the type of imposter influence on the agents (circles) in the network, and the color of the circles depicts the action played in the welfare-minimizing SSS. For gains below the safeguarding threshold (recall (12)), the worst-case risk is achieved on a star graph in which all nodes but one are targeted with a $y$ imposter. The one remaining leaf node has an $x$ imposter attached, giving a single miscoordinating link in the network. (b) The worst-case risk from focused attacks (16). The graphs and their corresponding fixed sets which attain the worst-case focused risks are illustrated for three representative gains. The nodes' color represents the worst-case SSS. The targeted fixed agents are represented as squares and the unfixed agents as circles. The proofs establishing all worst-case graphs are detailed in Section 5.

In the notation of (6), the set of stochastically stable states is written $\text{SSS}$. However, for more specificity, we will refer to it in this context as $\text{SSS}(G, T, \tilde{\alpha})$. The induced network efficiency is defined as

$\text{eff}(G, T, \tilde{\alpha}) = \dfrac{\min_{a \in \text{SSS}(G, T, \tilde{\alpha})} W(a)}{\max_{a' \in \mathcal{A}} W(a')} = \dfrac{\min_{a \in \text{SSS}(G, T, \tilde{\alpha})} W(a)}{W(\vec{x})} \qquad (8)$

which is the ratio of the welfare induced by the welfare-minimizing SSS to the optimal welfare. The second equality above is due to the fact that optimal welfare is attained at $\vec{x}$ (all play $x$). We re-iterate that the imposter nodes serve only to modify the stochastically stable states, and do not contribute to the system welfare (3). The risk from broad attacks faced by the system operator in choosing gain $\tilde{\alpha}$ is defined as

$R(G, T, \tilde{\alpha}) = 1 - \text{eff}(G, T, \tilde{\alpha}) \qquad (9)$

Risk measures the distance from optimal efficiency under operating gain $\tilde{\alpha}$. Fig. 1(a) illustrates an example of a three-node network subject to a broad adversarial attack. The extent to which systems are susceptible to broad attacks is captured by the following definition of worst-case risk.

Definition 1.

The worst-case risk from broad attacks is given by

$R_{\rm br}(\tilde{\alpha}) = \sup_{G, \, T \in \mathcal{T}(G)} R(G, T, \tilde{\alpha}) \qquad (10)$

The quantity $R_{\rm br}(\tilde{\alpha})$ is the cost metric that the system operator wishes to reduce given uncertainty about the network structure and target set; the supremum in (10) is taken over all connected undirected graphs of any size.
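For small instances, the risk (9) behind Definition 1 can be evaluated by brute force via the potential-maximization characterization of the SSS developed in Section 5. The sketch below assumes that each imposter contributes one extra pinned edge to the perceived potential at the operator's gain, while the true welfare (3) counts real edges only; this reading of (7), and all names and numbers, are our own illustration.

```python
import itertools

def edge_pot(ai, aj, gain):
    """Per-edge potential/payoff: 1 + gain for coordinating on x, 1 on y, else 0."""
    if ai != aj:
        return 0.0
    return 1.0 + gain if ai == "x" else 1.0

def broad_risk(edges, n, targets, alpha, op_gain):
    """Risk (9): one minus the efficiency of the welfare-minimizing SSS."""
    profiles = [dict(enumerate(p)) for p in itertools.product("xy", repeat=n)]
    def pot(a):   # perceived potential: real edges plus one imposter edge each
        return (sum(edge_pot(a[i], a[j], op_gain) for i, j in edges)
                + sum(edge_pot(a[i], targets[i], op_gain) for i in a))
    def wel(a):   # true welfare (3): imposter edges do not count
        return 2 * sum(edge_pot(a[i], a[j], alpha) for i, j in edges)
    top = max(pot(a) for a in profiles)
    sss = [a for a in profiles if abs(pot(a) - top) < 1e-9]
    return 1.0 - min(wel(a) for a in sss) / max(wel(a) for a in profiles)

# Fig. 1(a)-style instance: x-imposter on node 0, y-imposters on nodes 1, 2.
line = [(0, 1), (1, 2)]
imposters = {0: "x", 1: "y", 2: "y"}
print(broad_risk(line, 3, imposters, alpha=0.5, op_gain=0.5))  # positive risk
print(broad_risk(line, 3, imposters, alpha=0.5, op_gain=3.0))  # risk 0
```

Consistent with Theorem 1 below, raising the operator's gain far enough drives the risk on this instance to zero.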

Theorem 1.

Let $\tilde{\alpha} \ge 0$. The worst-case broad risk is

(11)

where

(12)

It is a piecewise constant function on half-open intervals that is monotonically decreasing in $\tilde{\alpha}$. An illustration is given in Figure 2(a), along with the graphs and target sets that achieve the worst-case risks. For sufficiently high gains $\tilde{\alpha}$, the system is safeguarded from any broad adversarial attack, i.e. the worst-case risk is zero. By inflating the value of the $x$ convention, the operator ensures the adversary is unable to induce any mis-coordinating links or agents to play $y$. The technical results needed for the proof are given in Section 5.

3.2 Focused attacks and worst-case risk metric

An adversary is able to choose a strict subset of agents and force them to commit to prescribed choices. This causes them to act as fixed agents, i.e. agents that do not update their choices over time. One could consider this as allowing the adversary an unlimited number of imposter nodes (instead of one) at its disposal to attach to each agent in the subset, thereby solidifying their choices. This focused influence on a single agent is stronger than the influence a broad attack has on a single agent in the sense that the latter type does not require the agent to commit to a choice - it merely incentivizes the agent towards one particular choice.

Let $F_x$ ($F_y$) be the set of fixed $x$ ($y$) agents. We call $F = (F_x, F_y)$ the fixed set, which satisfies $F_x \cap F_y = \emptyset$ and $F_x \cup F_y \subsetneq N$. We call $\mathcal{F}(G)$ the set of all feasible fixed sets on a graph $G$. A fixed set restricts the action space to $\mathcal{A}_F = \{a \in \mathcal{A} : a_i = x \ \text{for all } i \in F_x, \ a_i = y \ \text{for all } i \in F_y\}$. We assume the adversary selects at least one fixed agent. The strict subset assumption avoids pathological cases (e.g. alternating $x$ and $y$ fixed nodes along an entire line network yields an efficiency of zero).

The set of stochastically stable states given a fixed set $F$ is written as $\text{SSS}(G, F, \tilde{\alpha})$. However, for brevity, we will refer to it as $\text{SSS}$. The induced efficiency is

$\text{eff}(G, F, \tilde{\alpha}) = \dfrac{\min_{a \in \text{SSS}} W(a)}{\max_{a' \in \mathcal{A}_F} W(a')} \qquad (13)$

which is the ratio of the welfare induced by the worst-case stable state to the optimal welfare given the fixed set $F$. The risk faced by the system operator in choosing $\tilde{\alpha}$ is defined as

$R(G, F, \tilde{\alpha}) = 1 - \text{eff}(G, F, \tilde{\alpha}) \qquad (14)$

Again, risk measures the distance from optimal efficiency when choosing $\tilde{\alpha}$. The fixed nodes here differ from the imposter nodes in that they contribute to the true measured welfare (3), in addition to modifying the SSS by restricting the action set and influencing the decisions of their non-fixed neighbors. Figure 1(b) provides an illustrative example of a network with three fixed agents and one unfixed agent. The extent to which the system is susceptible to focused attacks is defined by the following worst-case risk metric.

Definition 2.

The worst-case risk from focused attacks is given by

$R_{\rm fo}(\tilde{\alpha}) = \sup_{G, \, F \in \mathcal{F}(G)} R(G, F, \tilde{\alpha}) \qquad (15)$

The quantity $R_{\rm fo}(\tilde{\alpha})$ is the cost metric that a system operator wishes to reduce given uncertainty about the graph structure and the composition of fixed agents in the network.
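An analogous brute-force sketch for the focused risk (14), reusing edge_pot and itertools from the previous sketch. Here the fixed agents are pinned to their prescribed actions and, unlike imposters, count toward the true welfare; the star instance mirrors Fig. 1(b), with illustrative numbers.

```python
def focused_risk(edges, n, fixed, alpha, op_gain):
    """Risk (14): fixed agents are pinned; they do count toward welfare (3)."""
    profiles = []
    for p in itertools.product("xy", repeat=n):
        a = dict(enumerate(p))
        if all(a[i] == c for i, c in fixed.items()):
            profiles.append(a)           # restricted action space A_F
    def pot(a):
        return sum(edge_pot(a[i], a[j], op_gain) for i, j in edges)
    def wel(a):
        return 2 * sum(edge_pot(a[i], a[j], alpha) for i, j in edges)
    top = max(pot(a) for a in profiles)
    sss = [a for a in profiles if abs(pot(a) - top) < 1e-9]
    return 1.0 - min(wel(a) for a in sss) / max(wel(a) for a in profiles)

# Fig. 1(b)-style star: center 0 unfixed; one leaf fixed to x, two to y.
star = [(0, 1), (0, 2), (0, 3)]
fixed = {1: "x", 2: "y", 3: "y"}
print(focused_risk(star, 4, fixed, alpha=0.5, op_gain=0.5))  # risk 0
print(focused_risk(star, 4, fixed, alpha=0.5, op_gain=5.0))  # positive risk
```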

Theorem 2.

The worst-case risk from focused attacks is

(16)

The technical results needed for the proof are given in Section 5. An illustration of this quantity, as well as the graphs that induce worst-case risk, is given in Figure 2(b). We observe that the choice $\tilde{\alpha} = \alpha$ recovers optimal efficiency for any $G$ and $F$. In other words, by operating at the system gain $\alpha$, the system operator safeguards efficiency from any focused attack. Furthermore, $R_{\rm fo}(\tilde{\alpha})$ monotonically increases for $\tilde{\alpha} > \alpha$, approaching 1 in the limit $\tilde{\alpha} \to \infty$. Intuitively, the risk in this regime comes from inflating the benefit of the $x$ convention, which can be harmful to system efficiency when there are predominantly fixed $y$ nodes in the network. For $\tilde{\alpha} < \alpha$, $R_{\rm fo}(\tilde{\alpha})$ monotonically decreases in $\tilde{\alpha}$. The risk here stems from de-valuing the $x$ convention, which hurts efficiency when coordinating with fixed $x$ nodes is more valuable than coordinating with fixed $y$ nodes.

3.3 Fundamental tradeoffs between risk and security

We describe the operator's tradeoffs between the two worst-case risk metrics. That is, given that a level of security is ensured on one worst-case risk, what is the minimum achievable risk level of the other? These relations are direct consequences of Theorems 1 and 2.

Remark 1.

Before presenting the tradeoff relations, we first observe that since $R_{\rm br}$ is decreasing in $\tilde{\alpha}$ and $R_{\rm fo}$ is decreasing on $[0, \alpha)$, the operator should not select any gain $\tilde{\alpha} < \alpha$, as doing so worsens both risk levels. Hence, for the rest of this paper, we only consider gains no smaller than $\alpha$.

Corollary 1.

Fix $\alpha > 0$. Suppose $R_{\rm fo}(\tilde{\alpha}) \le r$ for some $r \ge 0$. Then

(17)
Proof.

From (16), $R_{\rm fo}(\tilde{\alpha}) \le r$ implies an upper bound on the gain $\tilde{\alpha}$. Since $R_{\rm br}$ is a decreasing function of $\tilde{\alpha}$, we obtain the result. ∎

In words, as the security from worst-case focused attacks improves ($r$ lowered), the risk from worst-case broad attacks increases. A tradeoff relation also holds in the opposite direction.

Corollary 2.

Fix $\alpha > 0$. Suppose $R_{\rm br}(\tilde{\alpha}) \le r$ for some $r \ge 0$ lying within the range of attainable broad risk levels. Then

(18)

If $r$ is below the smallest positive worst-case broad risk level,

(19)

If $r$ is at or above the largest worst-case broad risk level, then $R_{\rm br}(\tilde{\alpha}) \le r$ holds for any $\tilde{\alpha} \ge \alpha$.

Proof.

All bounds are computed by finding the smallest gain $\tilde{\alpha}$ such that $R_{\rm br}(\tilde{\alpha}) \le r$. The relations (18) and (19) follow from the fact that $R_{\rm fo}$ is increasing in $\tilde{\alpha}$ on $[\alpha, \infty)$, and depend on whether $R_{\rm fo}$ can attain the resulting value. ∎

Here, as the security from worst-case broad attacks improves ($r$ lowered), the risk from worst-case focused attacks increases. Each of the broad risk levels can be attained for a range of focused risks. An illustration of the attainable worst-case risk levels is given in Fig. 3 (blue).

4 Randomized operator strategies

In this section, we consider the scenario where the operator randomizes over multiple gains. We present a definition and a characterization of worst-case expected risks. We then identify the risk-security tradeoffs available in the randomized gain setting. We observe they significantly improve upon the deterministic gain setting (Fig. 3). We then identify ways to further improve these tradeoffs through different randomizations.

4.1 Worst-case expected risks

Suppose the operator selects a gain from the $m$ distinct values satisfying $\tilde{\alpha}_1 < \tilde{\alpha}_2 < \cdots < \tilde{\alpha}_m$, with the probability distribution $p \in \Delta_m$. Here we denote $\Delta_m$ as the set of all $m$-dimensional probability vectors. In other words, the operator employs the payoff gain $\tilde{\alpha}_k$ with probability $p_k$.

We consider the following natural definitions of expected risks. Given a graph $G$ and target set $T$, let $\mathbb{E}_p[R(G, T, \tilde{\alpha})] = \sum_{k=1}^{m} p_k \, R(G, T, \tilde{\alpha}_k)$ be the expected adversarial risk of the operator's strategy $p$. The worst-case expected risk from broad attacks is defined as

$\bar{R}_{\rm br}(p) = \sup_{G, \, T \in \mathcal{T}(G)} \mathbb{E}_p[R(G, T, \tilde{\alpha})] \qquad (20)$

Similarly, given a fixed set $F$, let $\mathbb{E}_p[R(G, F, \tilde{\alpha})]$ be the expected risk from focused attacks. The worst-case expected risk from focused attacks is defined as

$\bar{R}_{\rm fo}(p) = \sup_{G, \, F \in \mathcal{F}(G)} \mathbb{E}_p[R(G, F, \tilde{\alpha})] \qquad (21)$
Theorem 3.

Suppose the operator randomizes over the gains $\tilde{\alpha}_1 < \cdots < \tilde{\alpha}_m$ according to $p \in \Delta_m$. Then the worst-case expected broad risk is

(22)

The worst-case expected focused risk is

(23)

The proofs are given in Section 6. The characterization of worst-case expected risk is a discounted weighting of a deterministic worst-case risk level. This suggests that the risk levels achievable by randomization can improve upon the risks induced from a deterministic gain.
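Theorem 3's expected-risk quantities are straightforward to evaluate numerically over any finite family of attack instances. Below is a small sketch reusing broad_risk from Section 3; restricting the maximization to a finite, hand-picked family of instances, rather than all graphs and target sets, is our simplification.

```python
def expected_broad_risk(p, gains, instances, alpha):
    """Worst-case expected broad risk (20) over a finite instance family:
    the max over (edges, n, targets) of the p-weighted per-gain risks (9)."""
    return max(sum(pk * broad_risk(edges, n, T, alpha, g)
                   for pk, g in zip(p, gains))
               for edges, n, T in instances)

# Mixing a low and a high gain on the Fig. 1(a)-style instance above.
family = [([(0, 1), (1, 2)], 3, {0: "x", 1: "y", 2: "y"})]
print(expected_broad_risk([0.5, 0.5], [0.5, 3.0], family, alpha=0.5))
```

On this instance, the even mixture halves the worst-case expected risk relative to deterministically playing the low gain alone.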

4.2 Risk tradeoffs under randomized operator strategies

Given that a level of security is ensured on one expected worst-case metric, what is the minimum achievable risk level on the other? We find this can be calculated through a linear program. We formalize these tradeoffs in the following two statements, which are analogous to Corollaries 1 and 2.

Corollary 3.

Fix $\alpha > 0$ and a set of gains $\tilde{\alpha}_1 < \cdots < \tilde{\alpha}_m$. Suppose $\bar{R}_{\rm fo}(p) \le r$ for some $r \ge 0$. Then

(24)

where the bound is the value of the following linear program.

(25)

where $\preceq$ denotes elementwise inequality, $\mathbf{0}$ and $\mathbf{1}$ are column vectors of zeros and ones respectively, and the constraint matrix is

(26)

Moreover, the value of (25) is decreasing in $r$.

Proof.

We need to show equivalence between the linear program (25) and the optimization problem

(27)

Let $A$ be the matrix defined by the upper left block of (26) and $B$ the bottom left block. From Theorem 3, we can express $\bar{R}_{\rm fo}(p)$ as the maximum element of the vector $Ap$, and similarly $\bar{R}_{\rm br}(p)$ as the maximum element of $Bp$. Hence, $\bar{R}_{\rm fo}(p) \le r$ is the linear constraint $Ap \preceq r\mathbf{1}$. The min-max objective itself can be cast as a linear objective with linear constraints, i.e. minimizing an auxiliary scalar $t$ subject to $Bp \preceq t\mathbf{1}$. Combining these two, we obtain (25). The claim that the value is decreasing in $r$ follows as a consequence of the linear program (25). ∎

We note that the smallest worst-case expected focused risk levels are not attainable because $\tilde{\alpha}_1$ is the smallest gain the operator mixes with. Hence, the linear program (25) is infeasible for sufficiently small $r$. The following tradeoff relation holds in the opposite direction.
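For concreteness, the epigraph form of the linear program (25) can be solved with an off-the-shelf LP solver. In the sketch below, the matrices A and B are illustrative stand-ins for the blocks of (26): row k of A (resp. B) lists each gain's contribution to the k-th worst-case expected focused (resp. broad) risk constraint. The solver call is standard scipy; everything else is our own scaffolding.

```python
import numpy as np
from scipy.optimize import linprog

def min_broad_given_focused(A, B, r):
    """Epigraph LP of (25): minimize t s.t. B p <= t 1, A p <= r 1,
    and p a probability vector. A, B are hypothetical stand-ins for (26)."""
    m = B.shape[1]
    # Decision variables: (p_1, ..., p_m, t).
    c = np.zeros(m + 1)
    c[-1] = 1.0                                          # minimize t
    A_ub = np.block([[B, -np.ones((B.shape[0], 1))],     # B p - t 1 <= 0
                     [A,  np.zeros((A.shape[0], 1))]])   # A p <= r 1
    b_ub = np.concatenate([np.zeros(B.shape[0]), r * np.ones(A.shape[0])])
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                                    # sum_k p_k = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.fun if res.success else None              # None if infeasible

# Toy 2-gain example: gain 1 is safe against focused but risky for broad.
A = np.array([[0.0, 0.6]])
B = np.array([[0.8, 0.1]])
print(min_broad_given_focused(A, B, r=0.3))              # mixes the two gains
```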

Figure 3: Security-risk tradeoffs are depicted by the achievable worst-case risk levels from deterministic gains (blue) and randomized gains (red, green, black). The Pareto frontiers for three different randomized strategies are shown in increasing order of improvement. The first two strategies randomize over the gains attaining the highest three broad risk levels in addition to the lowest two. The third strategy randomizes over the gains attaining the highest 298 broad risk levels and the lowest two. The second strategy's gains are chosen so that, by Claim 1, its frontier Par(·) improves upon the first's. The third strategy's gain set contains the elements of the second's, so Claim 2 ensures its frontier improves upon the second's.
Corollary 4.

Fix $\alpha > 0$ and a set of gains $\tilde{\alpha}_1 < \cdots < \tilde{\alpha}_m$. Suppose $\bar{R}_{\rm br}(p) \le r$ for some $r \ge 0$. Then

(28)

where the bound is the value of the following linear program.

(29)

where the constraint blocks $B$ and $A$ are defined as the bottom and top left blocks of (26), respectively. Furthermore, the value of (29) is decreasing in $r$.

We omit the proof as it is similar to that of Corollary 3. Note that the smallest worst-case expected broad risk levels are not attainable, since $\tilde{\alpha}_m$ is the highest gain the operator mixes with - (29) is infeasible for sufficiently small $r$. Fig. 3 plots the best achievable risk levels of three randomized operator strategies (red, green, and black).

4.3 Improvement of risk tradeoffs

The tradeoff relations describe the best achievable level on one risk metric given the other is subject to a security constraint when the gains are fixed. One way to improve the achievable risks is to decrease the available gains.

Claim 1.

Suppose the operator replaces each gain in $\tilde{\alpha}_1 < \cdots < \tilde{\alpha}_m$ with a smaller gain that attains the same worst-case broad risk level (recall (12)). Then for every security level imposed on the worst-case expected focused risk, the minimum achievable worst-case expected broad risk does not increase. Similarly, for every level imposed on the broad risk, the minimum achievable focused risk does not increase.

Randomizing over additional gains can also improve the achievable risks.

Claim 2.

Suppose the operator augments the gain set with additional gains, so that the new set contains the elements of the original. Then the assertion of Claim 1 holds.

The proofs of the above two Claims follow directly from the formulations of the LPs (25) and (29), and hence we omit them.

Fig. 3 depicts the best achievable risk levels of three randomized operator strategies of increasing improvement due to Claims 1 and 2 (red, green, and black curves). In particular, these plots constitute the Pareto frontier of all attainable expected risks among distributions given a fixed set of gains. That is, for a given set of gains, we say a pair of worst-case expected risk levels belongs to the frontier Par(·) if there does not exist a distribution $p \in \Delta_m$ that improves one worst-case expected risk without worsening the other. Within the frontier, the operator can only improve upon one worst-case risk metric by sacrificing performance on the other.

From Corollary 4, the frontier given a fixed set of gains is the set of points

(30)

The parameter $r$ is upper bounded here by the largest attainable worst-case expected broad risk, since any level beyond it is unattainable under the given gains; values of $r$ at or above this bound yield equivalent frontier points. The frontiers in Fig. 3 are generated by numerically solving the linear program (29) for a finite grid of values of $r$.

As we have seen, the transition from deterministic to randomized gains ensures a reduction of risk levels. Randomizing over only a few different gains substantially improves upon the attainable deterministic worst-case risks. However, a detailed quantification of such improvements remains a challenge due to the high dimensionality of the model. In particular, we have not yet identified a "limit" frontier that could be obtained by repeated modifications of the gain vector as detailed in Claims 1 and 2.

5 Proof of Theorems 1 and 2: Deterministic worst-case risks

In this section, we develop the technical results that characterize the worst-case risk metrics $R_{\rm br}(\tilde{\alpha})$ and $R_{\rm fo}(\tilde{\alpha})$ (Theorems 1 and 2). Before presenting the proofs, we first give some preliminaries on potential games [25], which are essential to calculating stochastically stable states. We then define relevant notation for the forthcoming analysis.

5.1 Potential games

Graphical coordination games fall under the class of potential games - games where individual utilities are aligned with a global objective, or potential function. A game is a potential game if there exists a potential function $\phi : \mathcal{A} \to \mathbb{R}$ which satisfies

$\phi(a_i, a_{-i}) - \phi(a_i', a_{-i}) = U_i(a_i, a_{-i}) - U_i(a_i', a_{-i}) \qquad (31)$

for all $a_i, a_i' \in \mathcal{A}_i$, $a_{-i}$, and $i \in N$ [25]. In potential games, the set of stochastically stable states (6) are precisely the action profiles that maximize the potential function [15, 16]. Specifically, $\text{SSS} = \arg\max_{a \in \mathcal{A}} \phi(a)$. Our analysis relies on characterizing a potential function for the graphical coordination game in the presence of adversarial influences. This allows us to compute stochastically stable states in a straightforward manner.
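Both facts are easy to exercise on a toy game: the sketch below checks the exact-potential condition (31) by enumerating all unilateral deviations, then returns the potential maximizers, which coincide with the SSS (6). The two-node coordination example and all names are our own illustration.

```python
import itertools

def stochastically_stable(n, utility, potential):
    """Verify the exact-potential condition (31) by enumeration, then
    return the potential maximizers, which coincide with the SSS (6)."""
    profiles = [tuple(p) for p in itertools.product("xy", repeat=n)]
    for a in profiles:                   # check (31) on every deviation
        for i in range(n):
            for c in "xy":
                b = a[:i] + (c,) + a[i + 1:]
                du = utility(i, b) - utility(i, a)
                dphi = potential(b) - potential(a)
                assert abs(du - dphi) < 1e-9, "not an exact potential"
    top = max(potential(a) for a in profiles)
    return [a for a in profiles if abs(potential(a) - top) < 1e-9]

# Two-node coordination game (1) with gain alpha = 0.5.
alpha = 0.5
pay = lambda s, t: (1 + alpha if s == "x" else 1.0) if s == t else 0.0
U = lambda i, a: pay(a[i], a[1 - i])       # benefit (2) on a single edge
phi = lambda a: pay(a[0], a[1])            # per-edge potential
print(stochastically_stable(2, U, phi))    # [('x', 'x')]
```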

5.2 Relevant notations for analysis

Any action profile $a$ on a graph $G$ decomposes into $x$- and $y$-partitions. A node that belongs to an $x$-partition ($y$-partition) has $a_i = x$ ($a_i = y$). The partitions are enumerated $X_1, \dots, X_p$ and $Y_1, \dots, Y_q$, are mutually disjoint, and cover the graph. Each partition is a connected subgraph of $G$. It is possible that there is a single partition $X_1$ with $X_1 = N$ (when $a = \vec{x}$), a single partition $Y_1$ with $Y_1 = N$ (when $a = \vec{y}$), or any mixture of the two types.

For any subsets of nodes $S, S' \subseteq N$, let us denote

$E(S, S') = \{\{i, j\} \in E : i \in S, \, j \in S'\} \qquad (32)$

as the set of edges between $S$ and $S'$. We write $S^c$ as the complement of $S$. We extensively use the notation

$W_{E'}(a) = \sum_{\{i, j\} \in E'} \big( u(a_i, a_j) + u(a_j, a_i) \big) \qquad (33)$

as the welfare due to an edge set $E' \subseteq E$ in action profile $a$, where $u$ is of the form (1) with $\alpha$ replaced by $\tilde{\alpha}$ where appropriate. For compactness, we will write $W_{E'}$ for the local system welfare generated by the edges $E'$. Our analysis will also rely on the following mediant inequality.

Fact 1.

Suppose $a_k \ge 0$ and $b_k > 0$ for each $k = 1, \dots, K$. Then

$\dfrac{\sum_{k=1}^{K} a_k}{\sum_{k=1}^{K} b_k} \ \ge \ \min_{k} \dfrac{a_k}{b_k} \qquad (34)$

We refer to the LHS above as the mediant sum of the ratios $a_k / b_k$.
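For completeness, a one-line justification of Fact 1 as stated above: with $c = \min_k a_k / b_k$, each term satisfies $a_k \ge c \, b_k$, so summing over $k$ and dividing by the positive denominator gives

```latex
\[
  \frac{\sum_{k=1}^{K} a_k}{\sum_{k=1}^{K} b_k}
  \;\ge\; \frac{c \sum_{k=1}^{K} b_k}{\sum_{k=1}^{K} b_k}
  \;=\; c \;=\; \min_{k} \frac{a_k}{b_k}.
\]
```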

5.3 Characterization of : worst-case broad risk

To prove Theorem 1, we seek a pair $(G, T)$, with $G$ of any size and $T \in \mathcal{T}(G)$, that minimizes efficiency (maximizes risk). Our method to find the minimizer is to show that any $(G, T)$ can be transformed into a star network with a particular target set that has lower efficiency, when the gain $\tilde{\alpha}$ lies in the regime where the worst-case risk is positive. Thus, in this regime the search for the worst-case graph reduces to the class of star networks of arbitrary size. For gains outside this regime, structural properties allow us to deduce the minimal efficiency.

The graphical coordination game defined by the perceived gain $\tilde{\alpha}$, perceived utilities (7), target set $T$, and graph $G$ falls under the class of potential games [25]. A potential function is given by

(35)

where

(36)

Hence, the stochastically stable states are maximizers of (35). Suppose $a^*$ is the welfare-minimizing SSS, inducing the partitions $X_1, \dots, X_p$ and $Y_1, \dots, Y_q$. We can express its efficiency from (8) as

(37)

Note the denominator is simply the number of edges in $G$ multiplied by $2(1+\alpha)$. From (35), each $y$-partition in $a^*$ satisfies¹

¹ Since we are seeking worst-case pairs $(G, T)$, we may consider any $y$-partition as only having $y$ imposters placed among its nodes. This is because any $x$ imposters that were placed in a resulting $y$-partition can be replaced by $y$-imposters and retain stability. We reflect this generalization in (5.3) and (5.3), where influence from only $y$ ($x$) imposters is considered.