Distributed Possibilistic Learning in Multi-Agent Systems

01/20/2020
by Jonathan Lawry, et al.

Possibility theory is proposed as an uncertainty representation framework for distributed learning in multi-agent systems and robot swarms. In particular, we investigate its application to the best-of-n problem, where the aim is for a population of agents to identify the highest quality out of n options through local interactions between individuals and limited direct feedback from the environment. In this context we claim that possibility theory provides efficient mechanisms by which agents can learn about the state of the world, and by which they can handle inconsistencies between their own beliefs and those of others by varying the level of imprecision of their beliefs. We introduce a discrete time model of a population of agents applying possibility theory to the best-of-n problem. Simulation experiments are then used to investigate the accuracy of possibility theory in this context as well as its robustness to noise under varying amounts of direct evidence. Finally, we compare possibility theory in this context with a similar probabilistic approach.

1 Introduction

There is growing evidence that providing agents with a way of explicitly representing uncertainty in their beliefs can help to facilitate robust distributed learning for multi-agent systems operating in noisy environments. For example, [3] shows that robot swarms which use a third truth-value to represent “unknown” when solving best-of-n problems [19] are more robust to noise and malfunction than those using binary belief models. In a similar context, [12] investigates the use of epistemic sets to represent an agent's belief, corresponding to the set of states of the world which they deem to be possible. Extensive multi-agent simulations were then employed to test the scalability of such an approach when applied to large state spaces and varying numbers of agents. Here we extend this idea by introducing possibility theory [20, 6] to capture graded possibilities, whereby instead of simply classifying states as being either possible or impossible, an agent can allocate a degree of possibility between 0 and 1 to any given state. In particular, we explore the application of possibility theory to distributed multi-agent learning, and more specifically to the best-of-n problem, in which a population of agents aims to identify the best out of n possible choices (states) on the basis of feedback from the environment and local interactions between individuals.

The best-of-n problem is a class of distributed learning and decision making problems with clear relevance to a range of applications including search and rescue [15], pollution treatment swarms and distributed task allocation [9]. Common approaches to best-of-n in swarm robotics are often biologically inspired, for example by the behaviour of honeybees and Temnothorax ants when searching for nest sites [2, 18, 16]. However, it can also be tackled by using a combination of belief fusion between agents together with evidential updating by individuals based on, for example, sensor readings. The epistemological importance of this approach to distributed learning has been highlighted by [4] and has been studied using probabilistic models of belief in [13]. In the sequel we employ this combined approach in a possibility theory setting. Furthermore, in this context we will argue that the combination of fusion and evidential updating is more effective and robust to noise than evidential updating alone.

Possibility theory [5] is a computationally efficient framework for reasoning about uncertainty which allows for a distinction to be made between uncertainty and ignorance. At a representational level there are clear analogies with probability theory. For instance, subjective belief is represented by a possibility distribution which allocates a degree of possibility between 0 and 1 to each state of the world. The calculus for possibilities is maxitive, in contrast to that for probabilities, which is additive. In particular, the degree of possibility of a proposition, as represented by a set of states, is given by the maximum of the possibility values of those states. Possibility theory can also be understood as a special case of Dempster-Shafer theory [5] and consequently also of imprecise probability theory. From the latter perspective, possibility theory provides both an upper bound (possibility) and a lower bound (necessity) on the probability of any proposition, with the difference between these providing a measure of ignorance. This capacity to quantify ignorance then provides additional flexibility when fusing highly inconsistent beliefs, whereby any such combination will lead to an increase in overall ignorance. In the following we will exploit these properties for distributed learning to help facilitate consensus across a population of agents.

An outline of the remainder of the paper is as follows. Section 2 gives an overview of the basic concepts from possibility theory, including a proposed approach to fusing possibility distributions. In Section 3 we describe a discrete time agent-based model for possibility theory applied to the best-of-n problem. Simulation experiment results are then described for this model at different levels of noise and for different evidence rates. In this context the possibilistic approach is compared with a probabilistic approach similar to that outlined in [13]. Finally, Section 4 gives some conclusions.

2 Possibility Theory

In this section we give a brief introduction to possibility theory and propose a parameterised family of fusion operators as an effective method for combining partially inconsistent beliefs. The underpinning concept is that of a possibility distribution, which allocates a degree of possibility to each of a set of possible states of the world, denoted by $S$, as follows:

Definition 1

Possibility Distribution
A possibility distribution on $S$ is a function $\pi : S \to [0, 1]$ such that $\max_{s \in S} \pi(s) = 1$. Let $\mathcal{P}$ denote the set of all possibility distributions on $S$.

A possibility distribution characterises a possibility and necessity measure defined over propositions as represented by sets of states. Intuitively, in a context in which knowledge is graded, the possibility measure of a proposition quantifies the degree to which it is consistent with an agent’s knowledge, while the necessity quantifies the degree to which it is entailed by that knowledge.

Definition 2

Possibility and Necessity Measures
A possibility distribution $\pi$ on $S$ generates a possibility measure $\Pi$ and a necessity measure $N$ on $2^S$ such that, for $A \subseteq S$,

$\Pi(A) = \max_{s \in A} \pi(s)$ and $N(A) = 1 - \Pi(A^c)$.

For notational simplicity we use $\Pi(s)$ as shorthand for $\Pi(\{s\})$, and similarly for $N$.

The following properties of possibility and necessity measures are immediate consequences of Definition 2 [5]: For $A, B \subseteq S$,

  • $\Pi(A \cup B) = \max(\Pi(A), \Pi(B))$ and $N(A \cap B) = \min(N(A), N(B))$,

  • $N(A) = 1 - \Pi(A^c)$ and $\Pi(A) = 1 - N(A^c)$,

  • $\max(\Pi(A), \Pi(A^c)) = 1$ and $\min(N(A), N(A^c)) = 0$.

Possibility and necessity measures can also be interpreted as upper and lower probabilities respectively, in the sense that there is a set of probability measures $P$ on $2^S$ satisfying $N(A) \le P(A) \le \Pi(A)$ for all $A \subseteq S$, and furthermore that $N$ and $\Pi$ are respectively the infimum and supremum of this set. The level of ignorance about a proposition $A$ is then given by $\Pi(A) - N(A)$.
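To make Definitions 1 and 2 concrete, the following is a minimal sketch in which states are represented as list indices and a possibility distribution as a list of values in $[0,1]$ with maximum 1:

```python
def possibility(pi, A):
    """Pi(A): the maximum possibility over the states in proposition A."""
    return max(pi[s] for s in A)

def necessity(pi, A):
    """N(A) = 1 - Pi(complement of A); the full set has necessity 1."""
    complement = [s for s in range(len(pi)) if s not in A]
    return 1.0 if not complement else 1.0 - possibility(pi, complement)

def ignorance(pi, A):
    """Level of ignorance about A: Pi(A) - N(A)."""
    return possibility(pi, A) - necessity(pi, A)

pi = [1.0, 0.5, 0.2]   # a valid possibility distribution: max value is 1
A = {0}
print(possibility(pi, A))   # 1.0
print(necessity(pi, A))     # 1 - max(0.5, 0.2) = 0.5
print(ignorance(pi, A))     # 0.5
```

Note that $N(A) \le \Pi(A)$ always holds, so the ignorance is non-negative.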

We now propose a family of fusion operators for combining possibility distributions which in the current context represent the beliefs of different agents in the population. This will require the notion of an intersection function, or t-norm, as given below. The intuition here is that if we consider the possibility distributions of two agents as defining two fuzzy sets of possible states [20] then one approach to fusing them is to take the intersection of these two sets [7].

Definition 3

Intersection Function (t-norm)
An intersection function (t-norm) [10] is a function $T : [0, 1]^2 \to [0, 1]$ which satisfies the following properties:

  • For $x, y \in [0, 1]$, $T(x, y) = T(y, x)$ (commutativity).

  • For $x, y, z \in [0, 1]$, if $y \le z$ then $T(x, y) \le T(x, z)$ (monotonicity).

  • For $x, y, z \in [0, 1]$, $T(x, T(y, z)) = T(T(x, y), z)$ (associativity).

  • For $x \in [0, 1]$, $T(x, 1) = x$ (identity).

Definition 4

Possibility Fusion Function
A possibility fusion function is a function $F : \mathcal{P} \times \mathcal{P} \to \mathcal{P}$. In particular, we propose the following family of t-norm based pooling functions: for $\pi_1, \pi_2 \in \mathcal{P}$ and $s \in S$,

$\pi_1 \odot \pi_2(s) = \dfrac{T(\pi_1(s), \pi_2(s))}{\max_{s' \in S} T(\pi_1(s'), \pi_2(s'))}$.

Definition 4 takes the fusion of two possibility distributions to be their intersection, normalised so that the maximum possibility of the fused possibility distribution is 1, as required by Definition 1. The effect of the normalisation is to increase the level of ignorance if the two possibility distributions are highly inconsistent. We can argue that $\max_{s \in S} T(\pi_1(s), \pi_2(s))$ provides a good measure of the consistency of two possibility distributions $\pi_1$ and $\pi_2$. Intuitively, $\pi_1$ and $\pi_2$ might be considered highly inconsistent if, for all states, $\pi_1$ allocates a high possibility value whenever $\pi_2$ allocates a low possibility value and vice versa. In this case $T(\pi_1(s), \pi_2(s))$ will be low for all states, since it follows from Definition 3 that $T(x, y) \le \min(x, y)$ [10]. Consequently, the normalising term in Definition 4 will be high, as will then be the possibility values of all states. Hence, according to Definition 2, in this case the possibility measure of all propositions, i.e. sets of states, will be close to 1 and their necessity measures will be close to 0. We now introduce Frank's family of t-norms, which generate a parameterised family of fusion operators when used in Definition 4.
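As a minimal sketch of Definition 4 (states as list indices, and the min t-norm standing in for a generic $T$; the fallback for totally disjoint beliefs is our own convention):

```python
def fuse(pi1, pi2, tnorm=min):
    """Definition 4 sketch: pointwise t-norm, rescaled so the maximum is 1.
    The t-norm defaults to min here purely for illustration."""
    raw = [tnorm(a, b) for a, b in zip(pi1, pi2)]
    m = max(raw)
    if m == 0.0:
        return [1.0] * len(pi1)   # totally inconsistent: fall back to complete ignorance
    return [v / m for v in raw]

# Consistent beliefs fuse to a sharper shared belief:
print(fuse([1.0, 0.8, 0.2], [1.0, 0.3, 0.9]))   # [1.0, 0.3, 0.2]
# Highly inconsistent beliefs: the normalising term pushes every state
# towards possibility 1, i.e. the fused belief is close to complete ignorance.
print(fuse([1.0, 0.1, 0.1], [0.1, 0.1, 1.0]))   # [1.0, 1.0, 1.0]
```

The second example illustrates the inconsistency-handling behaviour described above: disagreement is converted into increased imprecision rather than forced consensus.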

Definition 5

Frank’s Family of t-norms [8]
Frank's t-norms are a parameterised family of intersection functions of the form: for $q \in (0, \infty)$, $q \neq 1$,

$T_q(x, y) = \log_q\left(1 + \dfrac{(q^x - 1)(q^y - 1)}{q - 1}\right)$.

Note that the limiting cases of the family recover the minimum, product and Łukasiewicz t-norms.

Furthermore, Frank's t-norms are increasing with $q$, in the sense that $q_1 \le q_2$ implies $T_{q_1}(x, y) \le T_{q_2}(x, y)$.
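The following sketch implements Frank's t-norm in its commonly cited form; how this parameterisation is oriented relative to the monotonicity convention used in the text is not fixed here, so the checks below are limited to direction-free properties:

```python
import math

def frank_tnorm(x, y, q):
    """Frank's t-norm in its standard form (defined for q > 0, q != 1)."""
    return math.log(1.0 + (q ** x - 1.0) * (q ** y - 1.0) / (q - 1.0), q)

# 1 acts as identity and the operator is commutative, as Definition 3 requires:
print(abs(frank_tnorm(0.6, 1.0, 2.0) - 0.6) < 1e-9)              # True
print(frank_tnorm(0.6, 0.7, 2.0) == frank_tnorm(0.7, 0.6, 2.0))  # True
# Near q = 1 the family approaches the product t-norm:
print(abs(frank_tnorm(0.6, 0.7, 1.0001) - 0.42) < 1e-3)          # True
```

For every parameter value the result lies between the Łukasiewicz t-norm $\max(0, x + y - 1)$ and $\min(x, y)$, the two extreme members of the family.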

In the case that an agent with possibility distribution $\pi$ needs to make a choice, for example by selecting a particular state to investigate further, one approach is to sample from the pignistic distribution for $\pi$ [17]; the latter is a member of the set of probability distributions bounded above and below by $\Pi$ and $N$ respectively. Furthermore, it is arguably the least biased such distribution, since it allocates probability proportionately to the difference between consecutive possibility values [17].

Definition 6

Pignistic Distribution
Let $\pi$ be a possibility distribution on $S = \{s_1, \ldots, s_n\}$ and let the states be sorted so that $\pi(s_1) \ge \pi(s_2) \ge \cdots \ge \pi(s_n)$, with $\pi(s_{n+1}) = 0$ by convention. Then the pignistic distribution for $\pi$ is the probability distribution $P$ on $S$ given by

$P(s_i) = \sum_{j=i}^{n} \dfrac{\pi(s_j) - \pi(s_{j+1})}{j}$.

Note that the pignistic distribution is order preserving in the sense that $\pi(s) \ge \pi(s')$ implies $P(s) \ge P(s')$.
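A minimal sketch of the pignistic transform of Definition 6, assuming states are list indices:

```python
def pignistic(pi):
    """Definition 6 sketch: pignistic transform of a possibility distribution,
    assuming pi is a list of values in [0, 1] with max(pi) == 1."""
    n = len(pi)
    order = sorted(range(n), key=lambda s: -pi[s])   # states by descending possibility
    spi = [pi[s] for s in order] + [0.0]             # append pi(s_{n+1}) = 0
    p = [0.0] * n
    for i in range(n):
        # P(s_i) = sum over j >= i of (pi(s_j) - pi(s_{j+1})) / j  (1-based j)
        p[order[i]] = sum((spi[j] - spi[j + 1]) / (j + 1) for j in range(i, n))
    return p

print(pignistic([1.0, 1.0, 1.0]))   # complete ignorance maps to the uniform distribution
print(pignistic([1.0, 0.4, 0.4]))   # most probability mass on the fully possible state
```

Complete ignorance mapping to the uniform distribution, and the order-preservation property noted above, are easy sanity checks on the transform.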

Example 7

Let $S = \{s_1, s_2, s_3\}$ and consider a possibility distribution $\pi$ on $S$. This generates possibility and necessity measures as given in Definition 2, and for each state $s_i$ the associated degree of ignorance is $\Pi(s_i) - N(s_i)$. The pignistic distribution for $\pi$ is then generated according to Definition 6.

Now consider a second possibility distribution $\pi'$ on $S$. We can fuse $\pi$ and $\pi'$ by applying the operator in Definition 4, taking $T$ to be Frank's t-norm as in Definition 5 for a given value of the parameter $q$: the pointwise t-norm values are computed and then rescaled by the normalising term so that the fused possibility distribution $\pi \odot \pi'$ has maximum value 1.

Figure 1 shows the fused possibility distribution values for varying values of $q$. Since $T_q$ is increasing with $q$, the normalisation term decreases with increasing $q$. In other words, the amount of inconsistency generated by the fusion operator decreases as $q$ increases.

Figure 1: The fused possibility distribution values plotted against Frank's parameter $q$.

3 Agent-Based Experiments

We adopt a discrete time agent-based model in order to study the macro-level convergence properties of a population of agents, each attempting to individually solve a best-of-n problem formulated using possibility theory. Let $A$ denote the set of agents in the population, with $|A| = k$. Each state $s_i$ has a quality value $v_i \in [0, 1]$ which can be sampled with noise from the environment. For example, in a distributed search and rescue scenario, $s_i$ might correspond to a particular location and $v_i$ to the degree of support for there being casualties at that location. We assume without loss of generality that $v_1 \le v_2 \le \cdots \le v_n$, so that $s_n$ is the best state. More specifically, in the experiments described below we take $v_i = i/n$ for $i = 1, \ldots, n$, so that quality values are distributed uniformly across the interval $[0, 1]$.

Figure 2: Average $\Pi(s_n)$ (black line) and $N(s_n)$ (red line) against time: (a) with both fusion and evidential updating; (b) with evidential updating only.

Agents' beliefs about which is the best or highest quality state are represented by possibility distributions on $S$, and at time $t = 0$ all agents in $A$ are initialised as being completely ignorant, so that $\pi(s) = 1$ for all $s \in S$. At each time step, two agents are selected at random to combine their beliefs by applying the fusion operator given in Definition 4. The aim here is to model systems with limited communications but mostly unconstrained mixing of agents. The type of swarm robotics or decentralised AI applications on which we are focusing will involve individuals moving independently through an environment and encountering a variety of different agents at different times. This can be modelled by a system where there is free mixing between agents, i.e. a totally connected interaction graph of agents, but where there are only relatively few interactions at any given time. The latter is referred to as a ‘well-stirred’ system in [14], corresponding to the assumption that each agent is equally likely to interact with any other agent in the population and that such interactions are independent events. Also, within a time step each agent selects a state to investigate by sampling at random from the pignistic distribution of their current possibility distribution (Definition 6). There is then a probability $r$, referred to as the evidence rate, that the agent will succeed in sampling the quality value for their chosen state.

The evidence rate $r$ provides a simple quantification of the difficulty of obtaining direct evidence from the environment, or of traversing the environment in order to reach the chosen state. If evidence is received then it is subject to normally distributed noise, so that the agent receives quality $v_i + \epsilon$ for state $s_i$, where $\epsilon$ is a normally distributed random variable with mean $0$ and standard deviation $\sigma$. (Since quality values are assumed to lie in the range $[0, 1]$, we bound noise values so that the sampled quality is taken to be $0$ if $v_i + \epsilon < 0$ and $1$ if $v_i + \epsilon > 1$.) The evidence received is represented by a possibility distribution $\pi_E$ determined by the sampled quality. The agent then updates their belief by fusing their current possibility distribution $\pi$ with the evidence to obtain $\pi \odot \pi_E$.

In summary, then, we have outlined a discrete time model of a population of agents deployed on the best-of-n problem, comprising two decentralised processes both implemented locally by individual agents: evidential updating and belief fusion. The defining parameters of this model are $k$ (the number of agents), $n$ (the number of states), $r$ (the evidence rate), $\sigma$ (the noise) and $q$ (Frank's t-norm parameter). In the following subsection we present simulation results for this model for a variety of different parameter values, to gain insight into the accuracy and robustness of the possibility theory based approach.
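To make the model concrete, the following is a minimal sketch of one possible reading of the update cycle. It uses $T = \min$ for fusion, a hypothetical possibility encoding of the evidence (the paper's exact evidence distribution is not reproduced above), and illustrative values for $k$, $n$, $r$ and $\sigma$; none of these specifics are fixed by the text:

```python
import random

def fuse(pi1, pi2):
    """Normalised fusion with T = min (Definition 4)."""
    raw = [min(a, b) for a, b in zip(pi1, pi2)]
    m = max(raw)
    return [1.0] * len(pi1) if m == 0.0 else [v / m for v in raw]

def pignistic(pi):
    """Pignistic transform (Definition 6)."""
    n = len(pi)
    order = sorted(range(n), key=lambda s: -pi[s])
    spi = [pi[s] for s in order] + [0.0]
    p = [0.0] * n
    for i in range(n):
        p[order[i]] = sum((spi[j] - spi[j + 1]) / (j + 1) for j in range(i, n))
    return p

def evidence(state, quality, n):
    """Hypothetical evidence encoding: the sampled state stays fully possible
    while every other state is discounted in proportion to the sampled quality."""
    return [1.0 if s == state else 1.0 - quality for s in range(n)]

def step(beliefs, v, r, sigma):
    """One time step: one random pairwise fusion, then noisy evidential updating."""
    k, n = len(beliefs), len(v)
    i, j = random.sample(range(k), 2)
    beliefs[i] = beliefs[j] = fuse(beliefs[i], beliefs[j])
    for a in range(k):
        s = random.choices(range(n), weights=pignistic(beliefs[a]))[0]
        if random.random() < r:                          # evidence rate r
            qv = min(1.0, max(0.0, v[s] + random.gauss(0.0, sigma)))
            beliefs[a] = fuse(beliefs[a], evidence(s, qv, n))

random.seed(0)
n, k = 5, 20
v = [(i + 1) / n for i in range(n)]                      # uniform qualities; state 4 is best
beliefs = [[1.0] * n for _ in range(k)]                  # complete ignorance at t = 0
for _ in range(500):
    step(beliefs, v, r=0.1, sigma=0.0)
best = max(range(n), key=lambda s: sum(b[s] for b in beliefs))
print(best)   # 4: the population converges on the highest-quality state
```

Even in this toy configuration the qualitative behaviour described above is visible: pairwise fusion propagates evidence far faster than individual sampling alone.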

Figure 3: The average values of $\Pi(s_n)$ (black line) and $N(s_n)$ (red line) at the end of the simulation, plotted against Frank's parameter $q$: (a) without noise; (b) with noise.

3.1 Simulation Results

In the following experiments we simulated the model described above for a fixed population of agents and a fixed number of states while varying the remaining parameters. Each experiment, as characterised by a particular set of parameter values, is run for a fixed number of time steps and the results are then averaged over independent runs. In all cases averages are shown together with 10 and 90 percentiles represented as error bars. Figure 2 shows the average possibility and necessity values of the best state, i.e. $\Pi(s_n)$ and $N(s_n)$, plotted against time. In this case we assume a fixed evidence rate $r$, no noise, and a fixed value of Frank's parameter $q$. Figure 2(a) shows the results when both fusion and updating take place as described above. In this case the population converges to a shared belief characterised by a possibility distribution where $\pi(s_n) = 1$ and $\pi(s) = 0$ for $s \neq s_n$. In other words, agents are collectively converging on the correct answer with total certainty. Figure 2(b) shows the results when only evidential updating takes place, i.e. when there is no fusion of beliefs between agents. Here, convergence is much slower, with the population not having fully converged by the end of the simulation and agents still maintaining imprecision in their beliefs, as indicated by average necessity values being significantly lower than possibility values. This indicates that fusion can play a positive role in propagating evidence through the population. Furthermore, we will subsequently show that evidence-only learning is much less robust to noise. Initially, however, we consider the sensitivity of the combined model to the value of Frank's parameter $q$.

Figure 3 shows the average values of $\Pi(s_n)$ and $N(s_n)$ at the end of each run, plotted against $q$ as employed in Definition 5. Figure 3(a) is for the case with no noise, and Figure 3(b) is for a noisy environment. In both cases we see that performance broadly increases with $q$. Note that since Frank's t-norm increases with $q$ (Definition 5), increasing $q$ decreases the normalisation term in Definition 4. In other words, Figure 3(a) suggests that the population is better at solving the best-of-n problem when employing fusion operators that do not themselves generate significantly higher levels of inconsistency than is inherent to the two possibility distributions being fused. In light of this, for the remaining experiments we adopt a single fixed value of $q$.

Figure 4: Probability that a sampled quality value for $s_n$ will be less than a sampled quality value for $s_{n-1}$, plotted against the noise standard deviation $\sigma$.

Figure 5: The average values of $\Pi(s_n)$ (black line) and $N(s_n)$ (red line) at the end of the simulation for varying levels of noise $\sigma$: (a) with both fusion and evidential updating; (b) with evidential updating only.

We now investigate the robustness of the possibilistic approach to noise, as represented by the standard deviation $\sigma$ of the noise distribution. In order to give an indication of the effect of noise in this context, we can consider the probability that a sampled quality value for $s_n$, the best state, is less than an independently sampled quality value for the second best state $s_{n-1}$. This is shown in Figure 4, plotted against the noise standard deviation $\sigma$; even moderate noise gives a non-negligible probability that the order of these two quality values will be reversed. Figure 5 shows the average values of $\Pi(s_n)$ and $N(s_n)$ at the end of the simulation, plotted against a range of values of $\sigma$. Figure 5(a) shows the results when both fusion and evidential updating are combined, while Figure 5(b) shows the results when there is only updating from direct evidence. These strongly suggest that combining both fusion between agents and evidential updating results in distributed learning that is much more robust to noise than learning based only on direct evidence. We conjecture that this is in part due to the inconsistency handling of the fusion function given in Definition 4. Noise results in variation in quality sampling, which in turn leads to inconsistency between agents. As described in Section 2, applying the fusion function to inconsistent possibility distributions results in an increase in imprecision, i.e. in a fused possibility distribution with high possibility values for a number of different states. The resulting pignistic probability distribution (Definition 6) will then give higher probability to those states, so that their quality values will be more likely to be re-sampled. In summary, we suggest that sampling from a noisy environment leads to more inconsistency between different agents' beliefs, and fusing those beliefs results in more imprecision, which in turn results in more repeated sampling of a greater variety of states.

3.2 Comparison with Probability

In this subsection we directly compare the possibilistic model for best-of-n with a probabilistic version. For the latter we adopt a discrete time model with the same structure as that of the possibilistic version, but where agents' beliefs are represented by probability distributions which are then combined using the product fusion operator introduced in Definition 8 below.

Figure 6: The average values of $\Pi(s_n)$ (black line), $N(s_n)$ (red line) and probability $P(s_n)$ (blue line) at the end of the simulation for varying evidence rates $r$: (a) over the full range of evidence rates; (b) over low evidence rates.
Definition 8

Probability Fusion Function
Let $\mathcal{P}_P$ denote the set of all probability distributions on $S$. Then a probability fusion function is a function $F : \mathcal{P}_P \times \mathcal{P}_P \to \mathcal{P}_P$. In particular, we will focus on the product fusion function given by: for $p_1, p_2 \in \mathcal{P}_P$ and $s \in S$,

$p_1 \otimes p_2(s) = \dfrac{p_1(s)\, p_2(s)}{\sum_{s' \in S} p_1(s')\, p_2(s')}$.

This is an established probability fusion operator [1] which has been studied for agent-based models combining fusion and evidential updating in [13], and in this context has been shown to have strong convergence properties. Figure 6 compares the probabilistic and possibilistic models under noise for a range of evidence rates. For the probabilistic model agents are initialised with the uniform probability distribution, to represent ignorance, and for the possibilistic model we take Frank's parameter $q$ as above. For the probabilistic approach, evidence from sampling the quality of state $s_i$ is represented by a probability distribution formed as the linear combination of a distribution concentrated on $s_i$ and the uniform distribution, with weights determined by the sampled quality. On receiving this evidence, an agent with current probability distribution $p$ updates to the distribution $p \otimes p_E$, where $\otimes$ is the product fusion function given in Definition 8.
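A minimal sketch of the product fusion operator of Definition 8 (states as list indices):

```python
def product_fusion(p1, p2):
    """Definition 8: renormalised pointwise product of two probability distributions."""
    raw = [a * b for a, b in zip(p1, p2)]
    z = sum(raw)
    return [val / z for val in raw]

# A point mass is a stable fixed point: fusing with it returns the point mass.
print(product_fusion([0.0, 1.0, 0.0], [0.2, 0.5, 0.3]))   # [0.0, 1.0, 0.0]
# Fusing two agents' distributions reinforces states they jointly favour:
print(product_fusion([0.6, 0.3, 0.1], [0.5, 0.4, 0.1]))
```

The fixed-point behaviour shown in the first example is the property discussed below: once a population converges on a point mass, further fusion cannot move it.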

Figure 7 shows the average values of $\Pi(s_n)$ and $N(s_n)$, and also the average probability $P(s_n)$ obtained from the above probabilistic approach, plotted against time in a noisy environment. Clearly, in such a scenario the possibilistic approach outperforms the probabilistic method in solving the best-of-n problem under noise. However, the broader picture is somewhat more complex. For example, Figure 6 shows the average values of $\Pi(s_n)$, $N(s_n)$ and $P(s_n)$ at the end of the simulation for varying evidence rates. In particular, Figure 6(a) shows these results across the full range of evidence rates. These suggest that the possibilistic approach is more effective for lower evidence rates (see Figure 6(b)), while the probabilistic approach performs better for higher evidence rates. While it is not entirely clear why this is the case, we suggest that it might be related to the differing speeds of convergence of the two approaches. For instance, Figure 9 shows average possibility, necessity and probability values plotted against time for a higher evidence rate, but in this case over a much longer run. Here we see that under possibilistic fusion and updating the population eventually converges to the same level of accuracy as obtained using the probabilistic approach, but the former takes considerably longer to converge than the latter. Given this, a possible explanation for the difference in performance of the two approaches as summarised by Figure 6 is that for very low evidence rates the probabilistic fusion operator converges before sufficient evidence has been received to ensure that the highest quality state is correctly identified.

Figure 7: $\Pi(s_n)$ (black line), $N(s_n)$ (red line) and $P(s_n)$ (blue line) plotted against time in a noisy environment.

Figure 8: Histograms of simulation outcomes at the end of the simulation across independent runs: (a) distribution of the average population value for the probabilistic approach; (b) distribution of the average population value for the possibilistic approach.

Figure 9: $\Pi(s_n)$ (black line), $N(s_n)$ (red line) and $P(s_n)$ (blue line) plotted against time over an extended number of time steps.

This is to some extent confirmed by Figure 8, which is a different presentation of the results shown in Figure 7. More specifically, Figures 8(a) and 8(b) are histograms of the average population values across the independent runs of the simulation. For the probabilistic approach, each run converges to a consensus in which all agents have probability values for $s_n$ either close to 0 or close to 1; furthermore, there is a significant proportion of runs in which the population reached consensus at values close to 0. Note that the only stable fixed points of the probability fusion function given in Definition 8 are those where both distributions give probability 1 to the same state. Hence, there is an inherent tendency for a population of agents employing this fusion function to converge to one of these distributions. Evidence will then tend to skew convergence towards the correct distribution, i.e. the distribution for which $p(s_n) = 1$. However, in the case where there is very limited evidence and fast convergence, this skew may only be partial, as we see in Figure 8(a). On the other hand, slower convergence when applying the possibility fusion function allows time, at least in the case of low evidence rates, for the distribution of outcomes across the different runs to be much more strongly skewed towards the correct answer, as can be seen in Figure 8(b). However, there is then the disadvantage for the possibilistic approach that, as the evidence rate increases, the convergence time also increases, since more (partially conflicting) evidence becomes available at each time step; the allotted number of time steps is then no longer sufficient for the population to reach consensus on the best state, as shown in Figure 6(a), although given more time the population can converge on the correct answer, as shown in Figure 9.

4 Conclusions

We have proposed an approach to distributed learning based on possibility theory, and shown that it can be effectively applied to the best-of-n problem in a system where agents learn both directly from their environment and by fusing their beliefs with those of others. In particular, we have introduced a discrete time agent-based model and carried out a number of simulation experiments to study its macro-level properties and to gain insight into the performance of distributed possibilistic learning under varying evidence rates and levels of noise.

Our results suggest that in a distributed learning context, possibility theory can provide a useful framework for representing uncertainty. This representation allows the imprecision of an agent's beliefs to increase as a result of encountering other beliefs which are partially inconsistent with their own. This in turn results in robust performance in the presence of noise, especially for low evidence rates, for which the possibilistic approach outperforms a similar probabilistic model.

Acknowledgements

This work was funded and delivered in partnership between the Thales Group, University of Bristol and with the support of the UK Engineering and Physical Sciences Research Council, ref. EP/R004757/1 entitled “Thales-Bristol Partnership in Hybrid Autonomous System Engineering (T-B PHASE).”

References

  • [1] R. F. Bordley, (1982), A multiplicative formula for aggregating probability assessments, Management Science, 28(10), 1137–1148
  • [2] D. D. R. Burns, A. B. Sendova-Franks, N. R. Franks, The effect of social information on the collective choices of ant colonies, Behavioral Ecology, Volume 27, Issue 4, July-August 2016, 1033–1040
  • [3] M. Crosscombe, J. Lawry, S. Hauert, M. Homer, (2017), Robust distributed decision-making in robot swarms: Exploiting a third truth state, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 4326-4332
  • [4] I. Douven, C. Kelp, (2011), Truth approximation, social epistemology, and opinion dynamics, Erkenntnis, 75(2), 271-283
  • [5] D. Dubois, H. Prade, (1988), Possibility Theory, Plenum, New York
  • [6] D. Dubois, H. Prade, (2011), Possibility theory and its applications: where do we stand, Mathware and Soft Computing, 18(1), 18-31
  • [7] D. Dubois, W. Liu, J. Ma, H. Prade, (2016), The basic principles of uncertain information fusion. An organised review of merging rules in different representation frameworks, Information Fusion 32, 12-39
  • [8] M. J. Frank, (1979), On the simultaneous associativity of F(x, y) and x+y-F(x, y), aequationes mathematicae, 19(1), 194-226
  • [9] N. Kakalis, Y. Ventikos, (2008), Robotic swarm concept for efficient oil spill confrontation, Journal of Hazardous Materials, 154(1-3), 880-887
  • [10] G. J. Klir, B. Yuan, (1995), Fuzzy Sets and Fuzzy Logic: Theory and Application, Prentice Hall.
  • [11] J. Lawry, (2001), Possibilistic normalisation and reasoning under partial inconsistency, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(4), 413-436
  • [12] J. Lawry, M. Crosscombe, D. Harvey, (2019), Epistemic sets applied to best-of-n problems, Proceedings of ECSQARU 2019, to appear
  • [13] C. Lee, J. Lawry, A. Winfield, (2018), Combining opinion pooling and evidential updating for multi-agent consensus, Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), 347-353

  • [14] C. Parker, H. Zhang, (2009), Cooperative decision-making in multiple-robot systems: the best-of-n problem, IEEE Transactions on Mechatronics 14, 240–251
  • [15] D. Peleg, (2005), Distributed coordination algorithms for mobile robot swarms, Lecture Notes in Computer Science, 1–12, Springer, Singapore
  • [16] A. Reina, J. Marshall, V. Trianni, T. Bose, (2017), Model of the best-of-n nest-site selection process in honeybees, Physical Review E, 95
  • [17] P. Smets, R. Kennes, (1994) The Transferable Belief Model. Artificial Intelligence, 66, 387–412
  • [18] G. Valentini, E. Ferrante, M. Dorigo, (2014), Self-organized collective decision making: The weighted voter model, Proceeding of AAMAS 14, 45–52
  • [19] G. Valentini, E. Ferrante, M. Dorigo, (2017), The best-of-n problem in robot swarms: Formalization, state of the art, and novel perspectives. Frontiers in Robotics and AI, 4(9).
  • [20] L. A. Zadeh, (1978), Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems, 100 supplement, 3-28