What you get is what you see: Decomposing Epistemic Planning using Functional STRIPS

by   Guang Hu, et al.
The University of Melbourne

Epistemic planning --- planning with knowledge and belief --- is essential in many multi-agent and human-agent interaction domains. Most state-of-the-art epistemic planners solve this problem by compiling to propositional classical planning, for example, generating all possible knowledge atoms, or compiling epistemic formulae to normal forms. However, these methods become computationally infeasible as problems grow. In this paper, we decompose epistemic planning by delegating reasoning about epistemic formulae to an external solver. We do this by modelling the problem using functional STRIPS, which is more expressive than standard STRIPS and supports the use of external, black-box functions within action models. Exploiting recent work that demonstrates the relationship between what an agent `sees' and what it knows, we allow modellers to provide new implementations of external functions. These define what agents see in their environment, allowing new epistemic logics to be defined without changing the planner. This increases the capability and flexibility of the epistemic model itself, and avoids the exponential pre-compilation step. We ran evaluations on well-known epistemic planning benchmarks to compare with an existing state-of-the-art planner, and on new scenarios based on different external functions. The results show that our planner scales significantly better than the state-of-the-art planner against which we compared, and can express problems more succinctly.




1 Introduction

Automated planning is a model-based approach to studying sequential decision problems in AI Geffner and Bonet (2013). Planning models describe the environment and the agents in concise planning languages, such as STRIPS. The model description is then submitted to a general problem solver in order to find a sequence of actions that achieves a desired goal state. The description of the problem, in general, tracks the changes in the state of the environment. However, in many scenarios, an agent needs to reason about the knowledge or belief of other agents in the environment. This is known as epistemic planning (Bolander and Andersen, 2011), a research topic that brings together the knowledge reasoning and planning communities.

Epistemic logic is a formal account of how to perform inferences and updates about an agent’s own knowledge and belief, including group and common knowledge in the presence of multiple agents Hintikka (1962). Epistemic planning is concerned with action theories that can reason not only about variables representing the state of the world, but also about the belief and knowledge that other agents have about those variables. Thus, epistemic planning aims to find the best course of action while taking into account practical performance considerations when reasoning about knowledge and beliefs (Bolander and Andersen, 2011). Bolander and Andersen (2011) first used event-based models to study epistemic planning in both single- and multi-agent environments, and gave a formal definition of epistemic planning problems using Dynamic Epistemic Logic (DEL) (Bolander, 2017).

There are typically two frameworks in which epistemic planning is studied. The first is to use DEL. This line of research investigates the decidability and complexity of epistemic planning and studies what types of problems it can solve Wan et al. (2015); Huang et al. (2017); Wu (2018). The second is to extend existing planning languages and solvers to epistemic tasks Muise et al. (2015a, b); Kominis and Geffner (2015, 2017); Le et al. (2018). In this paper, we take the latter approach.

The complexity of epistemic planning is undecidable in the general case. Thus, one of the main challenges of epistemic planning concerns computational efficiency. The dominant approach in this area is to rely on compilations. These solutions pre-compile epistemic planning problems into classical planning problems, using off-the-shelf classical planners to find solutions Muise et al. (2015a, b); Kominis and Geffner (2015, 2017); or pre-compile the epistemic formulae into specific normal forms for better performance during search Huang et al. (2017); Wu (2018). Such approaches have been shown to be fast at planning, but the compilation itself is computationally expensive.

This paper departs from previous approaches in two significant ways. First, we propose a model that exploits recent insights defining what an agent knows as a function of what it ‘sees’ Cooper et al. (2016); Gasquet et al. (2014). Cooper et al. (2016) define ‘seeing relations’ as modal operators that ‘see’ whether a proposition is true, and then define knowledge of a proposition p as S_i p ∧ p; that is, if p is true and agent i sees p, then it knows p. Thus, the seeing modal operator is equivalent to the ‘knowing whether’ operator Fan et al. (2015); Miller et al. (2016). We generalise the notion of seeing relations to perspective functions, which are functions that determine which variables an agent sees. The domain of variables can be discrete or continuous, not just propositional. The basic implementation of perspective functions is the same as seeing relations; however, we show that by changing the definition of perspective functions, we can establish new epistemic logics tailored to specific domains, such as Big Brother Logic Gasquet et al. (2014): a logic about visibility and knowledge in two-dimensional Euclidean planes.

Second, we show how to integrate perspective functions within functional STRIPS models as external functions Francès et al. (2017). External functions are black-box functions implemented in a programming language (in our case, C++) that can be called within action models. Epistemic reasoning is delegated to external functions, where epistemic formulae are evaluated lazily, avoiding the exponential blow-up from epistemic formulae present in other compilation-based approaches. This delegation effectively decomposes epistemic reasoning from search, and allows us to implement our approach in any functional STRIPS planner that supports external functions. Further, the modeller can implement new perspective functions that are tied to specific domains, effectively defining new external solvers.

In our experiments we use a width-based functional STRIPS planner Francès et al. (2017) that is able to evaluate the truth value of epistemic fluents with external solvers, and solve a wide range of epistemic problems efficiently, including but not limited to nested knowledge, distributed knowledge and common knowledge. We also show how modellers can implement different perspective functions as external functions in the functional STRIPS language, enabling the use of domain-dependent epistemic logics. Departing from propositional logic gives us the flexibility to encode expressive epistemic formulae concisely. We compare our approach to a state-of-the-art epistemic planner that relies on a compilation to classical planning Muise et al. (2015a). The results show that, unlike in the compilation-based approaches, the depth of nesting and the number of agents do not affect our performance, avoiding the exponential blow-up thanks to our lazy evaluation of epistemic formulae.

In the following sections we give a brief background on both epistemic logic and epistemic planning (Section 2). We then introduce a new model, the agent perspective model (Section 3). We discuss implementation details using a functional STRIPS planner (Section 4), and report experiments on several well-known benchmarks, along with two new scenarios to demonstrate the expressiveness of the proposed approach (Section 5).

2 Background

In this section, we briefly introduce the three main areas related to this work: (1) classical planning; (2) epistemic logic; and (3) epistemic planning.

2.1 Classical Planning

Planning is the model-based approach to action selection in AI, where the model is used to reason about which action an agent should do next Geffner and Bonet (2013). Models vary depending on the assumptions imposed on the dynamics of the world, from classical models, where all actions have deterministic instantaneous effects and the world is fully known, up to temporal or POMDP models, where actions have durations or belief distributions over the state of the world. Models are described concisely through declarative languages such as STRIPS and PDDL Fikes and Nilsson (1971); McDermott (2000), general enough to allow the encoding of different problems while at the same time revealing important structural information that allows planners to scale up. In fact, most planners rely on exploiting the structure revealed in the action theory to guide the search for solutions, from the very first general problem solver Simon and Newell (1963) up to the latest computational approaches based on SAT and heuristic search Rintanen (2012); Richter and Westphal (2010). On the other hand, declarative languages have limited the scope of planning, as certain environments representing planning models are difficult to encode declaratively but are easily defined through simulators, such as the Atari video games Bellemare et al. (2013). Consequently, a new family of width-based planners Lipovetzky and Geffner (2012, 2014, 2017) has been proposed, broadening the scope of planning: these planners have been shown to scale up even when the planning model is described through simulators, requiring only the exposure of the state variables and not imposing any syntactic restriction on the action theory Francès et al. (2017), so the denotation of some symbols can be given procedurally through simulators or by external theories.

In this paper we focus on epistemic planning, which considers the classical planning model as a tuple ⟨S, s_0, S_G, A, A(·), f, c⟩, where S is a set of states, s_0 is the initial state, S_G ⊆ S is the set of goal states, A is the set of actions, A(s) ⊆ A is the subset of actions applicable in state s, f is the transition function so that f(a, s) represents the state that results from doing action a in the state s, and c is a cost function. The solution to a classical planning model, called a plan, is a sequence of actions π = a_1, …, a_n that maps the initial state into a goal state; i.e., π is a plan if there is a sequence of states s_0, …, s_n such that s_n ∈ S_G and s_{i+1} = f(a_{i+1}, s_i) with a_{i+1} ∈ A(s_i) for i = 0, …, n−1. The cost of plan π is given by the sum of its action costs, and a plan is optimal if there is no plan with smaller cost.
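As an illustration, the classical model above can be prototyped as a blind search problem. The following Python sketch is our own illustration, not the paper's implementation; all names (and the toy counter domain) are assumptions. It finds a shortest plan by breadth-first search over the components of the tuple:

```python
from collections import deque

def find_plan(s0, goal_states, applicable, transition):
    """Breadth-first search over a classical planning model
    <S, s0, S_G, A, A(s), f>: returns a shortest plan (a list of
    actions) or None if no goal state is reachable."""
    frontier = deque([(s0, [])])
    visited = {s0}
    while frontier:
        state, plan = frontier.popleft()
        if state in goal_states:
            return plan
        for a in applicable(state):          # A(s)
            nxt = transition(state, a)       # f(a, s)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [a]))
    return None

# Toy domain: move a counter from 0 to 3 with +1 / +2 actions.
plan = find_plan(
    0, {3},
    applicable=lambda s: ["inc1", "inc2"] if s < 3 else [],
    transition=lambda s, a: s + (1 if a == "inc1" else 2),
)
```

Because breadth-first search expands states in order of plan length, the first goal state popped yields an optimal plan under unit action costs.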

A classical planning model can be represented in STRIPS by a tuple ⟨F, A, I, G⟩ where: F is the set of all possible facts or propositions, A is the set of all actions, I ⊆ F is the set of facts that are true in the initial situation, and G ⊆ F is the set of facts that need to be true as the goal conditions.

Besides the model, a solver, also called a planner, plays the other important role in planning. One of the most successful computational approaches to planning is heuristic search. Besides the choice of search algorithm, the key feature that distinguishes planners is the heuristic function chosen Helmert and Domshlak (2009). To achieve good performance, the heuristic functions should be as informed as possible. For example, one of the current best-performing planners, LAMA, uses a landmark-based heuristic derived from the model Richter and Westphal (2010) along with other delete-relaxation heuristics Geffner and Bonet (2013). The downside is that most heuristics require the model to be encoded in STRIPS or PDDL, and not all problems can easily be modelled declaratively.

The standard planning languages and solvers do not support the use of procedures or external theories. One exception is the FS planner Francès and Geffner (2015), which supports the Functional STRIPS language Geffner (2000), where the denotation of (non-fluent) function symbols can be given extensionally by means of atoms, or intensionally by means of procedures. Procedures also appear as an extension of PDDL under the name of semantic attachments Dornhege et al. (2009). The reason why procedures are not “first-class citizens” in planning languages is that there was no clear way to deal with them that is both general and effective. Recently, a new family of algorithms has been proposed in classical planning, known as width-based planning Lipovetzky and Geffner (2012). The latest such planner, known as Best-First Width Search (BFWS) Lipovetzky and Geffner (2017), has been shown to scale up even in the presence of function symbols defined procedurally Francès et al. (2017). BFWS exploits a concept called novelty in order to prune or guide the search. The novelty of a new state is determined by the presence of values in its state variables that are seen for the first time during the search: as search nodes are generated, the planner checks whether there is something novel about them, and the novelty of a node is the size of the minimal tuple of state variables whose values are seen for the first time during the search.
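The novelty measure described above can be sketched as follows. This is a simplified illustration of the idea, not the planner's actual code; the function name and the dictionary encoding of states are our assumptions:

```python
from itertools import combinations

def novelty(state, seen, max_width=2):
    """Novelty of `state` relative to all previously generated states:
    the size of the smallest tuple of variable assignments appearing
    for the first time, or max_width + 1 if no tuple up to max_width
    is new.  `state` maps variables to values; `seen` is the (mutated)
    set of assignment tuples observed so far."""
    atoms = tuple(sorted(state.items()))
    nov = max_width + 1
    for w in range(1, max_width + 1):
        tuples = list(combinations(atoms, w))
        if nov > w and any(t not in seen for t in tuples):
            nov = w                  # smallest new tuple has size w
        seen.update(tuples)          # record for comparing later states
    return nov
```

A search such as BFWS would compute this for every generated node, pruning (or deprioritising) nodes whose novelty exceeds a fixed bound.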

The f-STRIPS planner has been compared over 380 problems with the performance of FF Helmert (2006), LAMA-11 Richter and Westphal (2010), and BFWS Lipovetzky and Geffner (2017). The results show that the f-STRIPS BFWS planner performs as well as BFWS, which relies on a STRIPS model, and slightly better than LAMA-11. It is worth mentioning that FF and LAMA have been among the top-performing planners for the last 15 years, and that BFWS won the agile track of the 2018 International Planning Competition and is a state-of-the-art planner for classical planning. The f-STRIPS BFWS planner can thus cope with externally defined function symbols while performing well with respect to other planners. Therefore, we choose this planner as our development platform. The planner is available through https://github.com/aig-upf/2017-planning-with-simulators.

2.2 Epistemic Logic

In this section, we give the necessary preliminaries for epistemic logic – the logic of knowledge. Knowledge in a multi-agent system is not only about the environment, but also about the agents’ knowledge about the environment, and of agents’ knowledge of others’ knowledge about the environment, and so on.

Fagin et al. Fagin et al. (2003) provide a formal definition of epistemic logic as follows. Given a countable set P of primitive propositions and a finite set of agents Agt, the syntax for epistemic logic is defined as:

φ ::= p | ¬φ | φ ∧ φ | K_i φ

in which p ∈ P and i ∈ Agt.

K_i φ represents that agent i knows proposition φ, ¬ means negation and ∧ means conjunction. Other operators such as disjunction and implication can be defined in the usual way.

Fagin et al. Fagin et al. (2003) define the semantics of epistemic logic using Kripke structures, as is standard in modal logic. A Kripke structure is a tuple M = ⟨S, π, K_1, …, K_n⟩ where:

  • S is a non-empty set of states;

  • π is an interpretation function π : S → (P → {true, false}); and

  • each K_i ⊆ S × S represents the accessibility relation over states for agent i ∈ Agt.

Given a state s and a proposition p, the evaluation of p over s is π(s)(p). If p is true in s, then π(s)(p) must be true, and vice versa. The relation K_i for agent i is a binary relation over states, which is the key to reasoning about knowledge. For any pair of states s and t, if (s, t) ∈ K_i, then we say agent i cannot distinguish between s and t when in state s. In other words, if agent i is in s, the agent can consider t to be the current state based on all the information it can obtain from state s. With this definition of Kripke structure, we can define the semantics of knowledge.

Given a state s, a proposition p, a propositional formula φ and a Kripke structure M, the truth of the two basic formulae is defined as follows:

  • (M, s) ⊨ p iff π(s)(p) = true

  • (M, s) ⊨ K_i φ iff (M, t) ⊨ φ for all t such that (s, t) ∈ K_i

(M, s) ⊨ p represents that p is true at (M, s), which means that the evaluation of p in state s, π(s)(p), must be true. Standard propositional logic rules define conjunction and negation. K_i φ is defined by φ being true at all worlds reachable from s via the accessibility relation K_i. This allows knowledge to be nested; for example, K_i K_j φ represents that agent i knows that agent j knows φ, which means φ is true at all worlds reachable by applying accessibility relation K_i followed by K_j.
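The two clauses above translate directly into a recursive model checker. The sketch below is our own illustration; the tuple encoding of formulae and all names are assumptions, and the interpretation π is represented by the set of propositions true in each world:

```python
def holds(model, s, phi):
    """Evaluate formula phi at state s of a Kripke structure.
    model = (pi, K): pi maps each state to its set of true
    propositions; K maps each agent to a set of (s, t) accessibility
    pairs.  Formulae: "p" | ("not", f) | ("and", f, g) | ("K", i, f)."""
    pi, K = model
    if isinstance(phi, str):                 # atomic proposition
        return phi in pi[s]
    if phi[0] == "not":
        return not holds(model, s, phi[1])
    if phi[0] == "and":
        return holds(model, s, phi[1]) and holds(model, s, phi[2])
    if phi[0] == "K":                        # K_i f: f holds in every
        _, i, f = phi                        # world i considers possible
        return all(holds(model, t, f) for (u, t) in K[i] if u == s)
    raise ValueError(phi)

# Two worlds: p is true in w1 only; agent a cannot tell w1 from w2,
# while agent b can.
pi = {"w1": {"p"}, "w2": set()}
K = {"a": {("w1", "w1"), ("w1", "w2"), ("w2", "w1"), ("w2", "w2")},
     "b": {("w1", "w1"), ("w2", "w2")}}
model = (pi, K)
```

In this model, agent b knows p at w1, while agent a does not, since a considers w2 (where p is false) possible; nesting such as K_b K_b p is handled by the same recursion.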

From these basic operators, the concept of group knowledge can be defined. For this, the grammar above is extended to:

φ ::= p | ¬φ | φ ∧ φ | K_i φ | E_G φ | C_G φ | D_G φ

in which p ∈ P, i ∈ Agt, and G is a non-empty set of agents such that G ⊆ Agt.

E_G φ represents that everyone in group G knows φ, and C_G φ represents that it is commonly known in group G that φ is true, which means that everyone knows φ, and everyone knows that everyone knows φ, ad infinitum. D_G φ represents distributed knowledge, which means that if we combine the knowledge of the set of agents G, then G knows φ, even though it may be that no individual in the group knows φ.

The semantics for these group operators are defined as follows:

  • (M, s) ⊨ E_G φ iff (M, s) ⊨ K_i φ for all i ∈ G;

  • (M, s) ⊨ C_G φ iff (M, t) ⊨ φ for all t that are G-reachable from s;

  • (M, s) ⊨ D_G φ iff (M, t) ⊨ φ for all t such that (s, t) ∈ ⋂_{i ∈ G} K_i

By definition, E_G φ will be true if and only if φ is known by all agents in G. Common knowledge in world s is defined as: φ is true in all worlds t that are reachable from s by following the accessibility relations of the agents in G, in any order and over any number of steps. For distributed knowledge, D_G φ is true if and only if φ is true in all the worlds that every agent in G considers possible; in other words, we eliminate the worlds that any agent in G knows to be impossible.
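To make the three definitions concrete, the sketch below (our own illustration, restricted to atomic propositions; all names are assumptions) checks E_G, C_G and D_G against an explicit accessibility relation:

```python
def succ(K, i, s):
    """States agent i considers possible at s under relation K[i]."""
    return {t for (u, t) in K[i] if u == s}

def e_knows(pi, K, s, group, p):
    """E_G p: every agent in `group` knows atom p at state s."""
    return all(p in pi[t] for i in group for t in succ(K, i, s))

def c_knows(pi, K, s, group, p):
    """C_G p: p holds in every state reachable from s in one or more
    steps along any group member's accessibility relation."""
    frontier = set().union(*(succ(K, i, s) for i in group))
    reached = set()
    while frontier:
        u = frontier.pop()
        reached.add(u)
        for i in group:
            frontier |= succ(K, i, u) - reached
    return all(p in pi[t] for t in reached)

def d_knows(pi, K, s, group, p):
    """D_G p: p holds in every state that all members of the group
    consider possible (intersection of accessibility relations)."""
    common = set.intersection(*(succ(K, i, s) for i in group))
    return all(p in pi[t] for t in common)

# p holds in w1 only; agent a cannot distinguish w1 from w2, agent b can.
pi = {"w1": {"p"}, "w2": set()}
K = {"a": {("w1", "w1"), ("w1", "w2"), ("w2", "w1"), ("w2", "w2")},
     "b": {("w1", "w1"), ("w2", "w2")}}
```

In this two-world example E_{a,b} p fails at w1, because a is unsure, yet D_{a,b} p holds: pooling b's information eliminates w2.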

2.2.1 Seeing and Knowledge

Recently, Gasquet et al. Gasquet et al. (2014) noted the relationship between what an agent sees and what it knows. They address the more specific task of logically modelling and reasoning about the cooperating tasks of vision-based agents, in what they call Big Brother Logic (BBL). Their framework models multi-agent knowledge in a continuous environment of vision, which has great potential applications, such as reasoning over camera inputs, autonomous robots and vehicles. They introduce the semantics of their model and its extensions on natural geometric models.

Figure 1: Example for Big Brother Logic (figure omitted: two agents and four objects in the plane, with each agent's position, facing direction and field of view).

In their scenario, agents are stationary cameras in the Euclidean plane ℝ², under the assumptions that the cameras can see anything in their sight range and that they have no volume, meaning they do not block each other's sight. They extend Fagin et al.’s logic Fagin et al. (2003) by noting that, at any point in time, what an agent knows, including nested knowledge, can be derived directly from what it can see in the current state. In brief, instead of Kripke frames, they define a geometric model as ⟨pos, dir, ang⟩, in which the function pos : Agt → ℝ² gives the position of each agent, the function dir : Agt → U gives the direction that each agent is facing (where U is the set of unit vectors of ℝ²), and the function ang : Agt → [0, 2π) gives the angle of view of each agent.

A model is then defined as the geometric model ⟨pos, dir, ang⟩ above, together with a set of valuation functions and a set of equivalence relations, one for each agent a ∈ Agt, relating the situations that agent a cannot distinguish by sight.

In this context, standard propositional logic is extended with the binary operator a ⊳ b, which represents that “a sees b”. This is defined as:

  • a ⊳ b iff pos(b) ∈ sector(pos(a), dir(a), ang(a)),

in which sector(p, d, θ) is the field of vision that begins at position p from direction d and goes θ degrees in a counter-clockwise direction.

Figure 1 shows an example with two agents, a and b, each with its own position, direction and angle of view, along with four objects. Based on the current state, the seeing relation above determines, for each agent, which of the objects and which other agents it sees.

From this, they show the relationship between seeing and knowing: knowledge operators are defined directly in terms of what each agent sees.
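The field-of-view test underlying the seeing relation can be sketched with elementary trigonometry. The following is our own illustration, not the authors' formulation: the function name `sees` and the use of radians are assumptions. It checks whether the bearing to a target falls inside the counter-clockwise sector:

```python
import math

def sees(pos, direction, ang, target):
    """BBL-style visibility: an agent at `pos`, whose field of view
    starts at angle `direction` (radians) and sweeps `ang` radians
    counter-clockwise, sees `target` iff the bearing from `pos` to
    `target` lies within that sector."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    bearing = math.atan2(dy, dx) % (2 * math.pi)       # absolute bearing
    offset = (bearing - direction) % (2 * math.pi)     # angle past `direction`
    return offset <= ang
```

For example, an agent at the origin facing east with a quarter-turn field of view sees targets in the first quadrant but not below the x-axis.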

Gasquet et al. Gasquet et al. (2014) also define a common knowledge operator, defined in a similar manner to Fagin et al.’s definition based on G-reachable worlds Fagin et al. (2003). In Figure 1, a common knowledge formula can hold because agents a and b both see the object in question, both see that each other sees it, and so on.

Cooper et al. Cooper et al. (2016) adopted Gasquet et al. Gasquet et al. (2014)’s idea of modelling an agent’s knowledge based on what it sees, and generalised it to seeing propositions, rather than just seeing other agents in a Euclidean plane. They extended Fagin et al.’s definition by adding an extra class of formulae α, representing what can be seen:

φ ::= α | ¬φ | φ ∧ φ | K_i φ    α ::= p | S_i α

in which p ∈ P (the set of propositional variables) and i ∈ Agt. The grammar for α defines how to represent visibility relations. S_i α can be read as “agent i sees α”. Note the syntactic restriction that agents can only see atomic propositions or nestings of seeing relationships that see atomic propositions.

From this, they define knowledge using the equivalences K_i α ≡ α ∧ S_i α and K_i ¬α ≡ ¬α ∧ S_i α. This tight correspondence between seeing and knowledge is intuitive: an agent knows α is true if α is true and the agent can see α. Such a relationship is the same as the relationship between knowing something is true and knowing whether something is true Miller et al. (2016); Fan et al. (2015). In fact, in early drafts of the work available online, the seeing operator of Cooper et al. Cooper et al. (2016) was called the “knowing whether” operator.

Comparing these two bodies of work, Gasquet et al. (2014) use a geometric model to represent the environment and derive knowledge from it by checking the agents’ lines of sight. Their idea literally matches the phrase “seeing is believing”. However, their logic is constrained only to vision in physical spaces. In Cooper et al. (2016)’s framework, by contrast, the seeing operator applies to propositional variables, and thus visibility can be interpreted more abstractly; for example, ‘seeing’ (hearing) a message over a telephone.

The current paper generalises seeing relations to perspective functions, which are domain-dependent functions defining what agents see in particular states. The result is more flexible than seeing relations, and allows Big Brother Logic to be defined with a simple perspective function, as well as new perspective functions for other domains; for example, Big Brother Logic in three-dimensional planes, or visibility of messages on a social network.

2.3 Epistemic Planning

Bolander and Andersen (2011) introduce the concept of epistemic planning, and define it in both single-agent and multi-agent domains. Their planning framework is defined in dynamic epistemic logic (DEL) Van Ditmarsch et al. (2007), which has been shown to be undecidable in general, but with decidable fragments Bolander et al. (2015). In addition, they provide results on the PSPACE-hardness of the plan verification problem. This formalism has been used to explore theoretical properties of epistemic planning; for example, Engesser et al. Engesser et al. (2017) used concepts of perspective shifting to reason about others’ contributions to joint goals. Along with implicit coordination actions, their model can solve some problems elegantly without communication between agents.

Since epistemic planning was formalized in DEL, there has been substantial work on DEL-based planning. However, our focus in this paper is on the design, implementation, and evaluation of planning tools, rather than on logic-based models of planning, so in this section we concentrate on research in epistemic planning of that kind.

A handful of researchers in the planning field focus on leveraging existing planners to solve epistemic problems. Muise et al. Muise et al. (2015a) proposed an approach to multi-agent epistemic planning with nested belief, non-homogeneous agents, co-present observation, and the ability for one agent to reason as if it were another. In general terms, they compiled an epistemic logic problem into a classical planning problem by grounding epistemic fluents into propositional fluents and using additional conditional effects of actions to enforce desirable properties of belief. They evaluated their approach on the Corridor Kominis and Geffner (2015) problem and the Grapevine problem, which is a combination of Corridor and Gossip Herzig and Maffre (2015). Their results show that their approach is able to solve the planning task within a typically short time, but the compilation time to generate fluents and conditional effects is exponential in the size of the original problem.

Simultaneously, Kominis and Geffner Kominis and Geffner (2015) adopted methods from partially-observable planning for representing the beliefs of a single agent, and converted that method to handle multi-agent settings. They define three kinds of action sets: physical actions, as in classical planning; sensing actions, which can be used to infer knowledge; and actions that update facts. They are able to use their model to encode epistemic planning problems, and can convert their model to a classical planning model using standard compilation techniques for partially-observable planning. They evaluated their model on the Muddy Children, Sum and Word Rooms Kominis and Geffner (2015) domains. Their results show that their model is able to solve all cases presented; however, performance, as measured by time and depth of the optimal plan, depends on the problem scope and planning algorithm. For example, using two different planners on the Muddy Children test case with seven children, one planner consumed only half the time of the other, while for other problems there was little difference between the two.

Since Kominis and Geffner and Muise et al. approach the problem in a similar way (compilation to classical planning), their results are broadly similar. However, the methods they use are different, so their work and results have different limitations and strengths. Muise et al. managed to model nested beliefs without explicit or implicit Kripke structures, which means they can only represent literals, while Kominis and Geffner’s work is able to handle arbitrary formulae. Furthermore, Muise et al.’s model does not have the strict common initial knowledge setting found in Kominis and Geffner, and does not have the constraint that all action effects are commonly known to all the agents. Therefore, Muise et al.’s model allows them to model belief, which might not be what is actually true in the world state, rather than just knowledge. In other words, they can handle different agents having different beliefs about the same fluent.

More recently, rather than compiling epistemic planning problems into classical planning, Huang et al. built a native multi-agent epistemic planner, and proposed a general representation framework for multi-agent epistemic problems Huang et al. (2017). They consider the whole multi-agent epistemic planning task from a third-person point of view. In addition, based on well-established concepts from belief change (both revision and update algorithms), they designed and implemented an algorithm to encode belief change as the result of planning actions. Following these ideas and algorithms, they developed a planner called MEPK. They evaluated their approach with Grapevine, Hexa Game and Gossip, among others. From their results, it is clear that their approach can handle a variety of problems, and performance on some problems is better than other approaches. While this approach is different from Kominis and Geffner and Muise et al., it still requires a compilation phase before planning to rewrite epistemic formulae into a specific normal form called alternating cover disjunctive formulas (ACDF) Hales et al. (2012). The ACDF formula is worst-case exponentially longer than the original formula. The results show that this step carries a computational burden similar to the compilations of Kominis and Geffner and Muise et al. In addition, building a native planner to solve an epistemic planning problem makes it more difficult to take advantage of recent advances in other areas of planning.

3 A Model of Epistemic Planning using Perspectives

In this section, we define the syntax and semantics of our agent perspective model. Our idea is based on that of Big Brother Logic Gasquet et al. (2014) and Cooper et al.’s seeing operators Cooper et al. (2016). The syntax and semantics of our model are introduced, including distributed knowledge and common knowledge.

3.1 Language

Extending Cooper et al.’s Cooper et al. (2016) idea of seeing propositional variables, our model is based on a model of functional STRIPS (F-STRIPS) Francès et al. (2017), which uses variables and domains, rather than just propositions. We allow agents to see variables with discrete and continuous domains.

3.1.1 Model

We define an epistemic planning problem as a tuple ⟨Agt, V, D, I, G, O⟩, in which Agt is a set of agents, V is a set of variables, D stands for the domains of the variables, which can be discrete or continuous, and I and G are the initial state and goal states respectively, both bounded by V and D. Specifically, they should be assignments to some or all variables in V, or relations over these. O is the set of operators, with arguments in terms of variables from V.

3.1.2 Epistemic Formulae

Definition 3.1.1:

Goals, action preconditions, and conditions on conditional effects are epistemic formulae, defined by the following grammar:

φ ::= r(v_1, …, v_n) | ¬φ | φ ∧ φ | S_i v | S_i φ | K_i φ

in which r(v_1, …, v_n) is an n-arity ‘domain-dependent’ relational formula over variables from V, i ∈ Agt, S_i v and S_i φ are both visibility formulae, and K_i φ is a knowledge formula.

3.1.3 Domain Dependent Formulae

Domain-dependent formulae include not only basic mathematical relations, but also relational terms over variables defined by the underlying planning language. For example, based on the scenario from Figure 1, one such formula can be a true expression about the location of an agent, while another can be false. In Section 4, we discuss the use of external functions in F-STRIPS in our implemented planner, which allow more complex relations, or even customized relations, as long as they have been defined in the external functions. For example, we can define a domain-dependent relation as an external function to compare the distances between objects. This external function takes three coordinates as input, and returns a Boolean value: whether the distance between the first and the third is greater than the distance between the second and the third. In the scenario displayed in Figure 1, such a relation is true for some triples of objects and false for others; in particular, it is false when the first two objects are equally close to the third. The definition of this function is delegated to a function implemented in a programming language such as C++, and the planner is unaware of its semantics. However, for the remainder of this section, we will ignore the existence of external functions, and return to them in our implementation section.
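As a sketch of the black-box contract of such an external function, the following is our own illustration: the function name `farther` is hypothetical, and Python stands in for the paper's C++ externals. The planner would only see coordinates in, Boolean out:

```python
import math

def farther(p, q, r):
    """Hypothetical external function for a domain-dependent relation:
    true iff point p is farther from r than q is from r.  The planner
    treats such functions as opaque black boxes."""
    return math.dist(p, r) > math.dist(q, r)
```

Because the planner never inspects the body, the modeller is free to change the relation (e.g. to a different metric) without touching the planning model or the search algorithm.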

3.1.4 Visibility Formula

An important concept adapted from Cooper et al. (2016) is “seeing a proposition”. Let p be a proposition; “agent i knows whether p” can be represented as “agent i sees p”. Their interpretation of this is: either p is true and i knows that, or p is false and i knows that. With higher-order observation added, this gives us a way to reason about others’ epistemic viewpoint on a proposition without actually knowing whether it is true. Building on this concept, our ‘seeing’ operator allows us to write formulae about visibility: S_i v and S_i φ.

The seeing formulae admit two related interpretations: seeing a variable, or seeing a formula. The formula S_i v can be understood as: variable v has some value, and no matter what value it has, agent i can see the variable and knows its value. Seeing a relation, on the other hand, is trickier. S_i φ can be interpreted as: φ is a formula, and no matter whether φ is true or false, i knows whether it is true. To make sure i knows whether φ is true, the evaluation of this seeing formula simply requires that agent i sees all the variables in that relation.

For example, in Figure 1, using the notation defined in Section 3.1.1, can be read as “agent sees variable ”. In the case of seeing a domain-dependent formula, can be read as “agent sees whether the relation is true or not”, which is: “agent sees whether is farther away from than .”

3.1.5 Knowledge Formula

In addition to the visibility operator, our language supports the standard knowing operator . Following the novel idea from Cooper et al. (2016) of defining knowledge based on visibility, we define knowledge as: $K_i \varphi \equiv S_i \varphi \wedge \varphi$. That is, for agent $i$ to know that $\varphi$ is true, it needs to be able to see $\varphi$, and $\varphi$ needs to be true. In other words, if you can see something and it is true, then you know it is true.

3.2 Semantics

We now formally define the semantics of our model.

3.2.1 Knowledge Model

Our model decouples the planning model from the knowledge model, and as we will see in our implementation, our planner delegates the knowledge model to an external solver. Therefore, in this section, we define the semantics of that knowledge solver. The novel part of this model is the use of perspective functions, which are functions that define which variables an agent can see, instead of Kripke structures. From this, a rich knowledge model can be built up independently of the planning domain.

Definition 3.2.1:

A model is defined as , in which is a set of variables, is a function mapping each variable to its domain, is the evaluation function, which evaluates variables and formulae in a given state, and are the agents’ perspective functions, which, given a state , return the local state from each agent’s perspective.

A state is a tuple of variable assignments, denoted , in which and for each . The global state is a complete assignment for all variables in . A local state, which represents an individual agent’s perspective of the global state, can be a partial assignment. The set of all states is denoted .

A perspective function $f_i$ is a function that takes a state and returns a subset of that state, which represents the part of that state that is visible to agent $i$. These functions can be nested, such that represents agent ’s perspective of agent ’s perspective, which can be just a subset of agent ’s actual perspective. The following properties must hold on $f_i$ for all agents $i$ and states $s$:

  • $f_i(s) \subseteq s$; that is, an agent sees at most the actual state; and

  • $f_i(f_i(s)) = f_i(s)$; that is, perspective functions are idempotent.

First, we give the definition for propositional formulae. Let any variable be denoted as , and any -ary domain dependent formula:

  • iff

Relations are handled by the evaluation function . The relation is evaluated by getting the value of each variable in it, and checking whether the relation holds or not. Other propositional operators are defined in the standard way.

Then, we have the following formal semantics for visibility:

  • iff for any value

  • iff

  • iff

  • iff and

  • iff

  • iff

, read “Agent sees variable ”, is true if and only if is visible in the state . That is, an agent sees a variable if and only if that variable is in its perspective of the state. Similarly, an agent knows whether a domain-dependent formula is true or false if and only if it can see every variable of that formula. For example, in Figure 1, is false and is true, which is because is in ’s perspective (blue area), while is not. The remainder of the definitions simply deal with logical operations. However, note the definition of , which is equivalent to , because all variables are the same. This effectively just defines that ‘seeing’ a formula means seeing its variables.
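The visibility semantics above can be sketched directly, assuming a perspective is represented as the set of variable names visible to an agent (the names `Perspective`, `sees_var`, and `sees_formula` are our own, not the implementation's API):

```cpp
#include <cassert>
#include <set>
#include <string>

// An agent's local state, reduced here to the set of variable names it sees.
using Perspective = std::set<std::string>;

// "agent i sees variable v": v is in i's perspective of the state.
bool sees_var(const Perspective& f_i, const std::string& v) {
    return f_i.count(v) > 0;
}

// "agent i sees formula R": i sees every variable occurring in R.
bool sees_formula(const Perspective& f_i, const std::set<std::string>& vars) {
    for (const auto& v : vars)
        if (!sees_var(f_i, v)) return false;
    return true;
}
```

For instance, an agent whose perspective contains `loc_a` but not `loc_b` sees the former, and therefore does not see any relation mentioning the latter.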

Now, we define knowledge as:

  • iff and

This definition follows Cooper et al.’s (2016): agent knows if and only if the formula is true at and agent sees it. Using the same example as previously, is false, while is true.

Theorem 3.1.

The S5 axioms of epistemic logic hold in this language. That is, the following axioms hold:

  • (T) $K_i\varphi \rightarrow \varphi$

  • (K) $K_i(\varphi \rightarrow \psi) \rightarrow (K_i\varphi \rightarrow K_i\psi)$

  • (4) $K_i\varphi \rightarrow K_i K_i\varphi$

  • (5) $\neg K_i\varphi \rightarrow K_i \neg K_i\varphi$

We first consider axiom (T). By our semantics, $K_i\varphi$ is true if and only if both $S_i\varphi$ and $\varphi$ are true. Therefore, it is trivial that axiom (T) holds.

For (K), based on our definition of knowledge, we have is equivalent to and . Then, by our semantics, we have that is equivalent to or . From propositional logic, is equivalent to . We combine with to get and similarly for to get , which is equivalent to from propositional logic.

To prove (4) and (5), we use the properties of the perspective function . The second property shows that a perspective function for agent on state converges after the first nested application; that is, $f_i(f_i(s)) = f_i(s)$. Therefore, whenever , then also holds in , implying that holds too. This is the case when is or , so (4) and (5) hold. ∎

3.3 Group Knowledge

From the basic visibility and knowledge definitions, in this section, we define group operators, including distributed and common visibility/knowledge.

3.3.1 Syntax

We extend the syntax of our language with group operators:

in which is a set of agents and is a variable or formula .

Group formula is read as: everyone in group sees variable/formula , and represents that everyone in group knows . is the distributed knowledge operator, equivalent to in Section 2.2, while is its visibility counterpart: someone in group sees. Finally, is common knowledge and common visibility: “it is commonly seen”.

3.3.2 Semantics

Let be a set of agents, a formula, and either a formula or a variable; then we can define the semantics of these group formulae as follows:

  • iff ,

  • iff ,

  • iff , where

  • iff and

  • iff , where

  • iff and ,

in which is the state reached by applying the composite function until it reaches its fixed point. That is, the fixed point such that .

Reasoning about common knowledge and visibility is more complex than the other modalities. Common knowledge among a group means not only that everyone in the group shares this knowledge, but also that everyone knows that the others know it, and so on, ad infinitum. The infinite nature of this definition makes it intractable in some models.

However, due to our restriction of states to variable assignments and our use of perspective functions, common knowledge is much simpler. Our intuition is based on the fact that each time we apply the composite perspective function , the resulting state is either a proper subset of (smaller) or is . By this intuition, we can evaluate common formulae in a finite number of steps.

The fixed point is a recursive definition. However, the following theorem shows that this fixed point always exists (even if it is empty), and that the number of iterations is bounded by the size of , the state to which it is applied.

Theorem 3.2.

Function converges on a fixed point within iterations.


First, we prove that convergence is stable; that is, when , further ‘iterations’ will result in ; that is, . Let , where is the number of iterations. Then, we have . Since , we have , which means . By induction, we have that for all , . Therefore, once we reach convergence, it remains.

Next, we prove convergence within iterations. By the intuition and definition of the perspective functions, , and , we have . Then, as we proved for the first point, if , then we have reached a fixed point and no further iterations are necessary. Therefore, the worst case is when and . There are at most such worst-case iterations until converges on an empty set. Therefore, the maximum number of iterations is . ∎

For each iteration, there are local states in group that need to be combined in the generalised union calculation, which can be done in polynomial time, and there are at most steps. So, a polynomial-time algorithm for function exists.
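The fixed-point computation behind common visibility can be sketched as follows, assuming each agent's perspective function simply filters a state (here, a set of variable names) down to what that agent sees; the names and the set-based state representation are our own simplifications:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

using State = std::set<std::string>;
using PerspectiveFn = State (*)(const State&);

// One group step: apply every agent's perspective function to s and
// intersect the results (what is commonly seen must be seen by everyone).
State group_step(const State& s, const std::vector<PerspectiveFn>& agents) {
    State result = s;
    for (auto f : agents) {
        State seen = f(s);
        State tmp;
        std::set_intersection(result.begin(), result.end(),
                              seen.begin(), seen.end(),
                              std::inserter(tmp, tmp.begin()));
        result = tmp;
    }
    return result;
}

// Iterate until the state stops shrinking. Since each step returns a
// subset of its input, this terminates within |s| iterations, matching
// the bound of Theorem 3.2.
State common_perspective(State s, const std::vector<PerspectiveFn>& agents) {
    while (true) {
        State next = group_step(s, agents);
        if (next == s) return s;  // fixed point reached
        s = next;
    }
}
```

Each agent's filter here is a toy stand-in for a domain-specific perspective function.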

3.4 A brief note on expressiveness

The intuitive idea behind perspective functions is based on what agents can see, as determined by the current state. The relation between and corresponds roughly to accessibility relations in Kripke semantics. However, focusing only on what an agent exactly knows/sees means overlooking the variables that agents are uncertain about. Perspective functions return one partial world that the agent is certain about, rather than a set of worlds that considers possible. The advantage is that applying a perspective function yields only one state, rather than the multiple states of Kripke semantics, preventing explosions in model size.

However, the reduced complexity loses information about the “unsure” variables. Theoretically, in our model is a set intersection over all the such that . This eliminates disjunctive knowledge about variables; the only uncertainty is that an agent does not see a variable. For example, in the well-known Muddy Children problem, knowledge is not only generated by what each child can see of the others’ appearance, which is modelled straightforwardly using perspective functions, but can also be derived from the statements made by the father and the responses of the other children. From their perspective, the children know exactly which of the others are dirty, which can be handled by our model, as they are certain about it. But when the father has asked for the -th time and no one has responded, they can use induction to derive the knowledge that at least children are dirty. By considering that there are two possible worlds, in which the number of dirty children is or , Kripke structures keep both possible worlds until steps have passed. Our model cannot represent this.

Therefore, although our model can handle preconditions and goals with disjunction, such as , it cannot store such disjunction in its ‘knowledge base’. Rather, agent knows that the value is , or it knows that the value is .

Despite this, possible worlds can be represented using our model, using the same approach taken by Kominis and Geffner (2015) of representing Kripke structures in the planning language. We can then define perspective functions that implement Kripke semantics.

To summarise Section 3, we have defined a new model of epistemic logic, including group operators, in which states in the model are constrained to be just variable assignments. The complexity of common knowledge in this logic is bound by the size of the state and number of agents. In the following section, we show how to implement this in any F-STRIPS planner that supports external functions.

4 Implementation

To validate our model and test its capabilities, we encode it within a planner and solve some well-known epistemic planning benchmarks. Two key aspects of solving a planning problem are the planning language and the solver. As mentioned in Section 2.1, we use BFWS() Francès et al. (2017); the advantage of F-STRIPS with external functions is that it allows us to decompose the planning task from the epistemic logic reasoning. In this section, we encode our model into the F-STRIPS planning language and explain our use of external functions to support this.

4.1 F-STRIPS encoding

Any classical F-STRIPS Francès et al. (2017) problem can be represented by a tuple , where and are variables and domains, and , , are the operators, initial state, and goal states. stands for the external functions, which allow the planner to handle problems that cannot be defined as a classical planning task; for example, when the effects are too complex to be modelled by propositional fluents, or when the actions and effects have hidden corresponding relations. In our implementation, external functions are implemented in C++, allowing any C++ function to be executed during planning.

In our epistemic planning model defined in Section 3, is a set of agent identifiers, and and are exactly the same as in F-STRIPS. The operators , initial states and goal states differ from F-STRIPS only in that they contain epistemic formulae in preconditions, conditions, and goals.

4.1.1 Epistemic Formula in Planning Actions

There are two major ways to include epistemic formulae in planning: using formulae as preconditions and conditions (on conditional effects) in operators, or using them as epistemic goals.

Setting desirable epistemic formulae as goals is straightforward. For example, in Figure 1, if we want agent to know that sees , we could simply set the goal to be . However, there are other scenarios that cannot be modelled by epistemic goals, including temporal constraints such as “agent sees all the time” or “the target needs to secretly move to the other side without being seen by any agent”. Each of these temporal epistemic constraints can be handled with a Boolean variable, using the external functions to determine whether the constraint holds at each state of the plan. If the external function returns false, we ‘fail’ that plan.
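This mechanism can be sketched as follows; the types and names (`EpState`, `Constraint`, `plan_satisfies`) are illustrative stand-ins for the planner's internals, and the single Boolean field abstracts whatever epistemic check the external function performs:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Stand-in for a search state; a real state holds the domain variables.
struct EpState { bool target_visible; };

// A temporal epistemic constraint, evaluated by an external function.
using Constraint = std::function<bool(const EpState&)>;

// Returns true only if the constraint holds in every state along the plan;
// otherwise the plan is 'failed' at the first violation.
bool plan_satisfies(const std::vector<EpState>& trace, const Constraint& c) {
    for (const auto& s : trace)
        if (!c(s)) return false;
    return true;
}
```

During search, the planner would evaluate the constraint at each newly generated state rather than over a completed trace, but the pruning effect is the same.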

In addition, epistemic formulae can appear directly in the preconditions of operators. For example, in Figure 1, if the scenario requires continued surveillance of over the entire plan, then the operator turn() would have as one of its preconditions that still holds after turns degrees.

As for the encoding of , and : besides the classical F-STRIPS planning parts, all epistemic formulae are encoded as calls to external functions.

4.2 External Functions

External functions take variables as input, and return the evaluated result based on the current search state. They are the key mechanism that allows us to separate epistemic reasoning from search. Unlike compilation approaches to epistemic planning, which compile new formulae or normal forms that may never be required, our epistemic reasoning uses lazy evaluation, which we hypothesise can significantly reduce the time complexity for most epistemic planning problems.

4.2.1 Agent perspective functions

As briefly mentioned in Section 3.2.1, the perspective function, , is a function that takes a state and returns the local state from the perspective of agent . Compared to Kripke structures, the intuition is to only define which variables an agent sees. Individual and group knowledge all derive from this.

Once we have domain-specific perspective functions for each agent, or just one implementation for homogeneous agents, our framework implementation takes care of the remaining parts of the epistemic logic. Given a domain-specific perspective function, we have implemented a library of external functions that implement the semantics of , , , , and , using the underlying domain-specific perspective functions. The modeller simply needs to provide the perspective function for their domain, if a suitable one is not already present in our library.
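One plausible shape for this plug-in design, assuming the library fixes an abstract perspective-function interface and implements the epistemic operators against it (all class and function names here are illustrative, not the actual library API):

```cpp
#include <cassert>
#include <set>
#include <string>

using State = std::set<std::string>;

// Interface the library programs against; a modeller subclasses this.
class PerspectiveFunction {
public:
    virtual ~PerspectiveFunction() = default;
    // Return the subset of s visible to the given agent.
    virtual State apply(const std::string& agent, const State& s) const = 0;
};

// A generic operator written once by the library: "agent sees variable v".
bool sees(const PerspectiveFunction& f, const std::string& agent,
          const State& s, const std::string& v) {
    return f.apply(agent, s).count(v) > 0;
}

// Example domain-specific implementation: agents see everything except
// other agents' secrets (variables named "secret_<agent>").
class NoSecretsPerspective : public PerspectiveFunction {
public:
    State apply(const std::string& agent, const State& s) const override {
        State out;
        for (const auto& v : s)
            if (v.rfind("secret_", 0) != 0 || v == "secret_" + agent)
                out.insert(v);
        return out;
    }
};
```

Knowledge, group, and common operators can then be layered on top of `apply` as in Section 3, so moving to a new domain means swapping only the subclass.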

For example, in Figure 1, the global state covers the whole flat field. is the blue area, and is the yellow area, which means that in agent ’s perspective the world is the blue part, and for agent the world is only the yellow part. Furthermore, agent ’s perspective of what agent sees can be represented by the intersection of those two coloured areas, which is actually . The interpretation is that agent considers state to be the “global state” for it, and inside that state, agent ’s perspective is .

To be more specific about perspective functions, assume the global state contains all variables for { }, such as locations, the directions agents are facing, etc. Based on the current set up from Figure 1, we can implement for any agent with the following Euclidean geometric calculation:


The perspective function takes all agents’ locations, directions and vision angles, along with the target location, as input, and returns those agents and objects that fall inside these regions. On each variable for both agents, and . Our library can then reason about the knowledge operators.
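A minimal sketch of such a geometric visibility check, assuming each agent has a position, a facing direction, and a field-of-view angle (the names `Agent` and `in_view` and the radian convention are our assumptions):

```cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Position (x, y), facing direction dir (radians), field of view fov (radians).
struct Agent { double x, y, dir, fov; };

// True iff the object at (ox, oy) lies within the agent's vision cone:
// the angle to the object deviates from dir by at most fov / 2.
bool in_view(const Agent& a, double ox, double oy) {
    double angle = std::atan2(oy - a.y, ox - a.x);
    double diff = std::fabs(std::remainder(angle - a.dir, 2 * kPi));
    return diff <= a.fov / 2;
}
```

A full perspective function would run this check for every agent-object pair to build each agent's local state.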

While such an approach could be directly encoded using propositions in classical planning, we assert that the resulting encoding would be tedious and error-prone.

Their distributed knowledge would be derived from the union over their perspectives, which is the whole world except all variables for . Both agents would know every variable in the intersection of their perspectives, {}, which is also the same as what they commonly know.

However, if we alter the scenario slightly by turning , then would work similarly to the one-level perspective functions, while is empty. In the new scenario, would become {}, and becomes {}. When finding the converged perspective for and , stays the same, but the perspective for results in an empty world. The intersection over their perspectives does not contain , which means does not exist in at least one of the agents’ () perspectives from the group.

However, the above implementation of the perspective function is only useful for Euclidean planes. One of the novelties of this paper is that our implementation supports arbitrary perspective functions, which can be provided by the modeller as new external functions. Therefore, an agent’s perspective function can handle any problem for which a proper set of visibility rules can be defined. Basically, in our implementation, a perspective function can take any state variable from the domain model and convert it into the agent’s perspective state. Then, following the property that , the function applies the domain-specific visibility rules to all the variables to obtain the agent’s local state. From there, the semantics outlined in Section 3 are handled by our library.

Therefore, we can apply our approach to different epistemic problems simply by providing different implementations of . For example, the only difference between the external function’s implementation on the Corridor and Grapevine benchmarks from Section 5 is that in Corridor, the visibility rules are that an agent can see within the current room and the adjacent room, while in Grapevine, the agent can only see within the current room. Moreover, in order to implement epistemic planning problems for, e.g., three-dimensional Euclidean space, we only need to modify the perspective functions based on the geometric model.

From a practical perspective, this means that modellers provide: (1) a planning model that uses epistemic formula; and (2) domain-specific implementations for . Linking the epistemic formula in the planning model to the perspective functions is delegated to our library.

4.2.2 Domain dependent functions

Domain-dependent functions are customised relations for each set of problems, corresponding to in Section 3, and can be any domain-specific functions that are implementable as external functions.

Overall, the implementation of our model is not as direct as other classical F-STRIPS planning problems or epistemic planning problems. However, as we demonstrate in the next section, it is scalable and flexible enough to find valid solutions to many epistemic planning problems.

5 Experiments & Results

In this section, we evaluate our approach on four problems: Corridor Kominis and Geffner (2015), Grapevine Muise et al. (2015a), Big Brother Logic (BBL) Gasquet et al. (2014), and Social-media Network (SN). Corridor and Grapevine are well-known epistemic planning problems, which we use to compare the performance of our model against an existing state-of-the-art planner. BBL is a model of Big Brother Logic in a two-dimensional continuous domain, which we use to demonstrate the expressiveness of our model. In addition, to demonstrate our model’s capability of reasoning about group knowledge, we adapt the classical Gossip Problem to create our own version, called Social-media Network. Hereafter, we assume any knowledge formula is supplied with the correct value , which means it is equivalent to , unless the value is specified.

5.1 Benchmark problems

In this section, we briefly describe the Corridor and Grapevine problems, which are benchmark problems that we use to compare against Muise et al.’s (2015a) epistemic planner, which compiles to classical planning.


The Corridor problem was originally presented by Kominis and Geffner (2015). It is about selective communication among agents. The basic set-up is a corridor of rooms containing several agents. An agent is able to move between adjacent rooms, sense the secret in a room, and share the secret. The rule of communication is that when an agent shares the secret, all agents in the same room or adjacent rooms come to know it.

The goals in this domain are to have some agents know the secret while other agents do not. Thus, the main agent needs to get to the right room before communicating, to avoid the secret being overheard.


Grapevine, proposed by Muise et al. (2015a), is a problem similar to Corridor. With only two rooms available to the agents, the scenario makes sharing secrets while hiding them from others more difficult. The basic set-up is that each agent has its own secret, which it can share with everyone in the same room. Since there are only two rooms, a secret is only shared within a room. The basic actions for agents are moving between rooms and sharing their secrets.

To evaluate the computational performance of our model, we compare against Muise et al.’s (2015a) planner. They have several test cases for Corridor and Grapevine. In addition, to test how performance is influenced by the problem, we created new problems that vary some of the parameters, such as the number of agents, the number of goal conditions, and the depth of epistemic relations.

The planner converts epistemic planning problems into classical planning, which results in a significant number of propositions as the depth or number of agents increases. We tried to submit the converted classical planning problem to the same planner that our model used, the planner, to maintain a fair comparison. However, since the computational cost of the novelty check in the planner increases with the number of propositions, planning was prohibitively expensive. Therefore, for comparison, we use the default FF planner that is used by Muise et al.

We ran the problems on both planners using a Linux machine with 1 CPU and 10 gigabytes of memory. We measured the number of atoms (fluents) and the number of nodes generated during search, to compare the size of the same problem modelled by the different methods. We also measured the total time for both planners to solve the problem, and the time taken to reason about the epistemic relations, which corresponds to the time taken to call external functions in our solution (during planning), and the time taken to convert the epistemic planning problem into a classical planning problem in the solution (before planning).

Problem Parameters Our Model PDKB
Calls Total Compile Total
Table 1: Results for the Corridor and Grapevine Problems

We show the results of the problems in Table 1, in which specifies the number of agents, the maximum depth of a nested epistemic query, the number of goals, the number of atomic fluents, the number of generated nodes in the search, and the number of calls made to external functions.

From the results, it is clear that the complexity of the approach grows exponentially in both the number of agents and the depth of epistemic relations (we ran out of memory in the final Grapevine problem), while in our approach, those features do not have a large effect. However, epistemic reasoning in our approach (calls to the external solver) has a significant influence on the performance of our solution. Since the F-STRIPS planner we use checks each goal at each node in the search, the complexity is in . While this is exponential in the size of the original problem (because is exponential), the computational cost is significantly lower than the compilation in the approach.

5.2 Big Brother Logic

Big Brother Logic (BBL) is a problem first discussed by Gasquet et al. (2014). The basic environment is a two-dimensional space called “Flatland”, without any obstacles. There are several stationary and transparent cameras; that is, the cameras can only rotate, and do not have volume, so they do not block each other’s vision. In our scenario, we also allow cameras to move in Flatland.

5.2.1 Examples

Let and be two cameras in Flatland. Camera is located at , and camera at . Both cameras have an range. Camera is facing north-east, while camera is facing south-west. There are three objects with values , and , located at , and respectively. For simplicity, we assume only camera can move or turn freely, and camera , , and are fixed. Figure 2 visualises the problem set up.