Landmark-Based Approaches for Goal Recognition as Planning

04/26/2019 ∙ by Ramon Fraga Pereira, et al. ∙ University of Aberdeen ∙ PUCRS

The task of recognizing goals and plans from missing and full observations can be done efficiently by using automated planning techniques. In many applications, it is important to recognize goals and plans not only accurately, but also quickly. To address this challenge, we develop novel goal recognition approaches based on planning techniques that rely on planning landmarks. In automated planning, landmarks are properties (or actions) that cannot be avoided to achieve a goal. We show the applicability of a number of planning techniques with an emphasis on landmarks for goal and plan recognition tasks in two settings: (1) we use the concept of landmarks to develop goal recognition heuristics; and (2) we develop a landmark-based filtering method to refine existing planning-based goal and plan recognition approaches. These recognition approaches are empirically evaluated in experiments over several classical planning domains. We show that our goal recognition approaches yield not only accuracy comparable to (and often higher than) other state-of-the-art techniques, but also substantially faster recognition time over such techniques.

1 Introduction

As more computer systems require reasoning about what agents (both human and artificial) other than themselves are doing, the ability to accurately and efficiently recognize goals and plans from agent behavior becomes increasingly important. Goal and plan recognition is the task of recognizing goals and plans based on often incomplete observations that include actions executed by agents and properties of agent behavior in an environment [Sukthankar et al. 2014]. Most goal and plan recognition approaches [Geib and Goldman 2005, Avrahami-Zilberbrand and Kaminka 2005, Geib and Goldman 2009, Mirsky et al. 2016, Mirsky et al. 2017] employ plan libraries to represent agent behavior, i.e., a plan library with plans for achieving goals, resulting in approaches to recognize plans that are analogous to parsing. Recent work [Ramírez and Geffner 2009, Ramírez and Geffner 2010, Pattison and Long 2010, Keren et al. 2014, E-Martín et al. 2015, Sohrabi et al. 2016, Pereira and Meneguzzi 2016, Pereira et al. 2017] uses a planning domain definition (a domain theory) to represent potential agent behavior, bringing goal and plan recognition closer to planning algorithms. These approaches allow techniques used in planning algorithms to be employed for recognizing goals and plans while requiring less domain information.
Recognizing goals and plans is important in applications for monitoring and anticipating agent behavior in an environment, including crime detection and prevention [Geib and Goldman 2001], monitoring activities in elderly care [Geib 2002], recognizing plans in educational environments [Uzan et al. 2015] and exploratory domains [Mirsky et al. 2017a], and traffic monitoring [Pynadath and Wellman 2013], among others [Geib and Goldman 2001, Granada et al. 2017, Mirsky et al. 2017b].

We develop recognition approaches that are based on planning techniques (without pre-defined static plan libraries) that rely on planning landmarks [Hoffmann et al. 2004], namely, landmark-based approaches for goal recognition. In automated planning, landmarks are properties (or actions) that every plan must satisfy (or execute) at some point to achieve a goal. Whereas in planning algorithms landmarks are used to focus search, in this work, landmarks allow our recognition approaches to reason about what cannot be avoided for achieving goals, substantially speeding up recognition time. Thus, we provide novel contributions to efficiently solve goal recognition problems, as follows. First, we provide two contributions for goal recognition techniques. We develop two novel recognition heuristics that rely on landmarks and obviate the need to execute a planner multiple times, yielding substantial runtime gains. Our initial heuristic estimates goal completion by considering the ratio between achieved and extracted landmarks of a candidate goal. We expand this heuristic to use a landmark uniqueness value, representing the information value of the landmark for some specific candidate goal when compared to landmarks for all candidate goals. Second, we also develop a filtering method that rules out candidate goals by estimating how many landmarks required by every goal in the set of candidate goals have been reached within a sequence of observations. This filtering method can be applied to other planning-based goal and plan recognition approaches, such as the approaches from Ramírez and Geffner (2009, 2010) (with a probabilistic ranking), as well as from Sohrabi et al. (2016).

Our use of landmarks to drive goal recognition stems from properties of landmarks in classical planning. First, they are necessary conditions to achieving goals, and thus provide very strong evidence that certain observations are tied to specific goals. Second, although their computation is, in theory, expensive, in practice, we can efficiently compute very informative sets of ordered landmarks, and critically, only once per goal recognition problem, resulting in a very efficient overall algorithm.

We prove key properties of our recognition heuristics and their use as a filtering mechanism, and empirically evaluate our approaches using a set of well-known domains from the International Planning Competition (IPC), as well as a number of domains we developed specifically to measure the scalability of goal and plan recognition algorithms. For all domains, we evaluate the algorithms using datasets with varying degrees of observability (missing observations) and noise (spurious observations). We compare our heuristics for goal recognition against the current state-of-the-art [Ramírez and Geffner 2009, Ramírez and Geffner 2010, E-Martín et al. 2015, Sohrabi et al. 2016] by using a dataset developed by Ramírez and Geffner (2009, 2010), and a new dataset we generated for other planning domains with larger and more complex problems, as well as problems with missing and noisy observations. Experiments show that our recognition heuristics are substantially faster and more accurate than the state-of-the-art for datasets that contain several domains and problems where recognizing the intended goal is non-trivial.

The remainder of this article is organized as follows. Section 2 provides background on planning, domain-independent heuristics, and landmarks. We proceed to describe how we extract useful information from planning domain definitions in Section 3, which we use throughout the article. In Section 4, we develop our goal recognition approaches using landmarks. We empirically evaluate our approaches in Section 5, which shows the results of the experiments for our goal recognition approaches against the state-of-the-art. In Section 6, we survey related work and compare the state-of-the-art with our approaches. Finally, in Section 7, we conclude this article by discussing limitations, advantages and future directions of our approaches.

2 Background

In this section, we review essential background on planning terminology and landmarks. Finally, we define the task of goal recognition over planning domain definitions.

2.1 Planning

Planning is the problem of finding a sequence of actions (i.e., a plan) that achieves a particular goal from an initial state. In this work, we use the terminology from Ghallab et al. (2016) to represent planning domains and problems. First, we define a state in the environment in Definition 1.

Definition 1 (Predicates and State).

A predicate is denoted by an n-ary predicate symbol $p$ applied to a sequence of zero or more terms ($\tau_1$, $\tau_2$, …, $\tau_n$), where terms are either constants or variables. We refer to grounded predicates that represent logical values according to some interpretation as facts, which are divided into two types: positive and negated facts, as well as constants for truth ($\top$) and falsehood ($\bot$). A state $S$ is a finite set of positive facts $f$ that follows the closed world assumption so that if $f \in S$, then $f$ is true in $S$. We assume a simple inference relation $\models$ such that $S \models f$ iff $f \in S$, $S \not\models f$ iff $f \notin S$, and $S \models f_1 \land \ldots \land f_n$ iff $\{f_1, \ldots, f_n\} \subseteq S$.

Planning domains describe the environment dynamics through operators, which use a limited first-order logic representation to define schemata for state-modification actions according to Definition 2.

Definition 2 (Operator and Action).

An operator $a$ is represented by a triple $\langle$name($a$), pre($a$), eff($a$)$\rangle$: name($a$) represents the description or signature of $a$; pre($a$) describes the preconditions of $a$, a set of predicates that must exist in the current state for $a$ to be executed; eff($a$) represents the effects of $a$. These effects are divided into eff($a$)$^+$ (i.e., an add-list of positive predicates) and eff($a$)$^-$ (i.e., a delete-list of negated predicates). An action is a ground operator instantiated over its free variables.

We say an action $a$ is applicable to a state $S$ if and only if $S \models$ pre($a$), and generates a new state $S'$ such that $S' := (S \cup$ eff($a$)$^+) \setminus$ eff($a$)$^-$.
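The applicability test and state update can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the `Action` class, the helper names, and the Blocks-World facts are assumptions made for the example. The update adds the add-list first and then removes the delete-list, which only matters when a fact appears in both lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset     # preconditions: facts that must hold in the current state
    add: frozenset     # add-list (positive effects)
    delete: frozenset  # delete-list (negative effects)


def applicable(state: frozenset, action: Action) -> bool:
    """An action is applicable when every precondition is in the state."""
    return action.pre <= state


def apply_action(state: frozenset, action: Action) -> frozenset:
    """Return the successor state: insert the add-list, then drop the delete-list."""
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable in this state")
    return (state | action.add) - action.delete


# Hypothetical ground action in a Blocks-World-like encoding.
pickup_a = Action(
    name="(pickup A)",
    pre=frozenset({"(clear A)", "(ontable A)", "(handempty)"}),
    add=frozenset({"(holding A)"}),
    delete=frozenset({"(clear A)", "(ontable A)", "(handempty)"}),
)

state = frozenset(
    {"(clear A)", "(ontable A)", "(handempty)", "(clear B)", "(ontable B)"}
)
next_state = apply_action(state, pickup_a)
print(sorted(next_state))
```

Representing states as frozensets of positive facts mirrors the closed world assumption: any fact not in the set is treated as false.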

Definition 3 (Planning Domain).

A planning domain definition $\Xi$ is represented by a pair $\langle \Sigma, \mathcal{A} \rangle$, which specifies the knowledge of the domain, and consists of a finite set of facts $\Sigma$ (e.g., environment properties) and a finite set of actions $\mathcal{A}$.

A planning instance comprises both a planning domain and the elements of a planning problem, describing a finite set of objects of the environment, the initial state, and the goal state which an agent wishes to achieve, as formalized in Definition 4.

Definition 4 (Planning Instance).

A planning instance $\Pi$ is represented by a triple $\langle \Xi, \mathcal{I}, G \rangle$.

  • $\Xi = \langle \Sigma, \mathcal{A} \rangle$ is the domain definition;

  • $\mathcal{I} \subseteq \Sigma$ is the initial state specification, which is defined by specifying the value for all facts in the initial state; and

  • $G \subseteq \Sigma$ is the goal state specification, which represents a desired state to be achieved.

Classical planning representations often separate the definition of $\mathcal{I}$ and $G$ as part of a planning problem to be used together with a domain $\Xi$, such as STRIPS [Fikes and Nilsson 1971] and PDDL [McDermott et al. 1998]. In this work, we use the STRIPS fragment of PDDL to formalize planning domains and problems. Finally, a plan is the solution of a planning instance, as formalized in Definition 5.

Definition 5 (Plan).

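A plan $\pi$ for a planning instance $\Pi = \langle \Xi, \mathcal{I}, G \rangle$ is a sequence of actions $\langle a_1, a_2, \ldots, a_n \rangle$ that modifies the initial state $\mathcal{I}$ into a state $S \models G$ in which the goal state $G$ holds by the successive execution of actions in a plan $\pi$. A plan $\pi^*$ with length $|\pi^*|$ is optimal if there exists no other plan $\pi'$ for $\Pi$ such that $|\pi'| < |\pi^*|$.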

While actions have an associated cost, we take the assumption from classical planning that this cost is 1 for all instantiated actions. A plan is considered optimal if its cost, and thus length, is minimal.

Finally, modern classical planners use a variety of heuristics to efficiently explore the search space of planning domains by estimating the cost to achieve a specific goal [Ghallab et al. 2016]. In classical planning, this estimate is often the number of actions to achieve the goal state from a particular state, so we describe all techniques assuming a uniform action cost $c(a) = 1$ for all $a \in \mathcal{A}$. Thus, the cost for a plan $\pi = \langle a_1, a_2, \ldots, a_n \rangle$ is $c(\pi) = \sum_{i=1}^{n} c(a_i)$. Heuristics make no guarantees about the accuracy of their estimates. However, when a heuristic never overestimates the cost to achieve a goal, it is called admissible and guarantees optimal plans for certain search algorithms. A heuristic $h(s)$ is admissible if $h(s) \leq h^*(s)$ for all states, where $h^*(s)$ is the optimal cost to the goal from state $s$. Heuristics that overestimate the cost to achieve a goal are called inadmissible.

2.2 Landmarks

In the planning literature [Richter et al. 2008], landmarks are defined as necessary properties (alternatively, actions) that must be true (alternatively, executed) at some point in every valid plan (see Definition 5) to achieve a particular goal, being often partially ordered following the sequence in which they must be achieved. Hoffmann et al. (2004) define fact landmarks, and Vidal and Geffner (2005) define action landmarks, as follows:

Definition 6 (Fact Landmark).

Given a planning instance $\Pi = \langle \Xi, \mathcal{I}, G \rangle$, a formula $L$ is a landmark in $\Pi$ iff $L$ is true at some point along all valid plans that achieve $G$ from $\mathcal{I}$. In other words, a landmark is a type of formula (e.g., conjunctive formula or disjunctive formula) over a set of facts that must be satisfied (or achieved) at some point along all valid plan executions.

Definition 7 (Action Landmark).

Given a planning instance $\Pi = \langle \Xi, \mathcal{I}, G \rangle$, an action $A$ is a landmark in $\Pi$ iff $A$ is a necessary action that must be executed at some point along all valid plans that achieve $G$ from $\mathcal{I}$.

From the concept of fact landmarks, Hoffmann et al. (2004) introduce two types of landmarks as formulas: conjunctive and disjunctive landmarks. A conjunctive landmark is a set of facts that must be true together at some point in every valid plan to achieve a goal. A disjunctive landmark is a set of facts such that at least one of the facts must be true at some point in every valid plan to achieve a goal. Figure 1 shows an example that illustrates a set of landmarks for a Blocks-World problem instance (Blocks-World is a classical planning domain where a set of stackable blocks must be re-assembled on a table [Ghallab et al. 2004]). This example shows a set of conjunctive ordered landmarks (connected boxes) that must be true to achieve the goal state (on A B). For instance, to achieve the fact landmark (on A B), which is also the goal state, the conjunctive landmark (and (holding A) (clear B)) must be true immediately before, and so on, as shown in Figure 1.

Figure 1: Ordered landmarks for a Blocks-World problem instance.

Whereas in planning the concept of landmarks is used to build heuristics [Richter et al. 2008] and planning algorithms [Richter and Westphal 2010], in this work, we propose a novel use for landmarks: to monitor an agent’s plan execution. Intuitively, we use landmarks as waypoints (or stepping stones) in order to monitor what an observed agent cannot avoid to achieve its goals.

2.3 Goal Recognition

Goal recognition is the task of recognizing agents’ goals by observing their interactions in an environment [Sukthankar et al. 2014]. In goal recognition, such observed interactions (i.e., observations) comprise the evidence available to recognize goals. Definition 8 follows the formalism proposed by Ramírez and Geffner (2009, 2010), characterizing an observation sequence as the result of an action sequence.

Definition 8 (Observation Sequence).

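An observation sequence $O = \langle o_1, o_2, \ldots, o_m \rangle$ is said to be satisfied by a plan $\pi = \langle a_1, a_2, \ldots, a_n \rangle$, if there is a monotonic function $f$ that maps the observation indices $j = 1, \ldots, m$ into action indices $i = 1, \ldots, n$, such that $a_{f(j)} = o_j$.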
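Checking whether a plan satisfies an observation sequence amounts to an ordered-subsequence test. The sketch below is illustrative only: the function name `satisfies` is hypothetical, and it assumes the monotonic index mapping is strictly increasing, so that each observation is matched to a distinct plan step.

```python
def satisfies(plan, observations):
    """Return True if the observations can be embedded into the plan in order.

    Greedily matches each observation to the earliest unused plan step,
    which builds a strictly increasing map from observation indices
    to action indices whenever one exists.
    """
    position = 0
    for observed in observations:
        # Advance to the next plan step equal to the observed action.
        while position < len(plan) and plan[position] != observed:
            position += 1
        if position == len(plan):
            return False  # ran out of plan steps before matching
        position += 1     # this plan step is now consumed
    return True


# Plan for the hidden goal RED from the Blocks-World example.
plan = [
    "(unstack D B)", "(putdown D)", "(unstack E A)",
    "(stack E D)", "(pickup R)", "(stack R E)",
]

in_order = ["(unstack E A)", "(stack E D)"]
out_of_order = ["(stack E D)", "(unstack E A)"]

print(satisfies(plan, in_order))
print(satisfies(plan, out_of_order))
```

The greedy scan is sufficient here because matching an observation to the earliest possible plan step never rules out a later match.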

By combining the notions of a planning problem and an observation sequence, we formally define a goal recognition problem over a planning domain definition following Ramírez and Geffner (2009) in Definition 9, and a weak notion of solution to that problem in Definition 10. (Unlike the probabilistic approach developed by Ramírez and Geffner (2010), our heuristic approaches do not use any prior probabilities to perform the goal recognition process.)

Definition 9 (Goal Recognition Problem).

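A goal recognition problem is a tuple $T_{GR} = \langle \Xi, \mathcal{I}, \mathcal{G}, O \rangle$, where:

  • $\Xi = \langle \Sigma, \mathcal{A} \rangle$ is a planning domain definition;

  • $\mathcal{I}$ is the initial state;

  • $\mathcal{G}$ is the set of possible goals, which include a correct hidden goal $G^*$ (i.e., $G^* \in \mathcal{G}$); and

  • $O = \langle o_1, o_2, \ldots, o_n \rangle$ is an observation sequence of executed actions, with each observation $o_i \in \mathcal{A}$, and the corresponding action being part of a valid plan $\pi$ (from Definition 5) that transitions $\mathcal{I}$ into $G^*$ through the sequential execution of actions in $\pi$.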

Definition 10 (Solution to a Goal Recognition Problem).

A solution to a goal recognition problem $T_{GR}$ is a nonempty subset of the set of possible goals $\mathbf{G} \subseteq \mathcal{G}$ such that $\forall G \in \mathbf{G}$ there exists a plan $\pi_G$ generated from a planning instance $\langle \Xi, \mathcal{I}, G \rangle$ and $\pi_G$ is consistent with $O$.

Thus, the ideal solution for a goal recognition problem is a set containing only the correct hidden goal that the observation sequence of a plan execution achieves. As an example of how the goal recognition process works, consider Example 1.

Example 1.

To exemplify the goal recognition process, let us consider the Blocks-World example in Figure 2. The initial state represents an initial configuration of stackable blocks, while the set of candidate goals is composed of the following stacked “words”: RED, BED, and SAD. Consider an observation sequence for a hidden goal RED consisting of the following action sequence: (unstack D B), (putdown D), (unstack E A), (stack E D), (pickup R), (stack R E). By following the full plan, we can easily infer that the hidden goal is indeed RED. However, if we cannot observe action (stack R E), it is not trivial to infer that RED is indeed the goal the observation sequence aims to achieve.

Figure 2: Blocks-World example.

Like [Sohrabi et al. 2016], we also deal with missing observations during the goal recognition process. We differ from [Sohrabi et al. 2016] in that we do not deal with noisy (unreliable) observations explicitly. Nevertheless, our technique proves to be robust against noise by relying exclusively on necessary conditions for the plans leading to each goal, as our empirical analysis corroborates. In a partial observation sequence, we observe only a sub-sequence of actions of a plan that achieves a particular goal because some actions are missing or obfuscated. A noisy observation sequence contains one or more actions (or a set of facts) that might not be part of a plan that achieves a particular goal, e.g., when a sensor fails and generates abnormal or spurious readings. We formalize the way in which an environment generates observations of agent plans in Definition 11.

Definition 11 (Observation Sequence Generation).

Let $\pi$ be a plan generated by a planning instance $\Pi$. An action projection function is a function that maps actions to sequences of zero or more actions. An observation sequence generation function is a function that maps a plan into an observation sequence by applying the action projection function to each action of the plan, in order, and concatenating the resulting sequences.

The key to generating such sequences is the set of rules that the action projection function uses to translate actions. Following Example 1, we could define an action projection function that never generates observations for unstack actions, and generates noise for all stack actions.
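A small Python sketch of such a projection for the plan in Example 1 is given below. It is an illustration under stated assumptions, not the paper's definition: the function names are hypothetical, the generation function is assumed to concatenate the per-action projections in plan order, and "noise" is modeled as appending one made-up spurious action, `(pickup S)`, after each observed stack action.

```python
SPURIOUS = "(pickup S)"  # hypothetical noisy reading, not part of the plan


def project(action):
    """Map one plan action to a sequence of zero or more observed actions."""
    if action.startswith("(unstack"):
        return []                    # never observed: a missing observation
    if action.startswith("(stack"):
        return [action, SPURIOUS]    # observed, followed by a noisy reading
    return [action]                  # observed faithfully


def generate_observations(plan, projection=project):
    """Concatenate the projections of all plan actions, in plan order."""
    observations = []
    for action in plan:
        observations.extend(projection(action))
    return observations


plan = [
    "(unstack D B)", "(putdown D)", "(unstack E A)",
    "(stack E D)", "(pickup R)", "(stack R E)",
]
observations = generate_observations(plan)
print(observations)
```

With this projection the resulting sequence is both partial (the two unstack actions are dropped) and noisy (two spurious actions are inserted).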

We formally define missing and noisy observations in Definitions 12 and 13 and both types of observation are exemplified in Example 2.

Definition 12 (Missing Observation).

Let $\Pi = \langle \Xi, \mathcal{I}, G \rangle$ be a planning instance, $\pi$ be a valid plan that achieves $G$ from $\mathcal{I}$, and $O$ be an observation sequence induced by an observation generation function with an action projection function. The observation sequence $O$ misses observations (is a partial or incomplete observation sequence) with respect to the plan $\pi$ that achieves the goal $G$ from $\mathcal{I}$ if the action projection function maps any action of $\pi$ into the empty sequence $\langle \rangle$.

Definition 13 (Noisy Observation).

Let $\Pi = \langle \Xi, \mathcal{I}, G \rangle$ be a planning instance, $\pi$ be a valid plan that achieves $G$ from $\mathcal{I}$, and $O$ be an observation sequence induced by an observation generation function with an action projection function. The observation sequence $O$ contains noisy observations with respect to the plan $\pi$ that achieves the goal $G$ from $\mathcal{I}$ if the action projection function maps any action of $\pi$ into a non-empty sequence containing one or more actions that are not part of $\pi$.

Example 2.

Let us consider that a valid plan to achieve a goal is . Consider the following observation sequences , , and :

  • ;

  • ; and

Observation sequences and satisfy Definition 12, and therefore, they are partial observation sequences and contain missing observed actions. is not a partial observation sequence because it does not satisfy Definition 12 as the observation sequence is not a strict subset of ordered actions of the plan .

Now, consider the following observation sequences and :

  • ; and

Both observation sequences contain noisy observations (g and h, respectively) and satisfy Definition 13. Note, however, that one of them not only contains noisy observations but also misses observations, i.e., it is a partial observation sequence with noisy observations.

Although we define missing and noisy observations with actions as observations, our goal recognition approaches can also deal with facts (or fluents) as observations, like [Sohrabi et al. 2016]. Indeed, as we see in Section 4, using states as observations makes goal recognition much easier for our heuristic approaches, since we can use the observations directly to compute achieved landmarks. In Section 4, we show that what matters for our goal recognition approaches is the evidence of fact landmarks during the observations, and it is irrelevant whether this evidence is provided by an observed action or by a set of facts.

3 Extracting Recognition Information from Planning Definition

In this section, we describe the process through which we extract useful information for goal recognition from a planning domain definition. First, we describe landmark extraction algorithms from the literature, and how we use these algorithms in our approaches, in Section 3.1. Second, we show how we classify facts into partitions from planning action descriptions and how we use them during the goal recognition process in Section 3.2.

3.1 Extracting Landmarks

Hoffmann et al. (2004) prove that the process of generating exactly all landmarks and deciding about their ordering is PSPACE-complete, which is exactly the same complexity as deciding plan existence [Bylander 1994]. For efficiency, most landmark extraction algorithms therefore extract only a subset of the landmarks of a given planning instance. While there are several algorithms to extract landmarks and their orderings in the literature that we could use [Zhu and Givan 2003, Richter et al. 2008, Keyder et al. 2010], we chose the landmark extraction algorithm from Hoffmann et al. (2004) to extract landmarks from planning instances due to its speed and simplicity. This algorithm can extract both conjunctive and disjunctive landmarks, but we use the conjunctive landmarks to build heuristics for our goal recognition approaches.

To represent landmarks and their ordering, the algorithm of Hoffmann et al. (2004) uses a tree in which nodes represent landmarks and edges represent necessary prerequisites between landmarks. Each node in the tree represents a conjunction of facts that must be true simultaneously at some point during plan execution, and the root node is a landmark representing the goal state. This algorithm uses a Relaxed Planning Graph (RPG) [Hoffmann and Nebel 2001], which is a leveled graph that ignores the delete-list effects of all actions, thus containing no mutex relations. Once the RPG is built, the algorithm extracts landmark candidates by back-chaining from the RPG level in which all facts of the goal state $G$ are possible, and, for each fact in $G$, checks which facts must be true until the first level of the RPG. For example, if fact $B$ is a landmark and all actions that achieve $B$ share $A$ as precondition, then $A$ is a landmark candidate. To confirm that a landmark candidate is indeed a landmark, the algorithm builds a new RPG structure by removing actions that achieve this landmark candidate and checks the solvability of this modified problem (deciding the solvability of a relaxed planning problem using an RPG structure can be done in polynomial time [Blum and Furst 1997]). If the modified problem is unsolvable, then the landmark candidate is a necessary landmark. This means that the actions that achieve the landmark candidate are necessary to solve the original planning problem.
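The verification step (remove the achievers of a candidate, then test relaxed solvability) can be sketched as follows. This is a simplified illustration, not the algorithm of Hoffmann et al.: it replaces the leveled RPG with a plain fixpoint over delete-relaxed actions, treats facts already in the initial state as trivially achieved, and uses a made-up Logistics-like instance. All function names are hypothetical.

```python
def relaxed_reachable(init, goal, actions):
    """Delete-relaxed reachability: apply add effects until a fixpoint.

    Each action is a (preconditions, add_effects) pair of frozensets;
    delete effects are ignored, as in a relaxed planning graph.
    """
    reached = set(init)
    changed = True
    while changed:
        changed = False
        for pre, add in actions:
            if pre <= reached and not add <= reached:
                reached |= add
                changed = True
    return goal <= reached


def is_fact_landmark(fact, init, goal, actions):
    """Confirm a candidate by removing its achievers and re-checking solvability."""
    if fact in init:
        return True  # assumption: initially true facts count as trivially achieved
    without_achievers = [(pre, add) for pre, add in actions if fact not in add]
    return not relaxed_reachable(init, goal, without_achievers)


# Hypothetical instance: move a box from A to B; the truck may detour via C.
init = frozenset({"(at TRUCK A)", "(at BOX A)"})
goal = frozenset({"(at BOX B)"})
actions = [
    (frozenset({"(at TRUCK A)", "(at BOX A)"}), frozenset({"(in BOX TRUCK)"})),
    (frozenset({"(at TRUCK A)"}), frozenset({"(at TRUCK B)"})),
    (frozenset({"(at TRUCK A)"}), frozenset({"(at TRUCK C)"})),
    (frozenset({"(at TRUCK C)"}), frozenset({"(at TRUCK B)"})),
    (frozenset({"(in BOX TRUCK)", "(at TRUCK B)"}), frozenset({"(at BOX B)"})),
]

for candidate in ["(in BOX TRUCK)", "(at TRUCK B)", "(at TRUCK C)"]:
    print(candidate, is_fact_landmark(candidate, init, goal, actions))
```

Because the relaxed problem is easier than the original one, unsolvability of the relaxed problem without the achievers implies unsolvability of the original problem without them, which is why the test is a sound (though incomplete) landmark check.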

Figure 3: Logistics problem example.

To exemplify the output of the landmark extraction algorithm from [Hoffmann et al. 2004], consider the Logistics problem example in Figure 3 (the Logistics domain consists of airplanes and trucks transporting packages between locations, e.g., airports and cities). Fact landmarks extracted for this example are shown in Listing 1 and Figure 4. While Listing 1 shows one possible serialization of the landmarks, Figure 4 represents the same landmarks ordered from the bottom up by facts that must be true together. These landmarks allow us to monitor way-points during a plan execution to determine which goals this plan is going to achieve.

Figure 4: Ordered fact landmarks extracted from Logistics problem example shown in Figure 3. Fact landmarks that must be true together are represented by connected boxes, which are conjunctive facts, i.e., representing conjunctive landmarks. Disjunctive landmarks are represented by octagon boxes that are connected by dashed lines.
Fact Landmarks:
(and (at BOX AIRPORT-E))
(and (at PLANE AIRPORT-E) (in BOX PLANE))
(and (at PLANE AIRPORT-C) (at BOX AIRPORT-C))
(and (at PLANE AIRPORT-E))
(and (at TRUCK D))
(and (in BOX TRUCK) (at TRUCK AIRPORT-C))
(and (at BOX B) (at TRUCK B))
(or (at TRUCK A) (at TRUCK C) (at TRUCK D))
Listing 1: Fact landmarks (conjunctive and disjunctive) extracted from the Logistics example in Figure 3.

This landmark extraction algorithm is referred to as function ExtractLandmarks, which takes as input a planning domain definition $\Xi$, an initial state $\mathcal{I}$, and a set of candidate goals $\mathcal{G}$ or a single goal $G$. In case the input is a set of candidate goals $\mathcal{G}$, this function outputs a map that associates candidate goals to their respective ordered fact landmarks (i.e., a set of landmarks with an order relation). Alternatively, in case the input is a single goal $G$, this function outputs a map that associates the goal to its respective ordered fact landmarks.

We note that many landmark extraction techniques, including that of Hoffmann et al. (2004), might infer incorrect landmark orderings, which can lead to problems if the goal recognition process relies on the ordering information to make inferences. Nevertheless, our empirical evaluation shows that landmark orderings do not affect detection performance in the experimental datasets. We discuss landmark orderings later in the paper (Section 4.1).

3.2 Classifying Facts into Partitions

Pattison and Long (2010) classify facts into mutually exclusive partitions in order to infer whether certain observations are likely to be goals for goal recognition. Their classification relies on the fact that, in some planning domains, predicates may provide additional information that can be extracted by analyzing preconditions and effects in operator definitions. We use this classification to infer if certain observations are consistent with a particular goal, and if not, we can eliminate a candidate goal. We formally define fact partitions in what follows.

Definition 14 (Strictly Activating).

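A fact $f$ is strictly activating if $f \in \mathcal{I}$ and $\forall a \in \mathcal{A}$, $f \notin$ eff($a$)$^+$ $\cup$ eff($a$)$^-$. Furthermore, $\exists a \in \mathcal{A}$, such that $f \in$ pre($a$).

Definition 15 (Unstable Activating).

A fact $f$ is unstable activating if $f \in \mathcal{I}$ and $\forall a \in \mathcal{A}$, $f \notin$ eff($a$)$^+$ and $\exists a, b \in \mathcal{A}$, $f \in$ pre($a$) and $f \in$ eff($b$)$^-$.

Definition 16 (Strictly Terminal).

A fact $f$ is strictly terminal if $\exists a \in \mathcal{A}$, such that $f \in$ eff($a$)$^+$ and $\forall a \in \mathcal{A}$, $f \notin$ pre($a$) and $f \notin$ eff($a$)$^-$.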

A Strictly Activating fact (Definition 14) appears as a precondition, and does not appear as an add or delete effect in an operator definition. This means that unless defined in the initial state, this fact can never be added or deleted by an operator. An Unstable Activating fact (Definition 15) appears as both a precondition and a delete effect in two operator definitions, so once deleted, this fact cannot be re-achieved. The deletion of an unstable activating fact may prevent a plan execution from achieving a goal. A Strictly Terminal fact (Definition 16) does not appear as a precondition of any operator definition, and once added, cannot be deleted. For some planning domains, this kind of fact is the most likely to be in the set of goal facts, because once added in the current state, it cannot be deleted, and remains true until the final state.
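The three partitions can be computed with simple membership tests over the ground actions. The sketch below is a hedged illustration: `classify_fact`, the `Action` tuple, and the grid-like toy domain are hypothetical names introduced for the example, and the precondition and delete effect of an unstable activating fact are allowed to come from the same action.

```python
from collections import namedtuple

# pre, add, delete are sets of ground facts.
Action = namedtuple("Action", ["pre", "add", "delete"])


def classify_fact(fact, init, actions):
    """Return the partition of a fact, or None if it falls in no partition."""
    in_pre = any(fact in a.pre for a in actions)
    in_add = any(fact in a.add for a in actions)
    in_del = any(fact in a.delete for a in actions)

    if fact in init and in_pre and not in_add:
        # Needed by some action and never (re-)achievable by any action.
        return "unstable activating" if in_del else "strictly activating"
    if in_add and not in_pre and not in_del:
        # Can be achieved, is never required, and is never deleted.
        return "strictly terminal"
    return None


# Hypothetical grid-like toy domain with three ground actions.
init = {"(conn P1 P2)", "(locked P2)", "(at-robot P1)"}
actions = [
    Action(  # unlock P2 from P1
        pre={"(at-robot P1)", "(conn P1 P2)", "(locked P2)"},
        add={"(open P2)"},
        delete={"(locked P2)"},
    ),
    Action(  # move from P1 to P2
        pre={"(at-robot P1)", "(conn P1 P2)", "(open P2)"},
        add={"(at-robot P2)"},
        delete={"(at-robot P1)"},
    ),
    Action(  # report arrival at P2
        pre={"(at-robot P2)"},
        add={"(reported P2)"},
        delete=set(),
    ),
]

for fact in ["(conn P1 P2)", "(locked P2)", "(reported P2)", "(open P2)"]:
    print(fact, classify_fact(fact, init, actions))
```

In this toy domain the connectivity fact is strictly activating, the lock is unstable activating, and the report fact is strictly terminal, while `(open P2)` is both achieved and required and therefore belongs to no partition.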

The fact partitions that we can extract depend on the planning domain definition. For example, from the Blocks-World domain, it is not possible to extract any fact partitions. However, it is possible to extract fact partitions, such as Strictly Activating and Unstable Activating facts, from the IPC-Grid domain (the IPC-Grid domain consists of an agent that moves in a grid using keys to open locked locations). In this work, we use fact partitions to obtain additional information on fact landmarks during the goal recognition process. For example, consider an Unstable Activating fact landmark $L$, so that if $L$ is deleted from the current state, then it cannot be re-achieved. We can trivially determine that goals for which this fact is a landmark are unreachable, because there is no available action that achieves $L$ again.

4 Landmark-Based Goal Recognition

We now describe our goal recognition approaches that rely on planning landmarks. First, we start with a method to monitor and compute the evidence of landmarks from observations in Section 4.1. Second, we develop a landmark-based filtering method that can be used with any other planning-based goal and plan recognition approach in Section 4.2. Finally, we describe how we build goal recognition heuristics using landmarks in Sections 4.3 and 4.4.

4.1 Computing Achieved Landmarks in Observations

An essential part of our approaches to goal recognition is the ability to monitor and compute the evidence of achieved fact landmarks in the observations. To do so, we compute the evidence of achieved fact landmarks in preconditions and effects of observed actions during a plan execution [Pereira et al. 2017] using the ComputeAchievedLandmarks function shown in Algorithm 1. This algorithm takes as input an initial state $\mathcal{I}$, a set of candidate goals $\mathcal{G}$, a sequence of observed actions $O$, and a map $\mathcal{L}_{\mathcal{G}}$ containing candidate goals and their extracted fact landmarks (provided by the ExtractLandmarks function). Note that Algorithm 1 can be easily modified to allow it to deal with observations as states, so instead of analyzing preconditions and effects of actions, we compare the observations directly to computed landmarks.

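Input: $\mathcal{I}$ initial state, $\mathcal{G}$ set of candidate goals, $O$ observations, and $\mathcal{L}_{\mathcal{G}}$ goals and their extracted landmarks.
Output: A map of goals to their achieved landmarks.

1:function ComputeAchievedLandmarks($\mathcal{I}$, $\mathcal{G}$, $O$, $\mathcal{L}_{\mathcal{G}}$)
2:     $\Lambda_{\mathcal{G}} \leftarrow \langle \rangle$ ▷ Map goals to their respective achieved landmarks.
3:     for each goal $G$ in $\mathcal{G}$ do
4:         $\mathcal{L}_G \leftarrow$ fact landmarks of $G$ s.t. $\langle G, \mathcal{L}_G \rangle$ in $\mathcal{L}_{\mathcal{G}}$
5:         $\mathcal{L}_{\mathcal{I}} \leftarrow$ all fact landmarks $L \in \mathcal{I}$
6:         $\mathcal{AL}_G \leftarrow \emptyset$
7:         for each observed action $o$ in $O$ do
8:             $\mathcal{L} \leftarrow$ all fact landmarks $L$ in $\mathcal{L}_G$ such that $L \in$ pre($o$) $\cup$ eff($o$)$^+$ and $L \notin \mathcal{AL}_G$
9:             $\mathcal{L}_{\prec} \leftarrow$ predecessors $L'$ of all $L$ in $\mathcal{L}$, such that $L' \notin \mathcal{AL}_G$
10:            $\mathcal{AL}_G \leftarrow \mathcal{AL}_G \cup \mathcal{L}_{\mathcal{I}} \cup \mathcal{L} \cup \mathcal{L}_{\prec}$
11:         end for
12:         $\Lambda_{\mathcal{G}}(G) \leftarrow \mathcal{AL}_G$ ▷ Achieved landmarks of $G$.
13:     end for
14:     return $\Lambda_{\mathcal{G}}$
15:end function
Algorithm 1 Compute Achieved Landmarks in Observations.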

Algorithm 1 iterates over the set of candidate goals (Line 3), selecting the fact landmarks of each goal in Line 4 and computing the fact landmarks that are in the initial state in Line 5. With this information, the algorithm iterates over the observed actions to compute the achieved fact landmarks of the current goal in Lines 7 to 10. For each observed action, the algorithm computes all fact landmarks of the goal that are in either the preconditions or the effects of that action in Line 8. Because we deal with partial observations of a plan execution, some executed actions may be missing from the observation sequence; thus, whenever we identify a fact landmark, we also infer that its predecessors must have been achieved in Line 9. For example, consider that the set of fact landmarks to achieve a goal from a state is represented by the following ordered facts: (at A), (at B), (at C), and (at D), and that we observe just one action during a plan execution, which contains the fact landmark (at C) as an effect. From this observed action, we can infer that the predecessors of (at C) must have been achieved before this observation (i.e., (at A) and (at B)), and therefore, we also include them as achieved landmarks. At the end of each iteration over an observed action, the algorithm stores the set of achieved landmarks of the goal in Line 10. Finally, after computing the evidence of achieved landmarks in the observations for a candidate goal, the algorithm stores the set of achieved landmarks of that goal in the output map (Line 12) and returns a map containing all candidate goals and their respective achieved fact landmarks (Line 14). Example 3 illustrates the execution of Algorithm 1 to compute achieved landmarks from the observations of our running example.
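The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: landmarks are modelled as frozensets of fact strings, an observed action is reduced to a pair (preconditions, positive effects), a conjunctive landmark counts as observed when all of its facts appear in that pair, and the `predecessors` map stands in for the ordering produced by a landmark extraction algorithm. All names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Fact = str
Landmark = FrozenSet[Fact]  # a (possibly conjunctive) fact landmark


def compute_achieved_landmarks(
    initial_state: Set[Fact],
    observations: Iterable[Tuple[Set[Fact], Set[Fact]]],  # (pre, positive effects)
    landmarks: Dict[str, List[Landmark]],                 # goal name -> landmarks
    predecessors: Dict[Landmark, Set[Landmark]],          # landmark -> direct predecessors
) -> Dict[str, Set[Landmark]]:
    """Map each candidate goal to the landmarks achieved in the observations."""
    observed_actions = list(observations)
    achieved_by_goal: Dict[str, Set[Landmark]] = {}
    for goal, goal_landmarks in landmarks.items():
        # Landmarks that already hold in the initial state.
        achieved = {lm for lm in goal_landmarks if lm <= initial_state}
        for pre, eff in observed_actions:
            facts = pre | eff
            observed = {lm for lm in goal_landmarks if lm <= facts}
            # Infer unobserved predecessors (followed transitively).
            inferred: Set[Landmark] = set()
            stack = list(observed)
            while stack:
                current = stack.pop()
                for pred in predecessors.get(current, ()):
                    if pred not in inferred:
                        inferred.add(pred)
                        stack.append(pred)
            achieved |= observed | inferred
        achieved_by_goal[goal] = achieved
    return achieved_by_goal


# The (at A) ... (at D) chain from the text: only an action with
# effect (at C) is observed, and the initial state is left empty.
A, B, C, D = (frozenset({f}) for f in ("at A", "at B", "at C", "at D"))
result = compute_achieved_landmarks(
    initial_state=set(),
    observations=[(set(), {"at C"})],
    landmarks={"goal": [A, B, C, D]},
    predecessors={B: {A}, C: {B}, D: {C}},
)
```

Here `result["goal"]` contains (at A), (at B), and (at C), but not (at D), mirroring the inference of unobserved predecessors.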

Example 3.

Consider the Blocks-World example from Figure 2, and the following observed actions: (unstack E A) and (stack E D). Thus, from these observed actions, the candidate goal RED, and the set of fact landmarks of this candidate goal (Figure 5), our algorithm computes that the following fact landmarks have been achieved:

  • [(clear R)], [(on E D)],
    [(clear R) (ontable R) (handempty)],
    [(on E A) (clear E) (handempty)],
    [(clear D) (holding E)],
    [(on D B) (clear D) (handempty)]

In the preconditions of (unstack E A) the algorithm computes [(on E A) (clear E) (handempty)]. Subsequently, in the preconditions and effects of (stack E D) the algorithm computes [(clear D) (holding E)] and [(on E D)], while it computes the other achieved landmarks for the word RED from the initial state. Figure 5 shows the set of achieved landmarks for the word RED in gray. Listing 2 shows in bold the set of achieved landmarks that our algorithm computes for the set of candidate goals in Figure 2.

Figure 5: Ordered fact landmarks extracted for the stacked blocks for the word RED. Fact landmarks that must be true together are represented by connected boxes. Connected boxes in grey represent achieved fact landmarks. Edges represent prerequisites between landmarks.

The complexity of computing achieved landmarks in observations (Algorithm 1) with the process of extracting landmarks () is: , where is the set of candidate goals, is the observation sequence, and is the extracted landmarks for .

4.2 Filtering Candidate Goals from Achieved Landmarks in Observations

We now develop an approach to filter candidate goals based on the evidence of fact landmarks and partitioned facts in preconditions and effects of observed actions in a plan execution [Pereira  MeneguzziPereira  Meneguzzi2016]. This filtering method analyzes fact landmarks in preconditions and effects of observed actions, and selects goals, from a set of candidate goals, that have achieved most of their associated landmarks.

This filtering method is detailed in function FilterCandidateGoals. This algorithm takes as input a goal recognition problem, which is composed of a planning domain definition, an initial state $\mathcal{I}$, a set of candidate goals $\mathcal{G}$, a set of observed actions $O$, and a filtering threshold $\theta$. Our algorithm iterates over the set of candidate goals, and, for each goal, it extracts and classifies fact landmarks and partitions for that goal from the initial state. The fact-partitioning function takes a set of goals and the actions in the domain and returns the fact partitions induced by these actions: the sets of strictly activating (from Definition 14), unstable activating (from Definition 15), and strictly terminal (from Definition 16) facts. We then check whether the observed actions contain fact landmarks or partitioned facts in either their preconditions or effects. At this point, if any Strictly Activating facts for the candidate goal are not in the initial state, then the candidate goal is no longer achievable, so we can discard it. Subsequently, we check for Unstable Activating and Strictly Terminal facts of the goal in the preconditions and effects of the observed actions, and if we find any, we discard the candidate goal. If we observe no facts from partitions as evidence in the observed actions, we move on to checking landmarks of the goal within the observed actions. If we observe any landmarks in the preconditions and positive effects of the observed actions, we compute the evidence of achieved landmarks for the candidate goal. As in Algorithm 1, we deal with missing observations by inferring that the unobserved predecessors of observed fact landmarks must have been achieved.
Given the number of achieved fact landmarks of a goal, we then estimate the percentage of fact landmarks that the observed actions have achieved as the ratio between the number of achieved fact landmarks and the total number of landmarks. Finally, after computing the percentage of landmark completion for all candidate goals, we return the goals with the highest percentage of achieved landmarks within our filtering threshold $\theta$. We follow Definition 6 of fact landmarks and consider conjunctive landmarks as a single landmark when counting achieved landmarks, except for the sub-goals, where each fact is a separate landmark. With respect to the threshold value, note that, if $\theta = 0$, the filter returns only the goals with maximum completion, given the observations. A non-zero threshold gives us flexibility when dealing with missing observations and sub-optimal plans, which, when $\theta = 0$, may cause some potential candidate goals to be filtered out before we get additional observations. Example 4 shows how our filtering method efficiently prunes goals from a set of candidate goals.

Example 4.

Consider the Blocks-World example shown in Figure 2 and that the following actions have been observed in the plan execution: (unstack E A) and (stack E D). Using a threshold below 10% (e.g., $\theta = 0$), FilterCandidateGoals returns just the goal RED because this goal has achieved 6 out of 10 fact landmarks, so it is the goal in the set of candidate goals with the highest percentage of achieved landmarks in observations. From the first observed action (unstack E A), the algorithm computes in its preconditions the following fact landmark:

  • fact landmarks in preconditions: [(on E A) (clear E) (handempty)];

Subsequently, the second observed action (stack E D) has in its preconditions and effects the following fact landmarks:

  • fact landmarks in preconditions: [(clear D) (holding E)]; and

  • fact landmarks in effects: [(on E D)] (which is also a sub-goal);

From the initial state, it is also possible to compute the following set of achieved fact landmarks:

  • [(clear R) (ontable R) (handempty)];

  • [(clear D) (on D B) (handempty)];

  • [(clear R)] (which is also a sub-goal);

Thus, the estimated percentage of achieved fact landmarks for the goal RED is 60%, because it has achieved 6 out of 10 fact landmarks. Note that we consider sub-goals, such as (clear R) and (on E D), as independent fact landmarks. Although all sub-goals of a goal must be true together for achieving the goal, in our filtering method we use them separately to estimate the percentage of achieved landmarks.

By contrast, for goals BED and SAD, the observed actions allow the filtering method to conclude that, respectively, 5 out of 10 and 5 out of 11 fact landmarks have been achieved. Thus, the estimated percentage of achieved fact landmarks for BED is 50%, and for SAD is 45%. From the evidence of fact landmarks in observations (unstack E A) and (stack E D), Figures 5, 6, and 7 show the achieved fact landmarks for the candidate goals RED, BED, and SAD. Boxes in dark gray denote fact landmarks that have been achieved in observations.

Figure 6: Fact landmarks for the word BED. Boxes in dark gray show achieved fact landmarks from the observed actions (unstack E A) and (stack E D).
Figure 7: Fact landmarks for the word SAD. Boxes in dark gray show achieved fact landmarks from the observed actions (unstack E A) and (stack E D).
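The ranking step of Example 4 can be reproduced with a short Python sketch. It covers only the final ratio-and-threshold step, not the partition-based pruning, and it assumes that "within the threshold" means a ratio of at least the maximum ratio minus $\theta$; the function name and data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def filter_by_landmark_ratio(
    counts: Dict[str, Tuple[int, int]],  # goal -> (achieved landmarks, total landmarks)
    threshold: float,
) -> List[str]:
    """Keep goals whose achieved-landmark ratio is within `threshold` of the best."""
    ratios = {goal: achieved / total for goal, (achieved, total) in counts.items()}
    best = max(ratios.values())
    return [goal for goal, ratio in ratios.items() if ratio >= best - threshold]


# Landmark counts reported in Example 4.
counts = {"RED": (6, 10), "BED": (5, 10), "SAD": (5, 11)}
strict = filter_by_landmark_ratio(counts, 0.0)
relaxed = filter_by_landmark_ratio(counts, 0.12)
```

With a threshold of 0 only RED survives; with the (arbitrarily chosen) threshold of 0.12, BED is kept as well, while SAD at roughly 45% is still pruned.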

Example 4 does not show the real impact of using the set of partition facts (Section 3.2) in our filtering method. However, we argue that the evidence of such partitions in observations can immediately prune impossible candidate goals, avoiding the computation of achieved landmarks for such goals and thereby improving the recognition time. We also show in Section 5 that our filtering method can be used with other planning-based goal and plan recognition approaches [Ramírez and Geffner 2009, Ramírez and Geffner 2010, Sohrabi et al. 2016], significantly improving the recognition time for all domains and problems by reducing the number of calls to a planner (or heuristic).

The complexity of filtering candidate goals (Algorithm LABEL:alg:filterCandidateGoals) in the worst case, including the process of extracting landmarks () and fact partitions () is: , where is the set of candidate goals, is the observation sequence, and is the extracted landmarks for . Classifying facts into partitions is a simple iteration over the set of instantiated actions .

Proposition 1 (Soundness of the Goal Filtering Algorithm).

Let be a goal recognition problem with candidate goals , a complete and noiseless observation sequence . If is the correct hidden goal, then, for any landmark extraction algorithm that generates fact landmarks and computed landmarks , function FilterCandidateGoals() never filters out for any threshold .

Proof.

The proof of this proposition depends on two conditions: first that the reasoning performed over fact partitions never discards ; and second, that the ranking using the percentage of achieved landmarks always ranks highest (with possible ties).

The first property then relies on three conditions, namely that we never discard the true goal reasoning about: Strictly Activating facts (from Definition 14), Unstable Activating facts (from Definition 15), and Strictly Terminal facts (from Definition 16). Since we only eliminate goals whose landmarks are Strictly Activating () that are not in the initial state , this condition only eliminates goals for which there is no possible plan from the initial state. By Definition 9, must correspond to a plan from that achieves , so, if any landmark of is Strictly Activating, it must be in . Similarly, we only eliminate goals whose landmarks are Unstable Activating () and Strictly Terminal () if they are part of the preconditions or effects of the observations that occur before is needed (i.e., that they have become false throughout the plan before they were needed). Since, landmarks are necessary conditions, then any valid plan from to must only delete after they are needed, and thus only goals for which observation is not a valid plan can be discarded. The second property follows from Theorem 1. ∎

As a consequence of Proposition 1, under full observability, a goal recognition algorithm that uses the results of FilterCandidateGoals as its candidate goals can do no worse than the same algorithm without the filter. Indeed, the empirical results of Section 5.3 corroborate this theoretical result, as the accuracy for the filtered version of the Ramírez and Geffner (2009) algorithm is strictly superior to the algorithm alone for full observability.

4.3 Landmark-Based Goal Completion Heuristic

We now describe a goal recognition heuristic that estimates the percentage of completion of a goal based on the number of landmarks that have been detected, and are required to achieve that goal [Pereira, Oren,  MeneguzziPereira et al.2017]. This estimate represents the percentage of sub-goals in a goal that have been accomplished based on the evidence of achieved fact landmarks in the observations. We note that a candidate goal is composed of sub-goals comprised of the atomic facts that are part of a conjunction of facts in the goal definition.

Our heuristic method estimates the percentage of completion towards a goal by using the set of achieved fact landmarks computed by Algorithm 1 (ComputeAchievedLandmarks). More specifically, this heuristic operates by aggregating the percentage of completion of each sub-goal into an overall percentage of completion for all facts of a candidate goal. We denote this heuristic as $h_{gc}$, and it is formally defined by Equation 1, where $|\mathcal{AL}_g|$ is the number of achieved landmarks from observations of every sub-goal $g$ of the candidate goal $G$ in $\mathcal{AL}_G$, and $|\mathcal{L}_g|$ represents the number of necessary landmarks to achieve every sub-goal $g$ of $G$ in $\mathcal{L}_G$.

$$h_{gc}(G, \mathcal{AL}_G, \mathcal{L}_G) = \frac{\sum_{g \in G} \frac{|\mathcal{AL}_g|}{|\mathcal{L}_g|}}{|G|} \qquad (1)$$

Thus, heuristic $h_{gc}$ estimates the completion of a goal $G$ by calculating the ratio between the sum of the percentage of completion for every sub-goal $g \in G$, i.e., $\sum_{g \in G} |\mathcal{AL}_g| / |\mathcal{L}_g|$, and the size of the set of sub-goals, that is, the number of sub-goals in $G$. Algorithm 2 describes how to recognize goals using the $h_{gc}$ heuristic and takes as input a goal recognition problem, as well as a threshold value $\theta$. The threshold gives us flexibility to avoid eliminating candidate goals whose percentage of goal completion is close to the highest completion value. In Line 2, the algorithm uses the ExtractLandmarks function to extract fact landmarks for all candidate goals. By taking as input the initial state, the observations, and the extracted landmarks, in Line 3, our algorithm first computes the set of achieved landmarks for every candidate goal using Algorithm 1. Finally, the algorithm uses the heuristic $h_{gc}$ to estimate goal completion for every candidate goal, and as output (Line 5), the algorithm returns those candidate goals with the highest estimated value within the threshold $\theta$. Example 5 shows how heuristic $h_{gc}$ estimates the completion of a candidate goal.

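Input: planning domain definition $\Xi$, initial state $\mathcal{I}$, set of candidate goals $\mathcal{G}$, observations $O$, and threshold $\theta$.
Output: Recognized goal(s).

1:function Recognize($\Xi, \mathcal{I}, \mathcal{G}, O, \theta$)
2:     $\mathcal{L}_{\mathcal{G}} \leftarrow$ ExtractLandmarks($\Xi, \mathcal{I}, \mathcal{G}$)
3:     $\Lambda_{\mathcal{G}} \leftarrow$ ComputeAchievedLandmarks($\mathcal{I}, \mathcal{G}, O, \mathcal{L}_{\mathcal{G}}$)
4:     $maxh \leftarrow \max_{G' \in \mathcal{G}} h_{gc}(G', \Lambda_{\mathcal{G}}(G'), \mathcal{L}_{\mathcal{G}}(G'))$
5:     return all $G$ s.t. $G \in \mathcal{G}$ and $h_{gc}(G, \Lambda_{\mathcal{G}}(G), \mathcal{L}_{\mathcal{G}}(G)) \geq (maxh - \theta)$
6:end function
Algorithm 2 Recognize Goals using the Goal Completion Heuristic $h_{gc}$.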
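As a rough illustration of the goal completion heuristic and the threshold-based selection, the following Python sketch works directly on per-sub-goal landmark counts. The goal names, sub-goal names, and counts are hypothetical toy values (they are not the Blocks-World figures), and landmark extraction and the computation of achieved landmarks are assumed to have been done beforehand.

```python
from typing import Dict, List, Tuple

# sub-goal -> (number of achieved landmarks, number of landmarks)
SubGoalCounts = Dict[str, Tuple[int, int]]


def h_gc(sub_goals: SubGoalCounts) -> float:
    """Average, over sub-goals, of the fraction of achieved landmarks."""
    return sum(achieved / total for achieved, total in sub_goals.values()) / len(sub_goals)


def recognize(candidates: Dict[str, SubGoalCounts], threshold: float) -> List[str]:
    """Return the candidate goals whose h_gc is within `threshold` of the maximum."""
    scores = {goal: h_gc(counts) for goal, counts in candidates.items()}
    best = max(scores.values())
    return [goal for goal, score in scores.items() if score >= best - threshold]


# Hypothetical candidate goals with two sub-goals each.
candidates = {
    "G1": {"p": (1, 1), "q": (1, 2)},  # h_gc = (1 + 0.5) / 2
    "G2": {"p": (1, 1), "r": (0, 2)},  # h_gc = (1 + 0) / 2
}
top = recognize(candidates, threshold=0.0)
tolerant = recognize(candidates, threshold=0.3)
```

With a zero threshold only G1 is returned; widening the threshold to 0.3 also keeps G2, which is the flexibility the threshold is meant to provide.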
Example 5.

As an example of how heuristic estimates goal completion of a candidate goal, recall the Blocks-World example from Figure 2. Consider that among these candidate goals (RED, BED, and SAD) the correct hidden goal is RED, and we observe the following partial sequence of actions: (unstack E A) and (stack E D). Thus, based on the achieved landmarks computed using Algorithm 1 (Figure 5), our heuristic estimates that the percentage of completion for the goal RED is 0.66: (clear R) = (on E D) = (on R E) = (ontable D) = , and hence, = 0.66. For the words BED and SAD our heuristic estimates respectively, 0.54 and 0.58.

Besides extracting landmarks for every candidate goal (), our landmark-based goal completion approach iterates over the set of candidate goals , the observations sequence , and the extracted landmarks . The heuristic computation of () is linear on the number of fact landmarks. Thus, the complexity of this approach is: . Finally, the goal ranking based on always ensures (in full observability) that the correct goal ranks highest (i.e., it is sound), with possible ties, as stated in Theorem 1.

Theorem 1 (Soundness of the Goal Recognition Heuristic).

Let be a goal recognition problem with candidate goals such that , a complete and noiseless observation sequence . If is the correct hidden goal, then, for any landmark extraction algorithm that generates fact landmarks and computed landmarks , the estimated value of will always be highest for the correct hidden goal , i.e., it is the case that .

Proof.

The proof is straightforward from the definition of fact landmarks ensuring they are necessary conditions to achieve a goal and that all facts are necessary. Let us first assume that any pair of goals are different, i.e., , and that no action in the domain achieves facts that are in any pair of goals simultaneously. Since any landmark extraction algorithm includes all facts as landmarks for a goal , then, for every other goal , there exists at least one fact such that that sets it apart from . Under these circumstances, an observation sequence for the correct goal will have achieved a set of landmarks that is exactly the same as the complete computed set of landmarks for . Hence , and for any other goal , since the numerator of the computation will be missing fact for as is not a landmark of . If we drop the assumption about the actions not achieving facts simultaneously in any pair of goals or that goals are identical, it is possible that , which still ensures that the under , always ranks at the top, possibly tied with other goals. ∎

Thus, our goal completion heuristic is sound under full observability in the sense that it can never rank the wrong goal higher than the correct goal when we observe the landmarks. We note that there is one specific case in which our landmark approach can provide wrong rankings, and which we explicitly exclude from the theorem: when the set of candidate goals contains two goals such that one is a sub-goal of the other. In this case, any kind of “distance” to goal metric will report the smaller goal as being more likely than the larger one until the observations take the observed agent past the smaller goal and closer to the larger one. We close this section by commenting on the effect of landmark orderings on the accuracy of the heuristic. Specifically, although we do use the landmark order to infer the achievement of necessary prior landmarks that were not observed in partially observable environments, our heuristic itself does not consider the actual ordering of the landmarks. We infer prior landmarks to obtain more landmarks when we deal with partial observability. Nevertheless, we have experimented with different scoring mechanisms to account for landmarks having been observed in the expected order or not, and these showed almost no advantage over the current heuristic. Consequently, although there are various algorithms that generate better landmark orderings [Hoffmann et al. 2004], the way in which we use the landmarks does not seem to be affected by more or less accurate landmark orderings.

There are two additional properties provable for our heuristic. First, given how our heuristic accounts for landmarks, the value of this heuristic is monotonically (non-strictly) increasing.

Proposition 2 (Monotonicity of ).

The value of is monotonically (non-strictly) increasing in the observation sequence.

Proof.

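By definition, the number of achieved landmarks $|\mathcal{AL}_g|$ of each sub-goal is monotonically (non-strictly) increasing in the observation sequence, while all other values in $h_{gc}$ remain constant. Therefore, from Equation 1, it is clear that $h_{gc}$ cannot decrease. ∎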

Further, a corollary of Theorem 1 is that, under full observation, only the correct goal can reach a heuristic value of . This also illustrates why we restrict our theorems to settings where candidate goals are not subgoals of each other. Consider a goal to be at position , and another to be at position , with landmarks , since itself is a landmark of , is implicitly a subgoal of . If we observe all landmarks in an observation, then , and , which leads to Corollary 1.

Corollary 1.

If the goal being monitored has no subgoals being monitored under full observability, then $h_{gc} = 1$ iff the goal the heuristic is monitoring has been achieved.

Proof.

$h_{gc} = 1$ when the sum of the per-sub-goal completion ratios equals the number of sub-goals $|G|$, which can only occur when $|\mathcal{AL}_g| = |\mathcal{L}_g|$ for all $g \in G$. This clearly occurs when the goal being monitored is achieved. However, if the heuristic is also monitoring a subgoal, then this condition can be satisfied for the subgoal, hence the exception in the corollary. ∎

4.4 Landmark-Based Uniqueness Heuristic

Many goal recognition problems contain multiple candidate goals that share common fact landmarks, generating ambiguity for our previous approaches. Clearly, landmarks that are common to multiple candidate goals are less useful for recognizing a goal than landmarks that exist for only a single goal. As a consequence, computing how unique (and thus informative) each landmark is can help disambiguate similar goals for a set of candidate goals. We now develop a second goal recognition heuristic based on this intuition. To develop this heuristic, we introduce the concept of landmark uniqueness, which is the inverse frequency of a landmark among the landmarks found in a set of candidate goals [Pereira, Oren,  MeneguzziPereira et al.2017]. For example, consider a landmark that occurs only for a single goal within a set of candidate goals; the uniqueness value for such a landmark is intuitively the maximum value of 1. Equation 2 formalizes this intuition, describing how the landmark uniqueness value is computed for a landmark and a set of landmarks for goals .

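Using the landmark uniqueness value, we estimate which candidate goal is the intended one by summing the uniqueness values of the landmarks achieved in the observations. Unlike our previous heuristic, which estimates progress towards goal completion by analyzing sub-goals and their achieved landmarks, the landmark-based uniqueness heuristic estimates the goal completion of a candidate goal $G$ by calculating the ratio between the sum of the uniqueness values of the achieved landmarks of $G$ and the sum of the uniqueness values of all landmarks of $G$. This heuristic effectively weighs the completion value by the informational value of a landmark, so that unique landmarks have the highest weight. To estimate goal completion using the landmark uniqueness value, we calculate the uniqueness value for every extracted landmark in the set of landmarks of the candidate goals using Equation 2, and store the uniqueness value of every landmark $L$ in a map $\Upsilon_{uv}$. This heuristic is denoted as $h_{uniq}$ and formally defined in Equation 3, where $\mathcal{L}_{\mathcal{G}}$ is the set of landmark sets of all candidate goals, $\mathcal{L}_G$ is the set of landmarks of goal $G$, and $\mathcal{AL}_G$ is the set of achieved landmarks of $G$.

$$L_{Uniq}(L, \mathcal{L}_{\mathcal{G}}) = \frac{1}{\sum_{\mathcal{L}_G \in \mathcal{L}_{\mathcal{G}}} |\{L \mid L \in \mathcal{L}_G\}|} \qquad (2)$$

$$h_{uniq}(G, \mathcal{AL}_G, \mathcal{L}_G, \Upsilon_{uv}) = \frac{\sum_{A_L \in \mathcal{AL}_G} \Upsilon_{uv}(A_L)}{\sum_{L \in \mathcal{L}_G} \Upsilon_{uv}(L)} \qquad (3)$$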

Algorithm 3 formalizes a goal recognition function that uses the $h_{uniq}$ heuristic. This algorithm takes as input the same parameters as the previous approach: a goal recognition problem and a threshold $\theta$. Like Algorithm 2, this algorithm extracts the set of landmarks for all candidate goals from the initial state (Line 2) and computes the set of achieved landmarks based on the observations (Line 3). Unlike Algorithm 2, in Line 6 this algorithm computes the landmark uniqueness value for every landmark and stores it in a map. Finally, using these computed structures, the algorithm recognizes which candidate goal is being pursued from observations using the heuristic $h_{uniq}$, returning those candidate goals with the highest estimated value within the threshold. Example 6 shows how heuristic $h_{uniq}$ uses the concept of landmark uniqueness value for goal recognition.

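Input: planning domain definition $\Xi$, initial state $\mathcal{I}$, set of candidate goals $\mathcal{G}$, observations $O$, and threshold $\theta$.
Output: Recognized goal(s).

1:function Recognize($\Xi, \mathcal{I}, \mathcal{G}, O, \theta$)
2:     $\mathcal{L}_{\mathcal{G}} \leftarrow$ ExtractLandmarks($\Xi, \mathcal{I}, \mathcal{G}$)
3:     $\Lambda_{\mathcal{G}} \leftarrow$ ComputeAchievedLandmarks($\mathcal{I}, \mathcal{G}, O, \mathcal{L}_{\mathcal{G}}$)
4:     $\Upsilon_{uv} \leftarrow \langle\rangle$ ▷ Map of landmarks to their uniqueness value.
5:     for each fact landmark $L$ in $\mathcal{L}_{\mathcal{G}}$ do
6:         $\Upsilon_{uv}(L) \leftarrow L_{Uniq}(L, \mathcal{L}_{\mathcal{G}})$
7:     end for
8:     $maxh \leftarrow \max_{G' \in \mathcal{G}} h_{uniq}(G', \Lambda_{\mathcal{G}}(G'), \mathcal{L}_{\mathcal{G}}(G'), \Upsilon_{uv})$
9:     return all $G$ s.t. $G \in \mathcal{G}$ and $h_{uniq}(G, \Lambda_{\mathcal{G}}(G), \mathcal{L}_{\mathcal{G}}(G), \Upsilon_{uv}) \geq (maxh - \theta)$
10:end function
Algorithm 3 Recognize Goals using the Landmark Uniqueness Heuristic $h_{uniq}$.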
Example 6.

Recall the Blocks-World example from Figure 2 and consider the following observed actions: (unstack E A) and (stack E D). Listing 2 shows the set of extracted fact landmarks for the candidate goals in the Blocks-World example and their respective uniqueness value. Based on the set of achieved landmarks (shown in bold in Listing 2), our heuristic estimates the following percentage for each candidate goal: $h_{uniq}$(RED) = 3.66/6.33 = 0.58; $h_{uniq}$(BED) = 2.66/6.33 = 0.42; and $h_{uniq}$(SAD) = 3.66/8.33 = 0.44. In this case, Algorithm 3 correctly estimates RED to be the intended goal since it has the highest heuristic value.

- (and (clear B) (on B E) (on E D) (ontable D)) = 6.33
  [(on E D)] = 0.5, [(clear D) (holding E)] = 0.5,
  [(on E A) (clear E) (handempty)] = 0.33, [(ontable D)] = 0.33,
  [(on D B) (clear D) (handempty)] = 0.33, [(holding D)] = 0.33,
  [(clear B) (ontable B) (handempty)] = 1.0, [(on B E)] = 1.0,
  [(clear B)] = 1.0, [(clear E) (holding B)] = 1.0
- (and (clear S) (on S A) (on A D) (ontable D)) = 8.33
  [(clear S)] = 1.0, [(on A D)] = 1.0, [(on S A)] = 1.0,
  [(clear A) (ontable A) (handempty)] = 1.0, [(ontable D)] = 0.33,
  [(clear S) (ontable S) (handempty)] = 1.0, [(holding D)] = 0.33,
  [(on E A) (clear E) (handempty)] = 0.33,
  [(on D B) (clear D) (handempty)] = 0.33,
  [(clear A) (holding S)] = 1.0, [(clear D) (holding A)] = 1.0
- (and (clear R) (on R E) (on E D) (ontable D)) = 6.33
  [(clear R)] = 1.0, [(clear R) (ontable R) (handempty)] = 1.0,
  [(clear D) (holding E)] = 0.5, [(on E D)] = 0.5,
  [(on E A) (clear E) (handempty)] = 0.33, [(ontable D)] = 0.33,
  [(on D B) (clear D) (handempty)] = 0.33, [(holding D)] = 0.33,
  [(on R E)] = 1.0, [(clear E) (holding R)] = 1.0
Listing 2: Extracted fact landmarks for the Blocks-World example in Figure 2 and their respective uniqueness value.
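The uniqueness values in Listing 2 and the score for RED in Example 6 can be recomputed with the Python sketch below. It is an illustration under assumptions rather than the authors' code: each landmark is represented by the string of its bracketed facts as printed in Listing 2, uniqueness is taken to be the inverse of the number of candidate goals whose landmark set contains the landmark, and the achieved landmarks for RED are those listed in Example 3. Function and variable names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Landmarks shared by all three candidate goals in Listing 2.
SHARED = [
    "(on E A) (clear E) (handempty)", "(ontable D)",
    "(on D B) (clear D) (handempty)", "(holding D)",
]
LANDMARKS: Dict[str, List[str]] = {
    "BED": SHARED + ["(on E D)", "(clear D) (holding E)",
                     "(clear B) (ontable B) (handempty)", "(on B E)",
                     "(clear B)", "(clear E) (holding B)"],
    "SAD": SHARED + ["(clear S)", "(on A D)", "(on S A)",
                     "(clear A) (ontable A) (handempty)",
                     "(clear S) (ontable S) (handempty)",
                     "(clear A) (holding S)", "(clear D) (holding A)"],
    "RED": SHARED + ["(clear R)", "(clear R) (ontable R) (handempty)",
                     "(clear D) (holding E)", "(on E D)",
                     "(on R E)", "(clear E) (holding R)"],
}


def uniqueness(landmarks: Dict[str, List[str]]) -> Dict[str, float]:
    """Inverse of the number of goals whose landmark set contains each landmark."""
    freq = Counter(lm for lms in landmarks.values() for lm in set(lms))
    return {lm: 1.0 / count for lm, count in freq.items()}


def h_uniq(goal: str, achieved: Iterable[str],
           landmarks: Dict[str, List[str]], uniq: Dict[str, float]) -> float:
    """Uniqueness-weighted fraction of the goal's landmarks that were achieved."""
    goal_lms = set(landmarks[goal])
    total = sum(uniq[lm] for lm in goal_lms)
    return sum(uniq[lm] for lm in set(achieved) & goal_lms) / total


UNIQ = uniqueness(LANDMARKS)
# Achieved landmarks for RED, as listed in Example 3.
ACHIEVED_RED = [
    "(clear R)", "(on E D)", "(clear R) (ontable R) (handempty)",
    "(on E A) (clear E) (handempty)", "(clear D) (holding E)",
    "(on D B) (clear D) (handempty)",
]
score_red = h_uniq("RED", ACHIEVED_RED, LANDMARKS, UNIQ)
```

Under these assumptions `score_red` evaluates to 11/19, which rounds to the 0.58 reported for RED in Example 6.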

Similar to our landmark-based goal completion approach, this approach iterates over the set of candidate goals , the observations sequence , and the extracted landmarks . However, for this approach we compute the uniqueness value () for every extracted landmarks, which is linear on the number of landmarks. The heuristic computation of () is also linear on the number of fact landmarks. Thus, the complexity of this approach is: . Finally, since this is just a weighted version of the heuristic, it follows trivially from Theorem 1 that, for full observations, always ranks the correct goal highest.

Corollary 2 (Correctness of Goal Recognition Heuristic).

Let be a goal recognition problem with candidate goals , a complete and noiseless observation sequence . If is the correct goal, then, for any landmark extraction algorithm that generates fact landmarks and computed landmarks , the estimated value of will always be highest for the correct goal , i.e., .

5 Experiments and Evaluation

In this section, we describe the experiments and evaluation we carried out on our goal recognition approaches. We start with a description of the planning domains and the datasets in Section 5.1, followed by the metrics we use to evaluate our approaches in Section 5.2. Section 5.3 then details the experiments and evaluation results of our goal recognition approaches.

5.1 Domains, Datasets, and Metrics

We empirically evaluated our approaches using 15 domains from the planning literature (http://ipc.icaps-conference.org). Six of these domains are also used in the evaluation of other goal and plan recognition approaches [Ramírez and Geffner 2009, Ramírez and Geffner 2010] (https://sites.google.com/site/prasplanning/). We summarize these domains as follows.

  • Blocks-World is a domain that consists of a set of blocks, a table, and a robot hand. Blocks can be stacked on top of other blocks or on the table. A block that has nothing on it is clear. The robot hand can hold one block or be empty. The goal is to find a sequence of actions that achieves a final configuration of blocks;

  • Campus is a domain that consists of finding what activity is being performed by a student from his observations on a campus environment;

  • Depots is a domain that combines transportation and stacking. For transportation, packages can be moved between depots by loading them on trucks. For stacking, hoists can stack packages on palettes or other packages. The goal is to move and stack packages by using trucks and hoists between depots;

  • Driver-Log is a domain that consists of drivers that can walk between locations and trucks that can drive between locations. Walking between locations requires traversal of different paths. Trucks can be loaded with or unloaded of packages. Goals in this domain consist of transporting packages between locations;

  • Dock-Worker-Robots (DWR) is a domain that involves a number of cranes, locations, robots, containers, and piles, in which goals involve transporting containers to a final destination according to a desired order;

  • IPC-Grid is a domain that consists of an agent that moves in a grid between connected cells, transporting keys in order to open locked locations;

  • Ferry is a domain that consists of a set of cars that must be moved to desired locations using a ferry that can carry only one car at a time;

  • Intrusion-Detection represents a domain where a hacker tries to access, vandalize, steal information, or perform a combination of these attacks on a set of servers;

  • Kitchen is a domain that consists of home-activities, in which the goals can be preparing dinner, breakfast, among others;

  • Logistics is a domain that models cities, and each city contains locations. These locations are airports. For transporting packages between locations, there are two types of vehicles: trucks and airplanes. Trucks can drive between cities. Airplanes can fly between airports. The goal is to get and transport packages from locations to other locations;

  • Miconic is a domain that involves transporting a number of passengers using an elevator to reach destination floors;

  • Rovers is a domain that consists of a set of rovers that navigate on a planet surface in order to find samples and communicate experiments;

  • Satellite is a domain that involves using one or more satellites to make observations, by collecting data and down-linking the data to a desired ground station;

  • Sokoban is a domain that involves an agent whose goal is to push a set of boxes into specified goal locations in a grid with walls; and

  • Zeno-Travel is a domain where passengers can embark and disembark onto aircraft that can fly at two alternative speeds between locations.

We formalize planning domains and problems using the STRIPS [Fikes and Nilsson 1971] fragment of PDDL [McDermott et al. 1998]. Based on the datasets provided by Ramírez and Geffner (2009, 2010), which contain hundreds of goal recognition problems for 6 domains, we added non-trivial and larger planning problems to their datasets and generated new datasets (https://github.com/pucrs-automated-planning/goal_plan-recognition-dataset) from the remaining 9 planning domains using open-source planners, such as Fast-Downward [Helmert 2011], Fast-Forward [Hoffmann and Nebel 2001], and LAMA [RichterLPG_2010], each of which is based on planning problems containing both optimal and sub-optimal plans of various sizes, including large problems to test the scalability of the approaches [Pereira and Meneguzzi 2017]. Here, a non-trivial planning problem is one that contains a large search space (in terms of search branching factor and depth), so that modern planners such as Fast-Downward take up to 5 minutes to solve it; in our datasets, the number of instantiated (grounded) actions is between 158 and 3258, and plan length is between 5 and 83. We also generated datasets for 4 domains (Campus, Intrusion, IPC-Grid, and Kitchen) with missing, full, and noisy observations, which are the same domains that Sohrabi et al. (2016) use. The dataset for the Campus domain with missing and noisy observations comes from Sohrabi et al. (2016) (https://github.com/shirin888/planrecogasplanning-ijcai16-benchmarks). Thus, we evaluate our goal recognition approaches against the state-of-the-art not only using datasets with missing and full observations, but also using datasets with noisy observations in the same way as Sohrabi et al. (2016).

5.2 Evaluation Metrics

For evaluation of our goal recognition approaches against the state-of-the-art, we use the Accuracy metric (Equation 4), which represents the fraction of times that the correct goal was among the goals found to be most likely, i.e., how well the correct hidden goal is recognized from a set of possible goals for a given goal recognition problem. Most goal recognition approaches [Ramírez & Geffner 2010, E-Martín et al. 2015, Sohrabi et al. 2016] refer to this metric as quality, also denoted as Q.

(4)

Like most goal and plan recognition approaches in the literature, we also evaluate the average number of returned goals, which is called the Spread, and the recognition time (speed), in seconds, which is the time that a goal recognition approach takes to recognize the most likely goal from a set of possible goals.
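The three metrics described above can be sketched as follows. This is a minimal illustration rather than the evaluation code used in the experiments: the `RecognitionResult` container and its field names are hypothetical, and the sketch assumes each problem yields a set of returned (most likely) goals.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class RecognitionResult:
    """Outcome of one goal recognition problem (hypothetical container)."""
    correct_goal: str
    returned_goals: FrozenSet[str]  # goals the approach found most likely
    seconds: float                  # recognition time for this problem


def accuracy(results: List[RecognitionResult]) -> float:
    """Fraction of problems whose correct goal is among the returned goals."""
    hits = sum(1 for r in results if r.correct_goal in r.returned_goals)
    return hits / len(results)


def spread(results: List[RecognitionResult]) -> float:
    """Average number of goals returned per problem."""
    return sum(len(r.returned_goals) for r in results) / len(results)


def mean_time(results: List[RecognitionResult]) -> float:
    """Average recognition time in seconds."""
    return sum(r.seconds for r in results) / len(results)


# Toy data, invented purely for illustration.
example_results = [
    RecognitionResult("g1", frozenset({"g1"}), 0.25),
    RecognitionResult("g2", frozenset({"g1", "g2"}), 0.5),
    RecognitionResult("g3", frozenset({"g1"}), 0.25),
    RecognitionResult("g1", frozenset({"g1", "g3"}), 1.0),
]
```

Note that an approach can raise its accuracy simply by returning more goals, which is why accuracy is read together with the Spread.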

5.3 Goal Recognition Experimental Results

Our experiments compare our goal recognition heuristics (goal completion and uniqueness) to four other approaches. First, we use the heuristic estimator approach of Ramírez and Geffner (2009), denoted as R&G 2009, which uses a heuristic method to approximate the planning solution by computing a relaxed plan [Keyder & Geffner 2008]. The authors show that this heuristic-based approach is their fastest and most accurate goal and plan recognition approach (https://sites.google.com/site/prasplanning/file-cabinet/plan-recognition.tar.bz2). We also use a combination of their approach and our filtering method with threshold 10%, denoted as Filter R&G 2009. Second, we use the probabilistic framework of Ramírez and Geffner (2010) (https://sites.google.com/site/prasplanning/file-cabinet/prob-plan-recognition.tar.bz2), which allows the use of any off-the-shelf automated planner, denoted as R&G 2010. The automated planner we used alongside this approach is Fast-Downward [Helmert 2011] with the LM-Cut heuristic [Helmert & Domshlak 2009], a planning heuristic that relies on landmarks to estimate the distance from a particular state to a goal state. We also use this approach with our filtering method (threshold 10%), denoted as Filter R&G 2010. Third, we use the approach of Sohrabi et al. (2016), which uses a top-K planner to extract multiple optimal and nearly optimal plans for a particular goal, denoted as IBM 2016. Since the exact code from Sohrabi et al. (2016) is unavailable, we developed our own version of this approach with some advice from the main author and the top-K planner she shared. Following that advice, we ran experiments using a top-K planner rather than a diverse planner. 
The automated planner we used alongside this approach is the most recent top-K planner TK [Katz et al. 2018] with the LM-Cut heuristic, and the number of sampled plans parameter K=1000. These are exactly the same parameters that Sohrabi et al. used in their experiments and evaluation [Sohrabi et al. 2016]. Note that we use the LM-Cut heuristic [Helmert & Domshlak 2009] with the approaches from [Ramírez & Geffner 2010] and [Sohrabi et al. 2016] because our proposed goal recognition approaches rely on landmarks. Using this landmark-based planning heuristic with the planners Fast-Downward [Helmert 2011] and TK [Katz et al. 2018] aims to provide a fairer comparison against our landmark-based approaches. Finally, we compare our approaches against the approach of E-Martín et al. (2015), denoted as FGR 2015. Their goal recognition approach also avoids calling a planner multiple times during the recognition process and instead uses a planning graph, resulting in fast goal recognition in planning settings. As advised by the authors, we use the FGR 2015 recognizer with interaction information equal to 0, which is their technique that yields the best results in terms of recognition time and accuracy. In an attempt to make a fair comparison, we obtained the code for the algorithms of [E-Martín et al. 2015] directly from the main author. Running on our datasets, these algorithms performed worse than the results reported in [E-Martín et al. 2015]. We believe that this could be due to the different problem set sizes of our datasets. In addition, the code behaved unexpectedly on some domains of our datasets (marked in the tables), returning the same recognition score for all candidate goals. At the time of submission we are working with the authors to clarify these discrepancies. In both non-noisy (missing) and noisy domains, the IPC-Grid domain timed out for more than 60% of the problems in the FGR 2015 approach, so we do not report results for this specific domain as they would not be representative.

These approaches take as input a goal recognition problem (from Definition 9), i.e., a domain description as well as an initial state, a set of candidate goals, a correct hidden goal in that set, and an observation sequence. An observation sequence contains actions that represent an optimal or sub-optimal plan that achieves the correct hidden goal, and this observation sequence can be full or partial. Full observation sequences contain the entire plan for the correct hidden goal, i.e., 100% of the actions having been observed. Partial observation sequences represent plans for the correct hidden goal with 10%, 30%, 50%, or 70% of their actions having been observed. However, for experiments with noisy observations, the observability of partial observations is different because every observation sequence always includes at least two noisy observations, so a partial observation sequence with noisy observations represents a plan with 25%, 50%, or 75% of its actions having been observed.
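To make the notion of partial and noisy observation sequences concrete, the sketch below derives them from a full plan. It is only an illustration under stated assumptions: the function names are hypothetical, observed actions are assumed to be sampled uniformly while keeping plan order, and noisy actions are assumed to be inserted at random positions. The datasets themselves may have been generated differently.

```python
import random
from typing import List


def sample_observations(plan: List[str], observability: float,
                        rng: random.Random) -> List[str]:
    """Keep a fraction of the plan's actions, preserving their order."""
    # Always keep at least one action (an assumption of this sketch).
    k = max(1, round(len(plan) * observability))
    kept = sorted(rng.sample(range(len(plan)), k))
    return [plan[i] for i in kept]


def add_noise(observations: List[str], noise_actions: List[str],
              count: int, rng: random.Random) -> List[str]:
    """Insert `count` actions that are not part of the plan at random positions."""
    noisy = list(observations)
    for action in rng.sample(noise_actions, count):
        # randint is inclusive on both ends, so the action may also be appended.
        noisy.insert(rng.randint(0, len(noisy)), action)
    return noisy
```

For instance, calling `sample_observations(plan, 0.3, random.Random(0))` on a ten-action plan keeps three actions in their original order, and `add_noise(obs, extra_actions, 2, rng)` then adds the two noisy observations mentioned above.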

Our evaluation uses three metrics: accuracy of goal recognition (Equation 4), the average number of candidate goals that have been found to be the most likely (Spread), and recognition time (speed). Note that in many domains, all algorithms return more than one candidate goal. In the case of Ramírez and Geffner (2009), i.e., R&G 2009, this may occur when goals have the same distance from their estimated state. For our goal recognition heuristics, this may occur when there are ties between the heuristic values of candidate goals within the threshold margin. Thus, like most goal recognition approaches, we also evaluate the average number of returned goals for a given goal recognition problem, i.e., the Spread. We ran all experiments using a single core of a 12-core Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz with 16GB of RAM, set a maximum memory usage limit of 4GB, and set a 20-minute timeout for each recognition process.

5.3.1 Experimental Results with Missing and Full Observations

Our first set of experiments consists of running the various goal recognition algorithms on datasets containing thousands of problems for 15 domains with missing and full observations. In what follows, each table shows the total number of goal recognition problems used under each domain name (first column). Each row in the tables expresses averages for the number of candidate goals; the percentage of the plan that is actually observed (% Obs); the average number of observations (actions) per problem; and, for each approach, the time in seconds to recognize the goal given the observations (Time), the Accuracy with which the approach correctly infers the goal, and the Spread, which represents the average number of returned goals. For our goal completion and uniqueness heuristics, we show results under different thresholds: 0%, 10%, and 20%. If the threshold value is 0%, our approaches do not give any flexibility when estimating candidate goals, returning only the goals with the highest estimated value. Tables LABEL:tab:goalRecognitionResults1_1 and LABEL:tab:goalRecognitionResults1_2 show comparative results of our heuristics and previous approaches for the first set of domains (Blocks-World to Intrusion). Table LABEL:tab:goalRecognitionResults1_1 shows the results of our goal recognition heuristics against R&G 2009 [Ramírez & Geffner 2009] as well as this approach enhanced with our filtering method with threshold 10% (Filter). Similarly, Table LABEL:tab:goalRecognitionResults1_2 shows the results of R&G 2010 [Ramírez & Geffner 2010], FGR 2015 [E-Martín et al. 2015], and IBM 2016 [Sohrabi et al. 2016]. We show both R&G 2010 [Ramírez & Geffner 2010] and IBM 2016 [Sohrabi et al. 2016] individually as well as enhanced with our filtering method (again with threshold 10%). 
Tables LABEL:tab:goalRecognitionResults2_1 and LABEL:tab:goalRecognitionResults2_2 show comparative results of our heuristics and previous approaches for the second set of domains (Kitchen to Zeno-Travel). From these tables, it is possible to see that our landmark-based approaches are both faster and more accurate than R&G 2009, R&G 2010, FGR 2015, and IBM 2016, and, when we combine their algorithms with our filtering method, the resulting approaches get a substantial speedup and often accuracy improvements. As we increase the threshold, our heuristic approaches quickly surpass the other approaches in all domains tested. Note that we report the accuracy averaged over all of the problems for each observability. For example, in Table LABEL:tab:goalRecognitionResults1_2, the accuracy reported for the Campus domain at a given observability level is the number of problems at that level for which IBM 2016 [Sohrabi et al. 2016] includes the correct goal in its output, divided by the total number of problems at that level.
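The role of the threshold can be illustrated with a short sketch. It assumes that each candidate goal has a heuristic score in [0, 1] and that the threshold is an absolute margin below the best score; both the function name and this exact selection rule are assumptions made for illustration, not a transcription of the algorithms evaluated here.

```python
from typing import Dict, Set


def most_likely_goals(scores: Dict[str, float], threshold: float = 0.0) -> Set[str]:
    """Return every goal whose score is within `threshold` of the best score.

    With threshold 0.0 only the top-scoring goals (including ties) are returned.
    """
    best = max(scores.values())
    return {goal for goal, h in scores.items() if h >= best - threshold}


# Toy scores, invented purely for illustration.
example_scores = {"g1": 0.8, "g2": 0.75, "g3": 0.65}
strict = most_likely_goals(example_scores)          # threshold 0%
relaxed = most_likely_goals(example_scores, 0.1)    # threshold 10%
loosest = most_likely_goals(example_scores, 0.2)    # threshold 20%
```

Raising the threshold can only add goals to the returned set, so accuracy cannot decrease while the Spread cannot shrink, which is the trade-off visible across the threshold columns of the tables.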

The Receiver Operating Characteristic (ROC) curve allows us to summarize the discriminatory performance of inferences such as goal recognition over diverse datasets. The ROC curve shows graphically the performance of classifier systems by plotting the true positive rate against the false positive rate at various threshold settings. We adapt the notion of the ROC curve into points over the ROC space to compare not only the true positive predictions (i.e., Accuracy), but also the false positive rate of the evaluated goal recognition approaches. Each prediction result of a goal recognition approach represents one point in the space. In ROC space, the diagonal line represents a random guess to recognize a goal from observations. This diagonal line divides the ROC space: points above the diagonal represent good classification results (better than random), whereas points below the line represent poor results (worse than random). The best possible (perfect) prediction for recognizing goals is a point in the upper left corner (i.e., coordinate x = 0 and y = 100) of the ROC space. Thus, the closer a goal recognition approach (point) gets to the upper left corner, the better it is for recognizing goals. To visualize the comparative performance of the multiple approaches, we adapt the notation of the ROC space, and, rather than plotting a single point per goal recognition problem, we aggregate multiple problems for all domains and plot these results in ROC space.
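One plausible way to obtain such an aggregated point is sketched below. The counting scheme is an assumption made for this illustration: for each problem, returning the correct goal counts as a true positive, every incorrect goal that is returned counts as a false positive, and every incorrect goal that is left out counts as a true negative. The text does not spell out the exact scheme used for the figures.

```python
from typing import Iterable, Set, Tuple

# (candidate goals, returned goals, correct goal) for one recognition problem
Problem = Tuple[Set[str], Set[str], str]


def roc_point(problems: Iterable[Problem]) -> Tuple[float, float]:
    """Aggregate problems into one (false positive %, true positive %) point."""
    tp = fn = fp = tn = 0
    for candidates, returned, correct in problems:
        hit = correct in returned
        tp += int(hit)
        fn += int(not hit)
        wrong_returned = len(returned - {correct})
        fp += wrong_returned
        # Incorrect candidates that were (rightly) not returned.
        tn += len(candidates) - 1 - wrong_returned
    tpr = 100.0 * tp / (tp + fn)
    fpr = 100.0 * fp / (fp + tn) if (fp + tn) else 0.0
    return fpr, tpr


# Toy problems, invented purely for illustration.
perfect = [({"a", "b"}, {"a"}, "a")]
mixed = [({"a", "b", "c"}, {"a"}, "a"), ({"a", "b", "c"}, {"b", "c"}, "a")]
```

Under this scheme an approach that always returns exactly the correct goal lands on the upper left corner (x = 0, y = 100), matching the description of the perfect prediction above.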

Figure 8 shows the trade-off between true positive results and false positive results in ROC space for the evaluated goal and plan recognition approaches. Recall that the closer a goal recognition approach (point) is to the upper left corner, the better it is for recognizing goals and plans. To compare the recognition results of our approaches against the others in the ROC space, we select the results of our heuristics using a threshold of 20%. For each approach, we plot its recognition results for all domains as a cloud of points, which represents (in general) how well each approach recognizes the correct hidden goal from missing and full observations. Thus, the points in ROC space show that our heuristics are not only competitive against the four other approaches (R&G 2009, R&G 2010, FGR 2015, and IBM 2016) for all variations of observability, but also surpass these approaches in a substantial number of domains.

Figure 8: ROC space for all domains with missing and full observations for our landmark-based heuristics (goal completion and uniqueness) against R&G 2009 [Ramírez & Geffner 2009], R&G 2010 [Ramírez & Geffner 2010], and FGR 2015 [E-Martín et al. 2015]. The results obtained for the Campus domain from IBM 2016 [Sohrabi et al. 2016] are not included in the ROC space.

With respect to recognition time, we compare the time that each approach takes to recognize the correct hidden goal for different sizes of the observation sequence. Figures 9 and 10 show the runtime as a function of the average length of the observation sequences (the observations column in the tables) for all of the approaches we evaluated (apart from IBM 2016, which timed out for almost all domains), as reported in the Time column of Tables LABEL:tab:goalRecognitionResults1_1, LABEL:tab:goalRecognitionResults1_2, LABEL:tab:goalRecognitionResults2_1, and LABEL:tab:goalRecognitionResults2_2. Figure 9 shows the runtime for our heuristic approaches in comparison with R&G 2009 and this approach alongside our filtering method, whereas Figure 10 shows the runtime of R&G 2010 with and without the filtering method, and FGR 2015. We used separate graphs for these techniques, given the widely different magnitude of the time taken to recognize a goal. When measuring recognition time for our heuristics and the filtering method, we also include the time to extract the set of landmarks, so that landmark extraction is performed online, i.e., during the goal recognition process. Curves in the graphs represent the average runtime for each observation sequence length, smoothed over the resulting points. The graphs show the scalability of the evaluated approaches. Our goal recognition heuristics never take more than 1 second to compute the correct hidden goal in the set of candidate goals, while the runtime of the other approaches seems to grow super-linearly (for R&G 2009) and exponentially (for R&G 2010); the recognition times of R&G 2009 and R&G 2010 are reported in the Time column of the tables. 
Apart from the Campus domain, the approach of IBM 2016 [Sohrabi et al. 2016] timed out for all goal recognition problems (all domains) in the datasets we used, probably because the top-K planner [Katz et al. 2018] (even the latest top-K planning algorithm) does not scale very well when dealing with non-trivial planning problems, especially when the planner has to sample plans for a (transformed) planning problem. In this case, even the use of our filtering method (which reduces the number of candidate goals) did not improve the recognition time of the approach from IBM 2016 [Sohrabi et al. 2016]. While our filtering method significantly improves the recognition time of the approaches R&G 2009 and R&G 2010, it sometimes causes a loss of accuracy because it rules out the correct hidden goal from the set of candidate goals. FGR 2015 completed within the time limit on all evaluated domains except IPC-Grid, where it timed out for most problems. FGR 2015 is much faster than R&G 2010 and R&G 2010 with our filtering method, though not as fast as our recognition heuristics and R&G 2009. Finally, the evaluation of the domains DWR and Sokoban shows that larger plan lengths lead R&G 2009 and R&G 2010 to rapidly lose accuracy, whereas our approaches show improved accuracy without affecting the recognition time.

As Tables LABEL:tab:goalRecognitionResults1_1, LABEL:tab:goalRecognitionResults1_2, LABEL:tab:goalRecognitionResults2_1, and LABEL:tab:goalRecognitionResults2_2 show, our goal recognition heuristics are not only competitive against the other approaches, with superior accuracy, but also at least an order of magnitude faster (for all evaluated domains), for example, 2900 times faster than R&G 2010 in the DWR domain. Compared with the two other recent state-of-the-art techniques, we can also see that IBM 2016 is substantially slower, even compared to R&G, whereas FGR 2015, while consistently much faster than R&G 2010, is also slower than our heuristic techniques (up to an order of magnitude) across the board. Comparing our two heuristics, the results show that the goal completion heuristic is often more accurate than the uniqueness heuristic. However, the uniqueness heuristic returns fewer candidate goals (lower Spread) than the goal completion heuristic as a result of the landmark uniqueness value, which weights landmark information among all landmarks for all goals, making the uniqueness heuristic more precise (but sometimes less accurate) than the goal completion heuristic. We use the threshold value to provide flexibility when the heuristic approaches fail to observe landmarks. While our approach is more accurate than virtually all other approaches for a suitable recognition threshold value (sometimes with larger spread), the comparison becomes more complex for other thresholds. The only domain in which the FGR 2015 approach is more accurate than ours is DWR at low observability; however, its spread is nearly twice as large as ours, meaning that FGR 2015 is worse at disambiguating goals. Apart from the Campus and Kitchen domains, our approaches have similar or worse accuracy at very low observability. 
This loss of accuracy happens for low observability problems because the number of landmarks that happen to be observed is much lower (as the likelihood of observing a landmark goes down), making it harder to disambiguate and recognize the correct hidden goal. The results for Campus and Kitchen are explained by the reduced number of goal hypotheses in each domain and the informativeness of the actions, which yield landmarks that favor our approaches. For domains such as DWR, Depots, Sokoban, and Zeno-Travel, which are considered more complex because traditional planning heuristics are not very informative for them, our results are mixed. Sometimes we are able to achieve high accuracy with low observability (albeit with high spread in DWR and Depots), whereas sometimes we achieve lower accuracy with low spread, as for Sokoban and Zeno-Travel. Sokoban in particular is known to be a difficult domain for planning heuristics [Pereira et al. 2015], and yields a small number of landmarks per goal. Nevertheless, when our heuristic approaches deal with higher levels of observability, the results are very good in both Accuracy and Spread for all domains.

Figure 9: Recognition time comparison for missing and full observations for our landmark-based heuristics (goal completion and uniqueness) against R&G 2009 [Ramírez & Geffner 2009], and R&G 2009 using our filtering method with a 10% threshold.
Figure 10: Recognition time comparison for missing and full observations for R&G 2010 using Fast-Downward with the LM-Cut heuristic [Ramírez & Geffner 2010], R&G 2010 using our filtering method with a 10% threshold, and FGR 2015 [E-Martín et al. 2015].

5.3.2 Experimental Results with Missing, Noisy, and Full Observations

Our second set of experiments uses datasets containing hundreds of problems for four domains with missing, noisy, and full observations. Tables LABEL:tab:goalRecognitionResultsWithNoisy1 and LABEL:tab:goalRecognitionResultsWithNoisy2 compare results for the experiments with missing, noisy, and full observations for our goal recognition heuristics (using a threshold between 0% and 10%) against R&G 2009 [Ramírez & Geffner 2009], R&G 2010 [Ramírez & Geffner 2010], FGR 2015 [E-Martín et al. 2015], and the approach of Sohrabi et al. (2016), denoted as IBM 2016. We use our filtering method with a 10% threshold alongside these approaches, denoted as Filter. For this set of experiments, we used the same 4 domains used by Sohrabi et al. (2016). In these experiments, the noise column represents the average number of noisy observations, i.e., extra observations that we added randomly to the observation sequence. These two extra observations represent 12% of noise with respect to the total number of observations [Sohrabi et al. 2016]. Since the approach of Sohrabi et al. (2016) timed out for all recognition problems, we are unable to provide a direct comparison of accuracy and runtime performance against this approach. However, given our understanding of the underlying technique, we believe that our approaches are almost certainly more computationally efficient, since their approach relies on a top-K planner [Katz et al. 2018] during the recognition process (extracting 1000 sampled plans), much like [Ramírez & Geffner 2010] relies on a planner. The FGR 2015 approach is closer to ours in runtime performance for noisy observations, but still two to ten times slower.

Under noisy observations, it is clear from the results in Tables LABEL:tab:goalRecognitionResultsWithNoisy1 and LABEL:tab:goalRecognitionResultsWithNoisy2 that the approaches R&G 2009 and R&G 2010 are not only much slower but also substantially less accurate than our heuristics (with a threshold value of 10%) for virtually all 4 domains, reaching a low of 4.4 and 3.3 percent accuracy (respectively) in the IPC-Grid domain. However, using a recognition threshold of 0%, the R&G 2009 and R&G 2010 approaches are more accurate than our heuristics for two particular domains, namely Intrusion and Campus (respectively), while the FGR 2015 approach is more accurate than ours for the Intrusion domain, as well as for Campus and Kitchen under some conditions (multiple noisy observations). Our uniqueness heuristic performed better (more accurate and faster) than the goal completion heuristic for all 4 domains. Regarding the difference in accuracy in the Kitchen domain under low observability, note the number of actions actually observed (2.5, 2 of which are known to be noise) at that observability level. This means that, on average, each experiment in this domain will have seen mostly noise and possibly one or two plan actions, or at times no non-noisy action at all. Under these conditions, being able to get the most information out of the observations (and correctly ignoring noise) is key. Here, the analysis of propositions that are not landmarks performed by the FGR 2015 approach seems to cope with a large proportion of noisy observations better than our approach. Note that by increasing the threshold parameter, we increase the spread to values closer to those of FGR 2015 and reach similar levels of accuracy.

Figure 11 shows the trade-off between true positive results and false positive results in a ROC space for all 4 domains with missing, noisy, and full observations. Figures 12 and 13 show a comparison of recognition time for our heuristics against the approaches R&G 2009 and R&G 2010. We used separate graphs for R&G 2010, Filter R&G 2010, and FGR 2015 given the widely different magnitude of the time taken to recognize a goal.

Figure 11: ROC space for all domains with missing, noisy, and full observations for our landmark-based heuristics (goal completion and uniqueness) against R&G 2009 [Ramírez & Geffner 2009], R&G 2010 [Ramírez & Geffner 2010], and FGR 2015 [E-Martín et al. 2015].
Figure 12: Recognition time comparison for missing, noisy, and full observations for our landmark-based heuristics (goal completion and uniqueness) against R&G 2009 [Ramírez & Geffner 2009], and R&G 2009 using our filtering method with a 10% threshold.
Figure 13: Recognition time comparison for missing, noisy, and full observations for R&G 2010 using Fast-Downward with the LM-Cut heuristic [Ramírez & Geffner 2010], R&G 2010 using our filtering method with a 10% threshold, and FGR 2015 [E-Martín et al. 2015].

6 Related Work

In this section, we compare our work to some of the most relevant recent work on goal and plan recognition. We highlight differences and similarities between our goal recognition approaches and the surveyed related work.

Hong (2001) developed one of the first goal recognition approaches that extends the concept of the planning graph, developing a similar structure (called a goal graph) that represents every possible path (e.g., state transitions that connect facts and actions) from an initial state to a goal state. Pattison and Long (2010) propose AUTOGRAPH (AUTOmatic Goal Recognition with A Planning Heuristic), a probabilistic heuristic-based goal recognition approach over planning domains. AUTOGRAPH uses heuristic estimation and domain analysis to determine which goals an agent is pursuing. Ramírez and Geffner (2009) developed planning approaches for plan recognition, and instead of using plan libraries, they model the problem as a planning domain theory with respect to a known set of goals. Their work uses a heuristic, an optimal planner, and a modified sub-optimal planner to determine the distance to every goal in a set of goals after an observation. We compare their most accurate approach directly with ours. Follow-up work (Ramírez & Geffner 2010) extended the idea of plan recognition as planning into a probabilistic approach using off-the-shelf planners that provides a posterior probability distribution over goals, given an observation sequence as evidence. E-Martín et al. (2015) developed a planning-based goal recognition approach that propagates cost and interaction information in a planning graph, and uses this information to estimate goal probabilities over the set of candidate goals. Sohrabi et al. (2016) developed a probabilistic plan recognition approach that deals with unreliable observations (i.e., noisy or missing observations), and recognizes both goals and plans. Unlike these last three approaches, which provide a probabilistic interpretation of the recognition problem, we do not deal with probabilities. Nevertheless, our heuristic computation is a good proxy for the posterior probability distribution of the goals, given the observations, and thus could be extended to provide a probabilistic interpretation, as we intend to do in future work.

Vered et al. [2016] introduce the concept of mirroring to develop an online goal recognition approach for continuous domains. Masters and Sardina (2017) propose a fast and accurate goal recognition approach for path-planning, providing a new probabilistic framework for goal recognition. In [Vered & Kaminka 2017], Vered and Kaminka develop a heuristic approach for online goal recognition that deals with continuous domains. Vered et al. (2018) propose an online goal recognition approach that combines the use of landmarks and goal mirroring, showing that this combination can improve not only the recognition time, but also the accuracy of recognizing goals in an online fashion. Most recently, Kaminka et al. (2018) develop a novel plan recognition approach that deals with both continuous and discrete domains.

In addition, there has been substantial recent work on goal and plan recognition design, that is, optimizing the domain design so that goal and plan recognition algorithms can provide inferences with as few observations as possible. Keren et al. (2014, 2015, 2016) develop an alternate view of the goal recognition problem: rather than developing new goal recognition algorithms, they develop a novel approach that modifies the domain description in order to facilitate the goal recognition process. Their work could potentially be used alongside our techniques, and the relation between worst case distinctiveness (their measure of how difficult it can be to disambiguate goals) and the information gain from unique landmarks would provide an interesting avenue for further investigation.

Finally, Freedman et al. [2018] recently proposed an approach to perform probabilistic plan recognition along the lines of [Ramírez & Geffner 2010] that, instead of running a full-fledged planner for each goal, takes advantage of multiple-goal heuristic search [Davidov & Markovitch 2006] to search for all goals simultaneously and avoid repeatedly expanding the same nodes. Their approach has not been implemented and evaluated yet. It aims to overcome a limitation of our technique, which can only account for progress towards goals when we have evidence of landmarks being achieved, while retaining the speed gains we achieve. While we do not have empirical evidence about its accuracy and efficiency, we believe this is an exciting direction for goal recognition, and we expect it to approach and overcome the accuracy of [Ramírez & Geffner 2010].

7 Conclusions

We have developed novel goal recognition approaches based on planning techniques that rely on landmarks. Landmarks provide key information about what cannot be avoided to achieve a goal, and we have shown that they can be used efficiently, with simple heuristics, to recognize goals from missing and noisy observations. Our goal completion heuristic computes the ratio between achieved landmarks and the total number of landmarks for a particular goal, whereas our uniqueness heuristic uses a landmark uniqueness value to represent how informative a landmark is among the known landmarks for all candidate goals. These landmark-based heuristics show that it is possible to recognize goals quickly with high accuracy as well as to use them as a filtering mechanism to refine existing planning-based goal and plan recognition approaches [Ramírez & Geffner 2009, Ramírez & Geffner 2010, E-Martín et al. 2015, Sohrabi et al. 2016], such that they can also be made substantially more efficient.
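The two heuristics summarized above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation evaluated in the experiments: landmarks are represented as plain strings, the set of achieved landmarks is taken as given (how it is inferred from observations is outside the sketch), and the uniqueness value of a landmark is assumed to be the inverse of the number of candidate goals that share it. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def goal_completion(goal_landmarks: Set[str], achieved: Set[str]) -> float:
    """Ratio of achieved landmarks to all landmarks of one goal."""
    return len(goal_landmarks & achieved) / len(goal_landmarks)


def uniqueness_values(landmarks_by_goal: Dict[str, Set[str]]) -> Dict[str, float]:
    """Assumed uniqueness value: 1 / (number of goals sharing the landmark)."""
    counts = Counter(l for lms in landmarks_by_goal.values() for l in lms)
    return {landmark: 1.0 / n for landmark, n in counts.items()}


def uniqueness_heuristic(goal: str, landmarks_by_goal: Dict[str, Set[str]],
                         achieved: Set[str]) -> float:
    """Uniqueness-weighted share of the goal's landmarks that were achieved."""
    uniq = uniqueness_values(landmarks_by_goal)
    goal_landmarks = landmarks_by_goal[goal]
    total = sum(uniq[l] for l in goal_landmarks)
    reached = sum(uniq[l] for l in goal_landmarks & achieved)
    return reached / total


# Toy landmark sets, invented purely for illustration.
toy_landmarks = {"g1": {"a", "b"}, "g2": {"a", "c", "d", "e"}}
toy_achieved = {"a", "b"}
```

In the toy data, landmark `a` is shared by both goals and therefore contributes less evidence than `b`, which belongs to `g1` only; this is the sense in which the uniqueness value measures how informative a landmark is.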

We have proved that our heuristic approaches are sound both as a filtering mechanism and as a goal recognition algorithm on its own, thus showing that, under certain conditions, we are guaranteed to find the correct hidden goal. Our experiments show that our goal recognition approaches yield not only superior accuracy results but also substantially faster recognition time for all fifteen planning domains used in evaluating against the state-of-the-art [Ramírez & Geffner 2009, Ramírez & Geffner 2010, E-Martín et al. 2015, Sohrabi et al. 2016] at varying observation completeness levels, for both missing and noisy observations. The main limitation of our approaches lies in conditions of very low observability of 30% or less. Specifically, for problems with very short plans, where only one or two actions are actually observed, the odds of observing one of the problem’s landmarks are very low, jeopardizing recognition accuracy. Under these conditions, our filtering mechanism still provides a major improvement on the runtime (and often accuracy) of existing goal recognition approaches.

As future work, we intend to explore multiple avenues to improve our goal recognition approaches. First, we aim to use other planning techniques, such as heuristics and symmetries in classical planning [Shleyfman et al. 2015], and traps, invariants, and dead-ends [Lipovetzky et al. 2016]. Second, we intend to explore other landmark extraction algorithms to obtain additional information from planning domains [Zhu & Givan 2003, Keyder et al. 2010]. Third, we aim to evaluate our landmark-based heuristics for online goal and plan recognition, and we have started work in that direction [Vered et al. 2018].

Acknowledgements

This article is a revised and extended version of two papers published at AAAI 2017 [Pereira et al. 2017] and ECAI 2016 [Pereira & Meneguzzi 2016]. We are thankful to the anonymous reviewers who helped improve the research in this article. The authors thank Shirin Sohrabi for discussing the way in which the algorithms of [Sohrabi et al. 2016] should be configured, as well as Yolanda Escudero-Martín for providing the original code for the approach of [E-Martín et al. 2015] and engaging with us on how to run it. We thank Miquel Ramírez for various discussions around the contributions of this paper and André Grahl Pereira for a discussion of properties of our algorithm. Finally, Felipe thanks CNPq for partial financial support under its PQ fellowship, grant number 305969/2016-1.

References

  • [Avrahami-Zilberbrand & Kaminka 2005] Avrahami-Zilberbrand, D., & Kaminka, G. A. 2005. Fast and Complete Symbolic Plan Recognition. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2005), 653–658.
  • [Blum & Furst 1997] Blum, A. L., & Furst, M. L. 1997. Fast Planning Through Planning Graph Analysis. Artificial Intelligence, 90(1-2), 281–300.
  • [Bylander 1994] Bylander, T. 1994. The Computational Complexity of Propositional STRIPS Planning. Artificial Intelligence, 69, 165–204.
  • [Davidov & Markovitch 2006] Davidov, D., & Markovitch, S. 2006. Multiple-goal heuristic search. Journal of Artificial Intelligence Research (JAIR), 26, 417–451.
  • [E-Martín et al. 2015] E-Martín, Y., R.-Moreno, M. D., & Smith, D. E. 2015. A Fast Goal Recognition Technique Based on Interaction Estimates. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2015), 761–768.
  • [Fikes & Nilsson 1971] Fikes, R. E., & Nilsson, N. J. 1971. STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2(3), 189–208.
  • [Freedman et al. 2018] Freedman, R. G., Fung, Y. R., Ganchin, R., & Zilberstein, S. 2018. Towards quicker probabilistic recognition with multiple goal heuristic search. In The AAAI 2018 Workshop on Plan, Activity, and Intent Recognition.
  • [Geib 2002] Geib, C. W. 2002. Problems with Intent Recognition for Elder Care. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2002), 13–17.
  • [Geib & Goldman 2001] Geib, C. W., & Goldman, R. P. 2001. Plan Recognition in Intrusion Detection Systems. In DARPA Information Survivability Conference and Exposition (DISCEX).
  • [Geib & Goldman 2005] Geib, C. W., & Goldman, R. P. 2005. Partial Observability and Probabilistic Plan/Goal Recognition. In Proceedings of the 2005 International Workshop on Modeling Others from Observations (MOO-2005).
  • [Geib & Goldman 2009] Geib, C. W., & Goldman, R. P. 2009. A Probabilistic Plan Recognition Algorithm Based on Plan Tree Grammars. Artificial Intelligence, 173(11), 1101–1132.
  • [Ghallab et al. 2004] Ghallab, M., Nau, D. S., & Traverso, P. 2004. Automated Planning - Theory and Practice. Elsevier.
  • [Ghallab et al. 2016] Ghallab, M., Nau, D. S., & Traverso, P. 2016. Automated Planning and Acting. Elsevier.
  • [Granada et al. 2017] Granada, R., Pereira, R. F., Monteiro, J., Barros, R., Ruiz, D., & Meneguzzi, F. 2017. Hybrid Activity and Plan Recognition for Video Streams. In The AAAI 2017 Workshop on Plan, Activity, and Intent Recognition.
  • [Helmert 2011] Helmert, M. 2011. The Fast Downward Planning System. Computing Research Repository (CoRR), abs/1109.6051.
  • [Helmert & Domshlak 2009] Helmert, M., & Domshlak, C. 2009. Landmarks, Critical Paths and Abstractions: What’s the Difference Anyway? In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS 2009).
  • [Hoffmann & Nebel 2001] Hoffmann, J., & Nebel, B. 2001. The FF Planning System: Fast Plan Generation Through Heuristic Search. Journal of Artificial Intelligence Research (JAIR), 14(1), 253–302.
  • [Hoffmann et al. 2004] Hoffmann, J., Porteous, J., & Sebastia, L. 2004. Ordered Landmarks in Planning. Journal of Artificial Intelligence Research (JAIR), 22(1), 215–278.
  • [Hong 2001] Hong, J. 2001. Goal Recognition Through Goal Graph Analysis. Journal of Artificial Intelligence Research (JAIR), 15, 1–30.
  • [Kaminka et al. 2018] Kaminka, G. A., Vered, M., & Agmon, N. 2018. Plan Recognition in Continuous Domains. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2018).
  • [Katz et al. 2018] Katz, M., Sohrabi, S., Udrea, O., & Winterer, D. 2018. A novel iterative approach to top-k planning. In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS 2018).
  • [Keren et al. 2014] Keren, S., Gal, A., & Karpas, E. 2014. Goal Recognition Design. In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS 2014).
  • [Keren et al. 2015] Keren, S., Gal, A., & Karpas, E. 2015. Goal Recognition Design for Non-Optimal Agents. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2015), 3298–3304.
  • [Keren et al. 2016] Keren, S., Gal, A., & Karpas, E. 2016. Goal Recognition Design with Non-Observable Actions. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2016), 3152–3158.
  • [Keyder & Geffner 2008] Keyder, E., & Geffner, H. 2008. Heuristics for planning with action costs revisited. In Proceedings of the 14th European Conference on Artificial Intelligence (ECAI 2008).
  • [Keyder et al. 2010] Keyder, E., Richter, S., & Helmert, M. 2010. Sound and Complete Landmarks for And/Or Graphs. In ECAI 2010 - 19th European Conference on Artificial Intelligence, Lisbon, Portugal, August 16-20, 2010, Proceedings, 335–340.
  • [Lipovetzky et al. 2016] Lipovetzky, N., Muise, C., & Geffner, H. 2016. Traps, Invariants, and Dead-Ends. In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS 2016).
  • [Masters & Sardiña 2017] Masters, P., & Sardiña, S. 2017. Cost-Based Goal Recognition for Path-Planning. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 750–758.
  • [McDermott et al. 1998] McDermott, D., Ghallab, M., Howe, A., Knoblock, C., Ram, A., Veloso, M., Weld, D., & Wilkins, D. 1998. PDDL - The Planning Domain Definition Language. The Fourth International Conference on Artificial Intelligence Planning Systems 1998 (AIPS’98).
  • [Mirsky et al. 2017a] Mirsky, R., Gal, Y. K., & Shieber, S. M. 2017a. CRADLE: An Online Plan Recognition Algorithm for Exploratory Domains. ACM Transactions on Intelligent Systems and Technology (TIST), 8(3), 45:1–45:22.
  • [Mirsky et al. 2017b] Mirsky, R., Gal, Y. K., & Tolpin, D. 2017b. Session analysis using plan recognition. In The ICAPS 2017 Workshop on User Interfaces and Scheduling and Planning (UISP@ICAPS).
  • [Mirsky et al. 2016] Mirsky, R., Stern, R., Gal, Y. K., & Kalech, M. 2016. Sequential Plan Recognition. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2016).
  • [Mirsky et al. 2017] Mirsky, R., Stern, R., Gal, Y. K., & Kalech, M. 2017. Plan Recognition Design. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2017), 4971–4972.
  • [Pattison & Long 2010] Pattison, D., & Long, D. 2010. Domain Independent Goal Recognition. In Starting AI Researcher Symposium (STAIRS), 222, 238–250. IOS Press.
  • [Pereira et al. 2015] Pereira, A. G., Ritt, M., & Buriol, L. S. 2015. Optimal sokoban solving using pattern databases with specific domain knowledge. Artificial Intelligence, 227, 52–70.
  • [Pereira & Meneguzzi 2016] Pereira, R. F., & Meneguzzi, F. 2016. Landmark-Based Plan Recognition. In Proceedings of the 22nd European Conference on Artificial Intelligence (ECAI 2016).
  • [Pereira & Meneguzzi 2017] Pereira, R. F., & Meneguzzi, F. 2017. Goal and Plan Recognition Datasets using Classical Planning Domains. At the data repository Zenodo.
  • [Pereira et al. 2017] Pereira, R. F., Oren, N., & Meneguzzi, F. 2017. Landmark-Based Heuristics for Goal Recognition. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2017).
  • [Pynadath & Wellman 2013] Pynadath, D. V., & Wellman, M. P. 2013. Accounting for Context in Plan Recognition, with Application to Traffic Monitoring. Computing Research Repository (CoRR), abs/1302.4980.
  • [Ramírez & Geffner 2009] Ramírez, M., & Geffner, H. 2009. Plan Recognition as Planning. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2009).
  • [Ramírez & Geffner 2010] Ramírez, M., & Geffner, H. 2010. Probabilistic Plan Recognition Using Off-the-Shelf Classical Planners. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2010).
  • [Richter et al. 2008] Richter, S., Helmert, M., & Westphal, M. 2008. Landmarks Revisited. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2008), 975–982. AAAI Press.
  • [Richter & Westphal 2010] Richter, S., & Westphal, M. 2010. The LAMA Planner: Guiding Cost-based Anytime Planning with Landmarks. Journal of Artificial Intelligence Research (JAIR), 39(1), 127–177.
  • [Shleyfman et al. 2015] Shleyfman, A., Katz, M., Helmert, M., Sievers, S., & Wehrle, M. 2015. Heuristics and Symmetries in Classical Planning. In Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence (AAAI 2015), 3371–3377.
  • [Sohrabi et al. 2016] Sohrabi, S., Riabov, A. V., & Udrea, O. 2016. Plan Recognition as Planning Revisited. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2016).
  • [Sukthankar et al. 2014] Sukthankar, G., Goldman, R. P., Geib, C., Pynadath, D. V., & Bui, H. H. 2014. Plan, Activity, and Intent Recognition: Theory and Practice. Elsevier.
  • [Uzan et al. 2015] Uzan, O., Dekel, R., Seri, O., & Gal, Y. K. 2015. Plan Recognition for Exploratory Learning Environments Using Interleaved Temporal Search. AI Magazine, 36(2), 10–21.
  • [Vered & Kaminka 2017] Vered, M., & Kaminka, G. A. 2017. Heuristic Online Goal Recognition in Continuous Domains. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2017), 4447–4454.
  • [Vered et al. 2016] Vered, M., Kaminka, G. A., & Biham, S. 2016. Online goal recognition through mirroring: Humans and agents. In Proceedings of the Annual Conference on Advances in Cognitive Systems.
  • [Vered et al. 2018] Vered, M., Pereira, R. F., Magnaguagno, M., Meneguzzi, F., & Kaminka, G. A. 2018. Towards Online Goal Recognition Combining Goal Mirroring and Landmarks. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018).
  • [Vidal & Geffner 2005] Vidal, V., & Geffner, H. 2005. Solving Simple Planning Problems with More Inference and No Search. In van Beek, P. (Ed.), Principles and Practice of Constraint Programming - CP 2005. Springer Berlin Heidelberg.
  • [Zhu & Givan 2003] Zhu, L., & Givan, R. 2003. Landmark Extraction via Planning Graph Propagation. In Printed Notes of ICAPS’03 Doctoral Consortium.