Finding plans subject to stipulations on what information they divulge

09/25/2018, by Yulin Zhang, et al.

Motivated by applications where privacy is important, we consider planning problems for robots acting in the presence of an observer. We first formulate and then solve planning problems subject to stipulations on the information divulged during plan execution---the appropriate solution concept being both a plan and an information disclosure policy. We pose this class of problem under a worst-case model within the framework of procrustean graphs, formulating the disclosure policy as a particular type of map on edge labels. We devise algorithms that, given a planning problem supplemented with an information stipulation, can find a plan, an associated disclosure policy, or both, if any exists. Both the plan and the associated disclosure policy may depend subtly on additional information available to the observer, such as whether the observer knows the robot's plan (e.g., leaked via a side-channel). Our implementation finds a plan and a suitable disclosure policy, jointly, when any such pair exists, albeit for small problem instances.




1 Introduction

Last year, iRobot announced that they intended to sell maps of people’s homes, as generated by their robot vacuum cleaners. The result was a public outcry [1]. It is increasingly clear that, as robots become part of our everyday lives, the information they could collect (indeed, may need to collect to function) can be both sensitive and valuable. Information about a robot’s internal state and its estimates of the world’s state are leaked by status displays, logged data, actions executed, and information outputted—often what the robot is tasked with doing. The tension between utility and privacy is fundamental.

Typically, robots strive to decrease uncertainty. Some prior work, albeit limited, has illustrated how to cultivate uncertainty, examining how to constrain a robot’s beliefs so that it never learns sensitive information (cf. [2, 3, 4]). In so doing, one precludes sensitive information from being disclosed to any adversary. But not disclosing secrets by simply never knowing any severely limits the applicability of the approach. This paper proposes a more general, wider-reaching model for privacy, beyond mere ingénue robots.

This work posits a potentially adversarial observer and then stipulates properties of what shall be divulged. The stipulation describes information that (being required to perform the task) must be communicated as well as information (violating the user’s privacy) that shouldn’t be. Practical scenarios where this model applies include: (i) privacy-aware care robots that assist the housebound, providing nursing care; (ii) inspection of sensitive facilities by robots to certify compliance with regulatory agreements, whilst protecting other proprietary or secret details; (iii) sending data remotely to computing services operating on untrusted cloud infrastructure.

Pebble bed facility | Breeder reactor
Figure 1: Nuclear Site Inspection  A robot inspects a nuclear facility by taking a measurement at the ‘?’ location, which depends on the facility type. The type of facility is sensitive information that must not be divulged to any external observers.

Figure 1 illustrates a scenario which, though simplistic, is rich enough to depict several aspects of the problem. The task requires that a robot determine whether some facility’s processing of raw radioactive material meets international treaty requirements or not. The measurement procedure itself depends on the type of facility as the differing physical arrangements of ‘pebble bed’ and ‘breeder’ reactors necessitate different actions. First, the robot must actively determine the facility type (checking for the presence of the telltale blue light in the correct spot). Then it can go to a location to make the measurement, the location corresponding with the facility type. But the facility type is sensitive information—the robot must ascertain the radioactivity state and ensure that the facility type is not disclosed.

What makes the present scenario interesting is that the task is immediately rendered infeasible if one prescribes a policy to ensure that the robot never gains sensitive information. Over and above the classical question of how to balance information-gathering and progress-making actions, the robot must control what it divulges, strategically increasing uncertainty as needed, precisely limiting (and reasoning about) the ‘knowledge gap’ between the external observer and itself. To solve such a problem, the robot needs a carefully constructed plan and should establish a policy characterizing what information it divulges, the former achieving the goals set for the robot, the latter respecting all stipulated constraints—and, of course, each depending on the other.

1.1 Contributions and itinerary

This paper contributes the first formulation, to our knowledge, of planning where solutions can be constrained to require that some information be communicated and other information obscured, subject to an adversarial model of an observer. Nor do we know of other work in which a plan and some notion of an interface (the disclosure policy, in our terminology) can be solved for jointly. The paper is organized as follows: after a discussion of related work, Section 3 develops the preliminaries, notation, and formalism; Section 4 gives the definition of a satisfying plan; and Section 6 treats finding plans. Sandwiched in between, Section 5 addresses an important technical detail regarding an observer’s background knowledge. The last section reports experiments conducted with our implementation. (Technicalities appear in the supplementary materials.)

2 Related work

An important topic in HRI is expressive action (e.g., see [5]). In recent years there has been a great deal of interest in mathematical models that enable generation of communicative plans. Important formulations include those of [6, 7], proposing plausible models for human observers (from the perspectives of presumed cost efficiency, surprisal, or generalizations thereof). In this prior work, conveying information becomes part of an optimization objective, whereas we treat it as a constraint instead. Both [6, 7] are probabilistic in nature; here we consider a worst-case model that is more suitable for privacy considerations: we ask what an observer can plausibly infer from the history of its received observations. In doing so, we are influenced by the philosophy of LaValle [8], following his use of the term information state (I-state) to refer to a representation of information derived from a history of observations. Finally, since parts of our stipulations may require concealing information, we point out that there is also recent work in deception (see [9, 10]) and in obfuscation [11].

3 The model: worlds, robots and observers

Figure 2 illustrates the three-way relationships underlying the setting we study. Most fundamentally, a robot executes a plan to achieve some goal in the world, and the coupling of these two elements generates a stream of observations and actions. Both the plan and the action–observation stream are disclosed, though potentially only partially, to a third party, an observer. The observer uses the stream, its knowledge of the plan, and other known structure to infer properties about the interaction. Additionally, a stipulation is provided, specifying particular properties that may (or must not) be learned by the observer. We formalize these elements in terms of p-graphs and label maps (see [12]).

Figure 2: An overview of the setting: the robot is modeled abstractly as realizing a plan to achieve some goal in the world, and the third-party observer as a filter. All three (the world, the plan, and the filter) have concrete representations as p-graphs.

3.1 P-graph and its interaction language

We will start with the definition of p-graphs [12] and related properties.

Definition (p-graph). A p-graph is an edge-labelled directed bipartite graph with , where

  • the finite vertex set , whose elements are also called states, can be partitioned into two disjoint subsets: the observation vertices and the action vertices ,

  • each edge originating at an observation vertex bears a set of observations , containing observation labels, and leads to an action vertex,

  • each edge originating at an action vertex bears a set of actions , containing action labels, and leads to an observation vertex, and

  • a non-empty set of states are designated as initial states, which may be either exclusively action states () or exclusively observation states ().

An event is an action or an observation, and altogether they comprise the sets and , which are called the p-graph’s action space and observation space, respectively. Abusing notation, we will occasionally also write and for the observation space and action space of . Similarly, the initial states will be written .

Intuitively, a p-graph abstractly represents a (potentially non-deterministic) transition system where transitions are either of type “action” or “observation” and these two alternate. The following definitions make this idea precise.

Definition (transitions to). For a given p-graph and two states , a sequence of events transitions in from to if there exists a sequence of states , such that , , and for each , there exists an edge for which .

Let the concise predicate hold if there is some way of tracing on from to , i.e., it is iff transitions to under execution . Note, when has non-deterministic transitions, may transition to multiple vertices under the same execution. We only require that be one of them.

Definition (executions and interaction language). An execution on a p-graph is a finite sequence of events for which there exist some and some such that . The set of all executions on is called the interaction language (or, briefly, just language) of and is written .

Given any edge , if or , we speak of bearing the set .

Definition (joint-execution). A joint-execution on two p-graphs and is a sequence of labels that is an execution of both and , written as . The p-graph producing all the joint-executions of and is their tensor product graph with initial states , which we denote .
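A rough sketch of the tensor product construction, under the same hypothetical (initial, edges) encoding as above: product vertices pair a vertex from each factor, and a product edge exists exactly when the two factor edges share labels, so the product’s executions are the joint-executions.

```python
# Sketch (hypothetical encoding): each p-graph is (initial, edges) with
# edges as (src, label_set, dst) triples. The tensor product pairs
# states and keeps only events both graphs can take together.

def tensor_product(init_a, edges_a, init_b, edges_b):
    init = {(u, v) for u in init_a for v in init_b}
    edges = []
    for (u, la, u2) in edges_a:
        for (v, lb, v2) in edges_b:
            common = la & lb  # events borne by both edges
            if common:
                edges.append(((u, v), common, (u2, v2)))
    return init, edges
```

An execution traceable in this product is, by construction, traceable in both factors, matching the definition of a joint-execution.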

A vertex from is a pair , where and . Next, the relationship between executions and vertices is established. The set of vertices reached by execution in , denoted , are the vertices to which the execution transitions, starting at an initial state. Symbolically, . Further, the set of executions reaching vertex in is written as . The mnemonic here is that describes sets of vertices, while describes sets of strings/executions. The collection of sets can be used to form an equivalence relation over executions, under which if and only if . This equivalence relation partitions the executions in into a set of non-empty equivalence classes: , where each equivalence class is and is a representative execution in . The intuition is that any two executions that transition to identical sets of vertices are, in an important sense, indistinguishable.

Definition (state-determined). A p-graph is in a state-determined presentation, or is in state-determined form, if .

An algorithm to expand any p-graph into a state-determined presentation is given as Algorithm 1 in Section J of the supplementary document. The language of p-graphs is not affected by state-determined expansion, i.e. .
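The paper’s actual procedure is Algorithm 1 in the supplement; the following is only an illustrative reconstruction in the spirit of NFA subset construction, again under the hypothetical (initial, edges) encoding. Each reachable set of vertices becomes a single new vertex, so every execution reaches exactly one state; splitting label sets per event changes edge grouping but not the language.

```python
# Sketch of state-determined expansion: subset construction over a
# p-graph given as (initial, edges) with (src, label_set, dst) edges.
# The language (set of traceable executions) is preserved.

def state_determined(initial, edges):
    start = frozenset(initial)
    new_edges, seen, work = [], {start}, [start]
    while work:
        frontier = work.pop()
        events = {ev for (s, ls, d) in edges if s in frontier for ev in ls}
        for e in events:
            dest = frozenset(d for (s, ls, d) in edges
                             if s in frontier and e in ls)
            new_edges.append((frontier, {e}, dest))
            if dest not in seen:
                seen.add(dest)
                work.append(dest)
    return {start}, new_edges
```

In the expanded graph each new vertex is literally the equivalence class of original vertices reached together, mirroring the discussion above.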

Note that in the preceding discussion the covering turned into a partition when we considered all the vertices reached by a string (since ), not just whether a vertex can be reached by a string. Any string covered by both and means that, whatever may be, both and . It is easy to show that is the collection of all those s whose contain . One may start with vertices and ask about the executions reaching those vertices. (Later, this will be part of how an observer makes inferences about the world.)

Given any set of vertices in p-graph , the set of executions that reach exactly (i.e. reach and reach only) is . Above, the represents the set of executions that reach every vertex in . By subtracting the ones that also reach the vertices outside , describes the set of executions that reach exactly . In Figure 3, the executions reaching are represented as . But the executions reaching and reaching only are since also reaches . Specifically, the equivalence class contains the executions that reach exactly , so we have .

Figure 3: An example showing the difference between ‘reaches’ and ‘reaches exactly’ as distinguished in notation as and .

3.2 Planning problems and plans

In the p-graph formalism, planning problems and plans are defined as follows [12].

Definition (planning problems and plans). A planning problem is a p-graph along with a goal region ; a plan is a p-graph equipped with a termination region .

Planning problem is solved by some plan if the plan always terminates (i.e., reaches ) and only terminates at a goal. Said with more precision:

Definition (solves). A plan solves a planning problem if there is some integer which bounds the length of all joint-executions, and for each joint-execution and any pair of nodes reached simultaneously by that execution, the following conditions hold:

  • if and are both action nodes then, for every label borne by each edge originating at , there exist edges originating at bearing the same action label;

  • if and are both observation nodes then, for every label borne by each edge originating at , there exist edges originating at bearing the same observation label;

  • if and then ;

  • if then some extended joint-execution exists, continuing from and , that does reach the termination region.

In the above, properties 1) and 2) describe a notion of safety; property 3) of correctness; and 4) of liveness. In the previous definition, there is an upper bound on joint-execution length. We say that plan is -bounded if, , .
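The boundedness condition can be checked directly on a product graph. Below is a hedged sketch (hypothetical encoding, not the paper’s implementation): a plan is k-bounded when no joint-execution, i.e., no path from an initial product state, has length greater than k; any reachable cycle immediately means unboundedness.

```python
# Sketch: check that every path from an initial state in a product graph
# given as (initial, edges) has length at most k. A reachable cycle
# implies arbitrarily long joint-executions, hence unboundedness.

def is_k_bounded(initial, edges, k):
    def bounded_from(v, depth, on_path):
        if depth > k or v in on_path:
            return False  # path too long, or a cycle: unbounded
        outgoing = [d for (s, ls, d) in edges if s == v]
        return all(bounded_from(d, depth + 1, on_path | {v})
                   for d in outgoing)
    return all(bounded_from(v, 0, frozenset()) for v in initial)
```

This is the liveness-adjacent half of checking a solution; the safety and correctness conditions (properties 1–3) would be checked on the same product structure.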

3.3 Information disclosure policy, divulged plan and observer

The observer sees a stream of the robot’s actions and observations, and uses them to build estimates (or to compute general properties) of the robot’s interaction with the world. But the observer’s access to this information will usually be imperfect—either by design, as a consequence of real-world imperfections, or by some combination of both. Conceptually, this is a form of partial observability in which the stream of symbols emitted as part of the robot’s execution is distorted into the symbols seen by the observer (see Figure 4). For example, if some pairs of actions are indistinguishable from the perspective of the observer, this may be expressed with a function that maps those pairs of actions to the same value. In this paper, this barrier is what we have been referring to (informally, thus far) with the phrase information disclosure policy. It is formalized as a mapping from the events in the robot’s true execution in the world p-graph to the events received by the observer.

Figure 4: The information disclosure policy, divulged plan and information stipulation. Even when the observer is a strong adversary, the disclosure policy and divulged plan can limit the observer’s capabilities effectively.

Definition (information disclosure policy). An information disclosure policy is a label map on p-graph , mapping from elements in the combined observation and action space to some set of events .

The word “policy” hints at two interpretations: first, as something given as a predetermined arrangement (that is, as a rule); second, as something to be sought (as in finding a policy to solve a decision problem). Both senses apply in the present work; the exact transformation describing the disclosure of information will be used first (in Section 4) as a specification and then, later (in Section 6.3), as something which planning algorithms can produce. How the information disclosure policy is realized in some setting depends on which sense is apt: it can be interpreted as describing observers (showing that for those observers unable to tell from , the stipulations can be met), or it can inform robot operation (the stipulations require that the robot obfuscate and via means such as explicit concealment, sleight-of-hand, misdirection, etc.)

The observer, in addition, may have imperfect knowledge of the robot’s plan, which may be leaked or communicated via a side-channel. The divulged plan is also modeled as a p-graph, and knowing it may be weaker than knowing the actual plan. A variety of different types of divulged plan are introduced later (in Section 5) to model different prior knowledge available to an observer; as we will show, despite their differences, they can be treated in a single unified way.

The next step is to provide formal definitions for the ideas just described. In the following, we refer to as the map from the set to some set , and refer to its preimage as the map from to subsets of . The notation for a label map and its preimage is extended in the usual way to sequences and sets: we consider sets of events, executions (being sequences), and sets of executions. They are also extended to p-graphs in the obvious way, by applying the function to all edges. (Elaboration, if in doubt, appears in the supplementary material, Section A.)
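These extensions are mechanical, and a small sketch may help fix ideas (names and the dict-based encoding of the label map are illustrative, not the paper’s): a map on events lifts elementwise to sequences, to label sets on edges of a p-graph, and its preimage sends an image symbol back to all events mapping to it.

```python
# Sketch of a label map h (a dict from events to image symbols) and its
# standard extensions: to executions, to a p-graph given as
# (initial, edges) with (src, label_set, dst) edges, and its preimage.

def map_execution(h, execution):
    """Apply h elementwise to a sequence of events."""
    return tuple(h[e] for e in execution)

def map_pgraph(h, initial, edges):
    """Relabel every edge by the image of its label set."""
    return set(initial), [(s, {h[e] for e in ls}, d) for (s, ls, d) in edges]

def preimage(h, symbol):
    """All events that h maps to the given image symbol."""
    return {e for e in h if h[e] == symbol}
```

Note that a non-injective h collapses distinct events, which is precisely how a disclosure policy renders actions indistinguishable to the observer.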

Definition (I-state graph). For planning problem , plan and information disclosure policy , an observer’s I-state graph is a p-graph, whose inputs are from the image space of (i.e., ), with . The action space and observation space of are also written as and .

The observer’s I-state graph is a p-graph with events in , the outputs of , which, for the sake of brevity, we will refer to simply as ‘the image space’. By having , we are requiring that strings generated in the world can be safely traced on . Whether and are disjoint or not depends on their initial disjointness and the structure of .

Using the notation above, we will frequently speak of . Next are some basic properties of the vertices and executions of .

Lemma (properties of ). Given and , the following hold:

  1. .

  2. , .

  3. .

  4. .

Proofs appear in the supplementary material, Section B.1.

Next we present a core definition of the paper. The crucial aspect to be formalized is the connection from the interaction of the robot and world, via the stream of symbols generated, to the state tracked by the observer. Inference runs from the observer back to the world, but causality proceeds from the robot–world to the observer (glance again at Figure 2). We begin, accordingly, with the latter direction.

Definition (compatible world states). Given observer I-state graph , robot’s plan , world graph , and label map , the world state is compatible with the set of I-states if such that .

Informally, each of the three terms can be interpreted as:

  1. An observer with I-state graph may ask which sequences are responsible for having arrived at states . The answer is the set , an equivalence class of strings identical up to those states, the executions contained therein being indistinguishable up to states in . Those strings are in the image space , so, to obtain an answer in the world , their preimages must be taken. Every execution in leads the observer to . (Note that information is degraded both by and ; an example clarifying this appears in Figure B-8 in the supplementary materials.)

  2. The set of executions that may be executed by the robot is represented by . If the observer knows that the robot’s plan takes, say, the high road, this information allows the observer to remove executions involving the robot along the low road.

  3. The set of executions reaching world state is represented by . Two world states are essentially indiscernible or indistinguishable if , as the sets capture the intrinsic uncertainty of world .

When an observer is in , and is compatible with , there exists some execution, a certificate, witnessing that the world could plausibly be in , subject to (1) the current information summarized in ; (2) the robot’s plan; and (3) the structure of the world. The set of all world states that are compatible with is denoted , which is the observer’s estimate of the world states once known information about , and has been incorporated.

A typical observer may know less about the robot’s future behavior than the robot’s full plan. Weaker knowledge of how the robot will behave can be expressed in terms of some p-graph , such that . Here the mnemonic is that it is the divulged information about the robot’s plan, which one might imagine as leaked or communicated via a side-channel. Note that is divulged in the preimage space. (We assume this because it leads to a stronger adversary; if the plan information is divulged in the image space, some additional degradation of knowledge occurs—results showing this appear in the supplementary material, Section C.)

Definition 4 requires the simple substitution of the second term in the intersection with . When only is given, one can only approximate with :

Definition (estimated world states). Given an I-state graph , divulged plan p-graph , world p-graph , and label map , the set of estimated world states for I-states is .

Note that has been replaced with , via Lemma 4.iii.

The last remaining element in Figure 4 that needs to be addressed is the stipulation of information. We do that next.

3.4 Information stipulations

We prescribe properties of the information that an observer may extract from its input by imposing constraints on the sets of estimated world states. The observer, filtering a stream of inputs sequentially, forms a correspondence between its I-states and world states. We write propositional formulas with semantics defined in terms of this correspondence—in this model the stipulations are written to hold over every reachable set of associated states. (We foresee other variants, which would be straightforward modifications to consider, but we report only on our current implementation. Within the supplementary material, Figure D-10 summarizes the semantics.)

First, however, we must delineate the scope of the estimated world states to be constrained. Some states, in inherently non-deterministic worlds, may be inseparable because they are reached by the same execution. In Figure 3, both and will be reached (non-deterministically) by execution . Since this is intrinsic to the world, even when the observer has perfect observations, they remain indistinguishable. In the remainder of this paper, we will assume that the world graph is in state-determined form, and we may affix stipulations to the world states knowing that no two vertices will be non-deterministically reached by the same execution.

Suppose, for example, we wish to require that state be included in the observer’s estimates of the world whenever is; this would be expressed via . Evaluation of such formulas takes place as follows. For a set of I-states reached under some operation of the robot in the world, is connected with in that evaluates to for iff , where is the set of estimated world states for I-states . The opposite condition, where , is written naturally as . Standard connectives , , enable composite expressions for complex stipulations to be built.
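This evaluation scheme is small enough to sketch directly (the tuple encoding of formulas is illustrative, not the paper’s): atoms name world states, an atom holds for a set of I-states exactly when the named state is among the estimated world states, and the connectives compose as usual; a plan satisfies the stipulation when it holds for every reachable estimate.

```python
# Sketch of stipulation semantics. Formulas are nested tuples:
# ('atom', w), ('not', f), ('and', f, g), ('or', f, g), ('implies', f, g).
# `estimated` is the set of estimated world states for some I-states.

def holds(formula, estimated):
    op = formula[0]
    if op == 'atom':
        return formula[1] in estimated
    if op == 'not':
        return not holds(formula[1], estimated)
    if op == 'and':
        return holds(formula[1], estimated) and holds(formula[2], estimated)
    if op == 'or':
        return holds(formula[1], estimated) or holds(formula[2], estimated)
    if op == 'implies':
        return (not holds(formula[1], estimated)) or holds(formula[2], estimated)
    raise ValueError(op)

def satisfies(stipulation, all_estimates):
    """Holds for every reachable set of estimated world states."""
    return all(holds(stipulation, d) for d in all_estimates)
```

The running example, “w1 must be estimated whenever w2 is,” becomes the implication ('implies', ('atom', 'w2'), ('atom', 'w1')).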

Let the predicate denote whether the stipulation holds for I-states . Then a plan satisfies the stipulations if and only if

4 Verifying plans and stipulations: the Check problem

Given everything involved, an important initial step is to successfully recognize a solution to the problem, including determining whether the constraints have been met.

Problem: Check
Input: A planning problem , a plan , a divulged plan p-graph , an I-state graph , an information disclosure policy , and an information stipulation .
Output: if plan solves the problem , and the information stipulation is always evaluated as on (i.e., ); otherwise.

One thing demands further explanation: is used as a replacement for the world ; this deals with a technical nuance, a sort of inference gained for free upfront, most easily handled by transforming the inputs. Given a world and a disclosed p-graph, oftentimes certain parts of the p-graph can be determined to be irrelevant a priori—for instance, if we know, via , that the robot is executing a plan, then all non-goal cul-de-sacs can be excised (yielding a with ).

4.1 Does the plan solve the planning problem?

To determine whether the plan solves the planning problem , safety, correctness and liveness must all be checked. The procedure, which involves some care but is straightforward, is deferred to Section J of the supplement.

4.2 Are the stipulations satisfied?

An important preliminary step in determining whether the stipulation is satisfied is to establish the correspondence from sets of observer I-states to estimated world states. To accomplish this, for sets of I-states, we compute .

First, one examines those sets of I-states that can arise by dint of the observer perceiving the image of executions under . According to Definition 3.1, those sets correspond to equivalence classes of the images of executions. We can obtain exactly the sets of I-states of interest by expanding into its state-determined form ; the expansion process produces a single new state for each equivalence class. Following this, the preimage p-graph will also be in state-determined form.

Figure 5: An example of product graph formed from some , , and .

Next, we find the estimated world states for each vertex in , by simply realizing Definition 4 constructively: a world state corresponds with an observer I-state if there exists a joint-execution in , and that reaches I-state , , and some plan state. Note that and