Whither Programs as Specifications

06/09/2019 · by David A. Naumann, et al. · Stevens Institute of Technology

Unifying theories distil common features of programming languages and design methods by means of algebraic operators and their laws. Several practical concerns---e.g., improvement of a program, conformance of code with design, correctness with respect to specified requirements---are subsumed by the beautiful notion that programs and designs are special forms of specification and their relationships are instances of logical implication between specifications. Mathematical development of this idea has been very successful but limited to an impoverished notion of specification: trace properties. Some mathematically precise properties of programs---dubbed hyperproperties---refer to traces collectively. For example, confidentiality involves knowledge of possible traces. This article reports on both obvious and surprising results about lifting algebras of programming to hyperproperties, especially in connection with loops, and suggests directions for further research. The main results are: the first compositional semantics of imperative programs with loops, at the hyper level, and a proof that this semantics is the same as the direct image of a standard semantics, for subset closed hyperproperties.


1 Introduction

A book has proper spelling provided that each of its sentences does. For a book to be captivating and suspenseful — that is not a property that can be reduced to a property of its individual sentences. Indeed, few interesting properties of a book are simply a property of all its sentences. By contrast, many interesting requirements of a program can be specified as so-called trace properties: there is some property of traces (i.e., observable behaviors) which must be satisfied by all the program’s traces.

The unruly mess of contemporary programming languages, design tools, and approaches to formal specification has been given a scientific basis through unifying theories that abstract commonalities by means of algebraic operators and laws. Algebra abstracts from computational notions like partiality and nondeterminacy by means of operators that are interpreted as total functions and which enable equational reasoning. Several practical concerns — such as improving a program's resource usage while not altering its observable behavior, checking conformance of code with design architecture, checking satisfaction of requirements, and equivalence of two differently presented designs — are subsumed by the beautiful notion that programs and designs (this paper was written with the UTP [19] community in mind, but our use of the term "design" is informal and does not refer to the technical notion in UTP) are just kinds of specification and their relationships are instances of logical implication between specifications. Transitivity of implication yields the primary relationship: the traces of a program are included in the traces allowed by its specification. The mathematical development of this idea has been very successful — for trace properties.

Not all requirements are trace properties. A program should be easy to read, consistent with dictates of style, and amenable to revision for adapting to changed requirements. Some though not all such requirements may be addressed by mathematics; e.g., parametric polymorphism is a form of modularity that facilitates revision through reuse. In this paper we are concerned with requirements that are extensional in the sense that they pertain directly to observable behavior. For a simple example, consider a program acting on two variables x and s, where the initial value of s is meant to be a secret on which the final value of x must not depend. Consider this simple notion of program behavior: a state assigns values to variables, and a trace is a pair comprising the initial and final states. The requirement cannot be specified as a trace property, but it can be specified as follows: for any two traces (σ, σ′) and (τ, τ′), if the initial states σ and τ have the same value for x then so do the final states σ′ and τ′. In symbols: σ(x) = τ(x) implies σ′(x) = τ′(x).
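As a minimal executable sketch of this two-trace condition (Python; the variables x and s, the tiny value space, and all names are illustrative assumptions, not taken from the paper), a program is modeled as a set of (initial, final) state pairs:

```python
from itertools import product

# A state maps variable names to values; we freeze it as a tuple so it is hashable.
def state(x, s):
    return (('x', x), ('s', s))

def val(st, var):
    return dict(st)[var]

# Deterministic noninterference: for any two traces, if the initial states
# agree on x then the final states agree on x.
def satisfies_ni(traces):
    return all(val(t1, 'x') != val(t2, 'x') or val(u1, 'x') == val(u2, 'x')
               for (t1, u1), (t2, u2) in product(traces, traces))

# Example: final x does not depend on the initial s.
leak_free = {(state(x, s), state(x + 1, 0)) for x in range(2) for s in range(2)}
# Example: the secret s is copied into x.
leaky = {(state(x, s), state(s, s)) for x in range(2) for s in range(2)}

assert satisfies_ni(leak_free)
assert not satisfies_ni(leaky)
```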

Some requirements involve more than two traces, e.g., “the average response time is under a millisecond” can be made precise by averaging the response time of each trace, over all traces, perhaps weighted by a distribution that represents likelihood of different requests. For a non-quantitative example, consider the requirement that a process in a distributed system should know which process is the leader: something is known in a given execution if it is true in all possible traces that are consistent with what the process can observe of the given execution (such as a subset of the network messages). In the security literature, some information flow properties are defined by closure conditions on the program’s traces, such as: for any two traces, there exists a trace with the high (confidential) events of the first and the low (public) events of the second.

This paper explores the notion that just as a property of books is a set of books, not necessarily defined simply in terms of their sentences, so too a property of programs is a set of programs, not merely a set of traces. The goal is to investigate how the algebra of programming can be adapted for reasoning about non-trace properties. To this end, we focus on the most rudimentary notion of trace, i.e., pre/post pairs, and rudimentary program constructs. We conjecture that the phenomena and ideas are relevant to a range of models, perhaps even the rich notions of trace abstracted by variations on concurrent Kleene algebra [20].

It is unfortunate that the importance of trace properties in programming has led to well established use of the term “property” for trace property, and recent escalation in terminology to “hyperproperty” to designate the general notion of program property — sets of programs rather than sets of traces [10, 9]. Some distinction is needed, so for clarity and succinctness we follow the crowd. The technical contribution of this paper can now be described as follows: we give a lifting of the fixpoint semantics of loops to the “hyper level”, and show anomalies that occur with other liftings. This enables reasoning at the hyper level with usual fixpoint laws for loops, while retaining consistency with standard relational semantics. Rather than working directly with sets of trace sets, our lifting uses a simpler model, sets of state sets; this serves to illustrate the issues and make connections with other models that may be familiar. The conceptual contribution of the paper is to call attention to the challenge of unifying theories of programming that encompass requirements beyond trace properties.

Outline.

Section 2 describes a relational semantics of imperative programs and defines an example program property that is not a trace property. Relational semantics is connected, in Section 3, with semantics mapping sets to sets, like forward predicate transformers. Section 4 considers semantics mapping sets of sets to the same, this being the level at which hyperproperties are formulated. Anomalies with obvious definitions motivate a more nuanced semantics of loops. The main technical result of the paper is Theorem 4.1 in that section, connecting the semantics of Section 4 with that of Section 3. Section 5 connects the preceding results with the intrinsic notion of satisfaction for hyperproperties, and sketches challenges in realizing the dream of reasoning about hyperproperties using only refinement chains. The semantics and theorem are new, but similar to results in prior work discussed in Section 6. Section 7 concludes.

2 Programs and specifications as binary relations

Preliminaries.

We review some standard notions, to fix notation and set the stage. Throughout the paper we assume Σ is a nonempty set, which stands for the set of program states, or data values, on which programs act. For any sets A, B, let A ↔ B denote the binary relations from A to B; that is, A ↔ B is ℘(A × B), where ℘ means powerset. Unless otherwise mentioned, we consider powersets, including A ↔ B, to be ordered by inclusion (⊆).

We write A → B for the set of functions from A to B. For composition of relations, and in particular composition of functions, we use the infix symbol ; in the forward direction. Thus for relations R ∈ A ↔ B, S ∈ B ↔ C and elements a, c we have a (R ; S) c iff a R b and b S c for some b. For a function f and element x we write application as f x and let it associate to the left. Composition of functions is also written with ;, as functions are treated as special relations, so (f ; g) x = g (f x). The symbol ; binds tighter than ∪ and other operators.

For a relation R ∈ A ↔ B, the direct image ⟨R⟩ is a total function ℘(A) → ℘(B) defined by b ∈ ⟨R⟩ X iff a R b for some a ∈ X. It faithfully reflects ordering of relations:

R ⊆ S  iff  ⟨R⟩ ⊑ ⟨S⟩

where ⊑ means pointwise order (i.e., f ⊑ g iff f X ⊆ g X for all X). We write ⊔ for pointwise union, defined by (f ⊔ g) X = f X ∪ g X. The ⊑-least element is the function mapping every X to ∅, abbreviated as ⊥. A relation can be recovered from its direct image:

R  =  sngl ; ⟨R⟩ ; ∋    (1)

where sngl maps element a to the singleton set {a} and ∋ is the converse of the membership relation. Note that ⊥ is the direct image of the empty relation. Direct image is functorial and distributes over union:

⟨R ; S⟩ = ⟨R⟩ ; ⟨S⟩    ⟨id_A⟩ = id_{℘(A)}    ⟨R ∪ S⟩ = ⟨R⟩ ⊔ ⟨S⟩

We write id_A for the identity function on the set indicated. In fact ⟨−⟩ distributes over arbitrary union, i.e., it sends any union of relations to the pointwise join of their images. Also, ⟨R⟩ is universally disjunctive, and (1) forms a bijection between universally disjunctive functions ℘(A) → ℘(B) and relations A ↔ B.
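A small executable illustration of these facts (Python; the sets and function names are mine): the direct image, recovery of a relation from its image as in (1), and functoriality and distribution over union.

```python
from itertools import chain, combinations

# Relations are sets of pairs; the direct image sends a set of inputs to the
# set of their successors.
def direct_image(rel):
    return lambda xs: {b for (a, b) in rel if a in xs}

def compose(r, s):          # forward composition r ; s
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}

A = {0, 1, 2}
R = {(0, 1), (0, 2), (1, 2)}
S = {(1, 1), (2, 0)}

# (1): a relation is recovered from its direct image via singletons.
assert R == {(a, b) for a in A for b in direct_image(R)({a})}

# Functoriality and distribution over union, checked pointwise on all subsets of A.
subsets = [set(c) for c in chain.from_iterable(combinations(A, k) for k in range(4))]
for X in subsets:
    assert direct_image(compose(R, S))(X) == direct_image(S)(direct_image(R)(X))
    assert direct_image(R | S)(X) == direct_image(R)(X) | direct_image(S)(X)
```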

In this paper we use the term transformer for monotonic functions of type ℘(Σ) → ℘(Σ). For f to be monotonic means that X ⊆ Y implies f X ⊆ f Y.

We write μ for the least-fixpoint operator. For monotonic functions F : C → C and G : D → D, where C and D are sufficiently complete posets that μF and μG exist, the fixpoint fusion rule says that for strict and continuous H : C → D,

H(μF) = μG   provided   F ; H = H ; G    (2)

Inequational forms, such as H(μF) ⊑ μG provided F ; H ⊑ H ; G, are also important. (Fusion rules, also called fixpoint transfer, can be found in many sources, e.g., [1, 4]. We need the form in Theorem 3 of [12], for Kleene approximation of fixpoints.)
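The fusion rule can be exercised on finite powerset lattices, where least fixpoints are reached by Kleene iteration; in this sketch (Python) the relations, the homomorphism h, and the functions F, G, H are illustrative choices satisfying the antecedent of (2).

```python
def lfp(f, bottom):
    # Kleene iteration in a finite lattice: iterate from bottom until stable.
    x = bottom
    while True:
        y = f(x)
        if y == x:
            return x
        x = y

def image(rel, xs):
    return {b for (a, b) in rel if a in xs}

R = {(x, (x + 1) % 4) for x in range(4)}   # successor mod 4 on {0,1,2,3}
S = {(y, (y + 1) % 2) for y in range(2)}   # successor mod 2 on {0,1}
h = {(x, x % 2) for x in range(4)}         # homomorphism: R;h = h;S

F = lambda X: {0} | image(R, X)            # monotone on the powerset of {0,1,2,3}
G = lambda Y: {0} | image(S, Y)            # monotone on the powerset of {0,1}
H = lambda X: image(h, X)                  # strict and continuous (a direct image)

# The antecedent F;H = H;G holds here, so H applied to the least fixpoint of F
# equals the least fixpoint of G.
assert H(lfp(F, set())) == lfp(G, set())
```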

Relational semantics.

The relational model suffices for reasoning about terminating executions. If we write spec for the relation {(σ, σ′) | σ′(x) ≥ σ(x) + 2}, specifying a program that increases x by at least two, we can write this simple refinement chain:

spec  ⊇  add3 ∪ add5  ⊇  add3

to express that the nondeterministic choice (⊓, modeled as union) between adding 3 (add3) and adding 5 (add5) refines the specification and is refined in turn by the first alternative. Relations model a good range of operations including relational converse and intersection which are not implementable in general but are useful for expressing specifications. Their algebraic laws facilitate reasoning. For example, choice is modeled as union, so the second step is from a law of set theory: R ∪ S ⊇ R.

Equations and inequations may serve as specifications. For example, to express that relation R is deterministic we can write R˘ ; R ⊆ id, where R˘ is the converse of R. Note that this uses two occurrences of R. Returning to the example in the introduction, suppose R ∈ Σ ↔ Σ relates states with variables x and s. To formulate the noninterference property NI, that the final value of x is independent of the initial value of s, it is convenient to define a relation ≈ on states that says they have the same value for x: define σ ≈ τ iff σ(x) = τ(x). The property is

R˘ ; ≈ ; R  ⊆  ≈

This is a form of determinacy: instantiating ≈ to the identity gives determinacy itself. A weaker notion allows multiple outcomes for x but requires the set of possibilities to be independent of the initial value of s.

This is known as possibilistic noninterference. It can be expressed without quantifiers, by the usual simulation inequality:

≈ ; R  ⊆  R ; ≈    (3)

There is another equivalent form, which again uses two occurrences of R. The algebraic formulations are attractive, but recall the beautiful idea of correctness proof as a chain of refinements

This requires the specification to itself be a term in the algebra, rather than an (in)equation between terms.

Before proceeding to investigate this issue, we recall the well known fact that possibilistic noninterference is not closed under refinement of trace sets [21]. Consider x and s ranging over bits, so we can write pairs compactly, and consider a set of traces that satisfies possibilistic noninterference but whose subsets need not: removing the right pairs leaves a program that copies s to x.
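Since the specific trace set is not reproduced above, the following sketch (Python; the (x, s) encoding of states is an assumption) exhibits an illustrative two-bit instance of the phenomenon: the full trace set satisfies possibilistic noninterference, while a subset of it copies the secret.

```python
from itertools import product

BITS = (0, 1)

def poss_ni(traces):
    # For any two traces whose initial states agree on x (component 0), every
    # final value of x in one must be matched by some trace of the other.
    traces = set(traces)
    for (s1, t1), (s2, _) in product(traces, traces):
        if s1[0] == s2[0]:
            if not any(u2[0] == t1[0] for (v2, u2) in traces if v2 == s2):
                return False
    return True

# States are pairs (x, s); the program sets x nondeterministically and keeps s.
full = {((x, s), (x2, s)) for x in BITS for s in BITS for x2 in BITS}
# A subset of it: the traces in which the final x happens to equal s.
leaky_subset = {((x, s), (s, s)) for x in BITS for s in BITS}

assert leaky_subset <= full
assert poss_ni(full)
assert not poss_ni(leaky_subset)
```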

In the rest of this paper, we focus on deterministic noninterference, NI for short. It has been advocated as a good notion for security [35] and it serves our purposes as an example.

A signature and its relational model.

To investigate how NI and other non-trace properties may be expressed and used in refinement chains, it is convenient to focus on a specific signature, the simple imperative language over given atoms (ranged over by a) and boolean expressions (ranged over by b).

c ::= a | skip | c ; c | c ⊓ c | if b then c else c | while b do c    (4)

For expository purposes we refrain from decomposing the conditional and iteration constructs in terms of choice (⊓) and assertions. That decomposition would be preferred in a more thorough investigation of algebraic laws, and it is evident in the semantic definitions to follow.

Assume that for each atom a we are given a relation ⟦a⟧ in Σ ↔ Σ, and for each boolean expression b a coreflexive relation ⟦b⟧. That is, ⟦b⟧ is a subset of the identity relation on Σ. For non-atom commands the relational semantics ⟦c⟧ is defined in Fig. 1. The fixpoint for loops is taken in Σ ↔ Σ, ordered by ⊆ with least element the empty relation. (It is well known that loops are expressible in terms of recursion: while b do c can be expressed as the least fixpoint of X ↦ if b then (c ; X) else skip, and this is the form we use in the semantics. A well known law factors out the termination condition.)

Figure 1: Relational semantics ⟦c⟧, with ⟦a⟧ assumed to be given.
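The equations of Fig. 1 are not reproduced here, but a standard relational semantics consistent with the surrounding text can be sketched as follows (Python; the finite state space, the atoms, and the Kleene-iteration computation of the loop fixpoint are my choices for illustration).

```python
# Commands denote relations (sets of state pairs) over a finite state space;
# boolean expressions denote coreflexive relations (subsets of the identity).
def ident(states):
    return {(s, s) for s in states}

def seq(r, s):
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}

def choice(r, s):
    return r | s

def cond(test, r, s, states):
    neg = ident(states) - test
    return seq(test, r) | seq(neg, s)

def loop(test, body, states):
    # while-loop as least fixpoint of W |-> if test then (body ; W) else skip,
    # computed by Kleene iteration from the empty relation.
    neg = ident(states) - test
    w = set()
    while True:
        w2 = seq(test, seq(body, w)) | neg
        if w2 == w:
            return w
        w = w2

# Example: states are integers 0..5, atom 'dec' subtracts 1, test is x > 0.
STATES = set(range(6))
dec = {(x, x - 1) for x in STATES if x > 0}
pos = {(x, x) for x in STATES if x > 0}

w = loop(pos, dec, STATES)
assert w == {(x, 0) for x in STATES}   # the loop drives every state to 0
```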

The language goes beyond ordinary programs, in the sense that atoms are allowed to be unboundedly nondeterministic. They are also allowed to be partial; coreflexive atoms serve as assume and assert statements. Other ingredients are needed for a full calculus of specifications, but here our aim is to sketch ideas that merit elaboration in a more comprehensive theory.

3 Programs as forward predicate transformers

Here is yet another way to specify NI for a relation R ∈ Σ ↔ Σ:

⟨R⟩ X ∈ Agree  for every X ∈ Agree

where Agree says that all elements of a set agree on x:

X ∈ Agree  iff  σ(x) = τ(x) for all σ, τ ∈ X

As with the preceding (in)equational formulations, like (3), this is not directly applicable as the specification in a refinement chain, but it does hint that escalating to sets of states may be helpful. Note that R occurs just once in the condition.

Weakest-precondition predicate transformers are a good model for programming algebra: Monotonic functions can model total correctness specifications with both angelic and demonic nondeterminacy. In this paper we use transformers to model programs in the forward direction.

For boolean expression b we define ⟨⟦b⟧⟩ so that it is a filter: σ is in ⟨⟦b⟧⟩ X iff σ ∈ X and b is true of σ. The transformer semantics T⟦c⟧ is in Fig. 2. For loops, the fixpoint is taken with respect to the aforementioned pointwise order ⊑ and least element ⊥.

Figure 2: Transformer semantics T⟦c⟧.
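Similarly, a sketch of a forward transformer semantics in the spirit of Fig. 2 (Python, self-contained; all names are mine): the loop fixpoint is computed by iterating from the bottom transformer over all subsets of a small finite state space, and the final assertions check consistency with the relational loop, anticipating Proposition 1 below.

```python
from itertools import chain, combinations

# Forward transformers: a command denotes a monotonic function on sets of
# states; a boolean expression denotes a filter.
def img(rel):
    return lambda xs: {b for (a, b) in rel if a in xs}

def t_filter(test_pred, xs):
    return {s for s in xs if test_pred(s)}

def t_loop(test_pred, t_body, states):
    # Least fixpoint of T |-> (filter test ; body ; T) joined with filter (not test),
    # computed pointwise on every subset of the (finite) state space.
    subsets = [frozenset(c) for c in
               chain.from_iterable(combinations(sorted(states), k)
                                   for k in range(len(states) + 1))]
    table = {xs: frozenset() for xs in subsets}          # bottom transformer
    while True:
        new = {xs: frozenset(table[frozenset(t_body(t_filter(test_pred, xs)))]
                             | t_filter(lambda s: not test_pred(s), xs))
               for xs in subsets}
        if new == table:
            return lambda xs: set(table[frozenset(xs)])
        table = new

STATES = set(range(4))
dec_rel = {(x, x - 1) for x in STATES if x > 0}
positive = lambda s: s > 0

t_while = t_loop(positive, img(dec_rel), STATES)
assert t_while({3, 1}) == {0}     # every start state is driven to 0

# Consistency with the relational loop (cf. Proposition 1): the direct image of
# the relational while-loop agrees with the transformer while-loop.
rel_while = {(x, 0) for x in STATES}
for xs in [set(), {0}, {2}, {1, 3}, STATES]:
    assert img(rel_while)(xs) == t_while(xs)
```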

Linking transformer with relational.

The transformer model may support a richer range of operators than the relational one, but for several reasons it is important to establish their mutual consistency on a common set of operators [18, 19]. A relation can be recovered from its direct image, see (1), so the following is a strong link.

Proposition 1

For all c in the signature, ⟨⟦c⟧⟩ = T⟦c⟧.

Proof

By induction on .

  • : by definitions and law.

  • : by definition.

  • : by definitions, laws, and induction hypothesis.

  • : by definitions, laws, and induction hypothesis.

  • : by definitions, laws, and induction hypothesis.

  • : To prove , unfold the definitions to , where are defined in Figs 1 and 2. This follows by fixpoint fusion, taking in (2) to be so the antecedent to be proved is . Observe for any :

Subsets of the transformers, such as those satisfying Dijkstra's healthiness conditions, validate stronger laws than the full set of (monotonic) transformers. Healthiness conditions can be expressed by inequations, such as a determinacy inequation, and used as antecedents in algebraic laws. Care must be taken with joins: not all subsets are closed under pointwise union. Pointwise union does provide joins in the set of all transformers and also in the set of all universally disjunctive transformers.

In addition to transformers as weakest preconditions [4], another similar model is multirelations which are attractive in maintaining a pre-to-post direction [25]. These are all limited to trace properties, though, so we proceed in a different direction.

4 Programs as h-transformers

Given R ∈ Σ ↔ Σ, the image ⟨R⟩ is a function ℘(Σ) → ℘(Σ) and functions are relations, so the direct image can be taken again: ⟨⟨R⟩⟩ ∈ ℘²(Σ) → ℘²(Σ), where ℘²(Σ) abbreviates ℘(℘(Σ)). In this paper, monotonic functions of this type are called h-transformers, in a nod to hyper terminology.

The underlying relation R can be recovered by two applications of (1).

More to the point, a quantifier-free formulation of NI is now in reach. Recall that R satisfies NI iff X ∈ Agree implies ⟨R⟩ X ∈ Agree. This is equivalent to

⟨⟨R⟩⟩ Agree  ⊆  Agree    (5)

where the set of sets Agree is defined, as before, by X ∈ Agree iff all elements of X agree on the value of x. This is one motivation to investigate ℘²(Σ) → ℘²(Σ) as a model, rather than ℘(℘(Σ × Σ)), which is the obvious way to embody the idea that a program is a trace set and a property is a set of programs.
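The hyper-level formulation can be exercised concretely (Python sketch; the (x, s) state encoding and the names are assumptions): the double direct image sends a set of state sets to the set of their images, and deterministic noninterference of a relation amounts to preservation of the agreement set, as in (5).

```python
from itertools import chain, combinations

def img(rel):
    return lambda xs: frozenset(b for (a, b) in rel if a in xs)

def hyper_img(rel):
    # Direct image taken twice: a set of state sets maps to the set of images.
    return lambda xss: {img(rel)(xs) for xs in xss}

STATES = [(x, s) for x in (0, 1) for s in (0, 1)]
subsets = [frozenset(c) for c in
           chain.from_iterable(combinations(STATES, k) for k in range(len(STATES) + 1))]

# Agree: the state sets whose elements all agree on the value of x.
agree = {xs for xs in subsets if len({x for (x, _) in xs}) <= 1}

good = {((x, s), (x, 0)) for (x, s) in STATES}     # final x independent of s
bad = {((x, s), (s, s)) for (x, s) in STATES}      # copies s into x

assert hyper_img(good)(agree) <= agree             # (5) holds: NI satisfied
assert not (hyper_img(bad)(agree) <= agree)        # (5) fails: NI violated
```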

In the following we continue to write ⊔ and ⊑ for the pointwise join and pointwise order, now on ℘²(Σ) → ℘²(Σ). Please note the order is defined in terms of set inclusion at the outer layer of sets and is independent of the order on ℘(Σ). We continue to write ⊥ for the ⊑-least element.

Surprises.

For semantics using h-transformers, some obvious guesses work fine but others do not. The semantics in Fig. 3 uses operators that will be explained in due course. For boolean expressions we simply lift by direct image, defining H⟦b⟧ = ⟨⟨⟦b⟧⟩⟩. The same for command atoms, so the semantics of a is derived from the given ⟦a⟧.

Figure 3: H-transformer semantics H⟦c⟧.

The analog of Proposition 1 is that for all c in the signature, ⟨⟨⟦c⟧⟩⟩ = H⟦c⟧, allowing laws valid in relational semantics to be lifted to h-transformers. Considering some cases suggests that this could be proved by induction on c:

  • : by definitions and using that preserves identity.

  • : by definition.

  • : by definitions, distribution of over , and putative induction hypothesis.

These calculations suggest we may succeed with this obvious guess:

(6)

The induction hypothesis would give . On the other hand, . Unfortunately these are quite different because the joins are at different levels. In general, for and of type and we have whereas . Indeed, the same discrepancy would arise if we define .
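Reading the obvious guess as lifting choice and the conditional with the outer join, the level mismatch can be seen on a tiny example (Python sketch, illustrative): the image of a union of relations joins outcomes inside each result set, whereas the outer join of the two images keeps the result sets apart.

```python
def img(rel):
    return lambda xs: frozenset(b for (a, b) in rel if a in xs)

def hyper_img(rel):
    return lambda xss: {img(rel)(xs) for xs in xss}

R = {(0, 1)}
S = {(0, 2)}
XSS = {frozenset({0})}

# Image of the union: each input set yields one output set, joined inside.
inner = hyper_img(R | S)(XSS)
# Outer (pointwise) join of the images: the two output sets are kept apart.
outer = hyper_img(R)(XSS) | hyper_img(S)(XSS)

assert inner == {frozenset({1, 2})}
assert outer == {frozenset({1}), frozenset({2})}
assert inner != outer
```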

At this point one may investigate notions of “inner join”, but for expository purposes we proceed to consider a putative definition for loops. Following the pattern for relational and transformer semantics, an obvious guess is

(7)

Consider this program: . We can safely assume is and is . As there is a single variable, we can represent a state by its value, for example is a set of two states. Let us work out . Now is the limit of the chain where means applications of . Note that for any and ,

Writing for one can derive

at which point the sequence remains fixed. As in the case of conditional (6), the result is not consistent with the underlying semantics:

The result should be if we are to have the analog of Proposition 1.

A plausible inner join is defined by . This can be used to define a semantics of as well as semantics of conditional and loop; the resulting constructs are -monotonic and enjoy other nice properties.

Indeed, using in place of in (7), we get , which is exactly the lift of the transformer semantics. There is one serious problem: fails to be increasing. In particular, ; for example but . While this semantics merits further study, we leave it aside because we aim to use fixpoint fusion results that rely on Kleene approximation: This requires in order to have an ascending chain, and the use of so that is strict.

A viable solution.

Replacing singleton by powerset in the definition of , for any h-transformers we define the inner join by

For semantics of conditionals, it is convenient to define, for boolean expression , this operator on h-transformers: . It satisfies

(8)

because ). These operators are used in Fig. 3 for semantics of conditional and loop.

It is straightforward to prove is monotonic: and imply . It is also straightforward to prove

(9)

but in general equality does not hold, so we focus on .

Lemma 1

For any , is monotonic: and imply .

Proof

Keep in mind this is at the outer level: means (more sets, not bigger sets, if you will). This follows by monotonicity of , or using characterization (8) we have

which implies by and . ∎

With defined as in Fig. 3 we have the following refinement.

Lemma 2

provided that and .

Proof

This result suggests that we might be able to prove for all , but that would be a weak link between the transformer and h-transformer semantics. A stronger link can be forged as follows.

We say 𝒳 is subset closed iff sc(𝒳) = 𝒳, where the subset closure operator sc is defined by X ∈ sc(𝒳) iff X ⊆ Y for some Y ∈ 𝒳. For example, the set Agree used in (5) is subset closed. Observe that 𝒳 ⊆ sc(𝒳).
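A small sketch of subset closure (Python; the names sc and agree are mine): the agreement set used in (5) is subset closed, while an arbitrary set of state sets need not be.

```python
from itertools import chain, combinations

def sc(xss):
    # Subset closure: add every subset of every member.
    return {frozenset(c) for xs in xss
            for c in chain.from_iterable(combinations(sorted(xs), k)
                                         for k in range(len(xs) + 1))}

def subset_closed(xss):
    return sc(xss) == set(xss)

# The agreement hyperproperty over states (x, s) with one-bit x is subset closed.
states = [(x, s) for x in (0, 1) for s in (0, 1)]
all_sets = sc({frozenset(states)})
agree = {xs for xs in all_sets if len({x for (x, _) in xs}) <= 1}

assert subset_closed(agree)
assert not subset_closed({frozenset({(0, 0), (1, 1)})})   # missing its subsets
```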

Lemma 3

For transformers and condition , if then .

Proof

For the LHS, by definitions:

For the RHS, again by definitions:

Now by instantiating and , so LHS RHS is proved—as expected, given (9). If is subset closed, we get as follows. Given in , let . Then and because and are filters. And by subset closure. Taking in completes the proof of RHS LHS. ∎

Preservation of subset closure.

In light of Lemma 3, we aim to restrict attention to h-transformers on subset closed sets. To this end we introduce a few notations. The subset-closed powerset operator ℘sc is defined on powersets by

℘sc(℘(A)) = { 𝒳 ⊆ ℘(A) | 𝒳 is nonempty and subset closed }    (10)

To restrict attention to h-transformers of type ℘sc(℘(Σ)) → ℘sc(℘(Σ)) we must show that subset closure is preserved by the semantic constructs.

For any transformer , define iff . The acronym is explained by the lemma to follow. By definitions, the inclusion is equivalent to

(11)

Recall from Section 2 that the reverse, , is monotonicity of .

Lemma 4

implies preserves subset closure.

Proof

For any subset closed , is subset closed because using functoriality of , , and . ∎

It is straightforward to show . The following is a key fact, but also a disappointment that leads us away from nondeterminacy.

Lemma 5

If is a partial function (i.e., ) then .

Proof

In accord with (11) we show for any that . Suppose . Let , so for any we have iff and . We have and it remains to show , which holds because for any

Using dots to show domain and range elements, the diagram on the left is an example such that but is not a partial function. The diagram on the right is a relation, the image of which does not satisfy .

∙[r,””,dash] & ∙

∙[r,””,dash] [ru,””,dash] & ∙

∙[ru,””,dash] & ∙          ∙[r,””,dash] [rd,””,dash] & ∙

∙[r,””,dash] [ru,””,dash] & ∙

∙& ∙

As a consequence of Lemmas 4 and 5 we have the following.

Lemma 6

If is a partial function then preserves subset closure.

The theorem.

To prove the theorem, we want to identify a subset of the transformers satisfying two criteria. First, the transformer semantics can be defined within it, so in particular it is closed under the constructs in Figure 2. Second, on the subset, taking direct images is strict and continuous, to enable the use of fixpoint fusion. Strictness is the reason to disallow the empty set in (10); it makes {∅} the least element, whereas otherwise the least element would be ∅. (In [10], other reasons are given for using {∅} rather than ∅ as the false hyperproperty.) We need the subset to be closed under pointwise union, at least for chains, so that taking direct images is continuous.

Given that ⟨R⟩ is universally disjunctive for any R, Proposition 1 suggests restricting to universally disjunctive transformers. Lemma 4 suggests restricting to transformers satisfying its antecedent. But we were not able to show that the universally disjunctive transformers satisfying that condition are closed under limits. We proceed as follows.

Define and note that where is the usual domain of a relation. By a straightforward proof we have:

Lemma 7

For universally disjunctive and any we have .

Lemma 8

For universally disjunctive with , if and then .

Proof

For any with we need to show . First observe

We use to show that witnesses , as follows: using also the definition of , and and from Lemma 7 and . ∎

Lemma 9

If and preserve subset closure then is subset closed (regardless of whether is).

Proof

Suppose is in and . So according to (8) there are with , , , and . Let and , so and . Because powersets are subset closed, and are subset closed, hence and . As , we have . ∎

It is straightforward to prove that preserves subset closure if do, similar to the proof of Lemma 9. By contrast, does not preserve subset closure even if and do.

Next, we confirm that can be defined within the monotonic functions .

Lemma 10

For all , if is subset closed then so is , provided that for every .

Proof

By induction on .

  • : is so by assumption and Lemma 4.

  • : immediate.

  • : by definitions and induction hypothesis.

  • : by induction hypothesis and observation above about .

  • : by Lemma 9 and induction hypothesis.

  • : Because is least in , we have , so using monotonicity of we have Kleene iterates. Suppose is subset closed. To show is subset closed, note that where is some ordinal. We show that is subset closed, for every up to , by ordinal induction.

    • which is subset closed.

    • by definition of . Now preserves subset closure by the ordinal induction hypothesis, and preserves subset closure by the main induction hypothesis. So preserves subset closure, as does . Hence preserves subset closure by Lemma 9.

    • (for non-0 limit ordinal ), which in turn equals because is pointwise. By induction, each is subset closed, and closure is preserved by union, so we are done. ∎

Returning to the two criteria for a subset of the transformers, suppose ⟦a⟧ is a partial function for all atoms a — in short, atoms are deterministic. If in addition c is ⊓-free, then ⟦c⟧ is a partial function. Under these conditions, by Proposition 1, T⟦c⟧ is the direct image of a partial function.

Let be the subset of that are direct images of partial functions, i.e., . Observe that is closed under , because for with we have and the union is of partial functions with disjoint domains so it is a partial function. We have because is the least element in . By Lemma 6, when is restricted to , its range is included in . In , lubs of chains are given by pointwise union, so is a strict and continuous function from to .

To state the theorem, we write ≗ for extensional equality on h-transformers of type ℘²(Σ) → ℘²(Σ), i.e., equal results on all subset closed arguments.

Theorem 4.1

⟨⟨⟦c⟧⟩⟩ ≗ H⟦c⟧, provided atoms are deterministic and c is ⊓-free.

Proof

By induction on . For the cases of , atoms, and the arguments preceding (6) are still valid. For conditional, observe

Finally, the loop:

The antecedent for fusion is and it holds because for any :

5 Specifications and refinement

We wish to conceive of specifications as miraculous programs that can achieve the infeasible by refusing to act, can choose the best angelically, and can compute the uncomputable. We wish to establish rigorous connections between programs and specifications, perhaps by deriving a program that can be automatically compiled for execution, perhaps by deriving a specification that can be inspected to determine the usefulness or trustworthiness of the program. A good theory may enable automatic derivation in one direction or the other, but should also account for ad hoc construction of proofs. Simple reasons should be expressed simply, so algebraic laws and transitive refinement chains are important. In this inconclusive section, we return to the general notion of hyperproperty and consider how the h-transformer semantics sheds light on refinement for hyperproperties. Initially we leave aside the signature/semantics notations.

Let R ∈ Σ ↔ Σ be considered as a program, and let H be a hyperproperty, that is, a set of programs. Formally: H ∈ ℘(Σ ↔ Σ). For R to satisfy H means, by definition, that R ∈ H. The example of possibilistic noninterference shows that in general trace refinement is unsound: Q ∈ H does not follow from Q ⊆ R and R ∈ H. It does follow in the case that H is subset closed. Given that ℘(Σ ↔ Σ) is a huge space, one may hope that specifications of practical interest lie in relatively tame subsets. Let us focus on subset closed hyperproperties, for which one form of chain looks like

R ⊆ Q1 ⊆ ⋯ ⊆ Qn  with  Qn ∈ H    (12)

Although this is a sound way to prove R ∈ H, it does not seem sufficient, at least for examples like NI which require some degree of determinacy. The problem is that for intermediate steps of trace refinement it is helpful to use nondeterminacy for the sake of abstraction and underspecification, so finding suitable intermediate specifications Q1, …, Qn may be difficult. One approach to this problem is to use a more nuanced notion of refinement, one that preserves a hyperproperty of interest. For confidentiality, Banks and Jacob explore this approach in the setting of UTP [6].
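The soundness claim can be checked on small finite examples (Python sketch; the one-bit state encoding and the restriction to programs with at most two traces are assumptions to keep the check cheap): deterministic noninterference, viewed as a set of programs, is subset closed, so trace-set inclusion preserves membership.

```python
from itertools import chain, combinations

def powerset(xs):
    xs = sorted(xs)
    return (frozenset(c) for c in
            chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1)))

def subset_closed(h):
    # h is a hyperproperty: a set of programs, each a frozenset of trace pairs.
    return all(q in h for r in h for q in powerset(r))

# Deterministic NI over one-bit states (x, s): a program satisfies it when any
# two of its traces with the same initial x have the same final x.
def ni(r):
    return all(u1[0] == u2[0]
               for (t1, u1) in r for (t2, u2) in r if t1[0] == t2[0])

STATES = [(x, s) for x in (0, 1) for s in (0, 1)]
PAIRS = [(a, b) for a in STATES for b in STATES]

# Small universe of programs (at most two traces) to keep the check cheap.
programs = [frozenset(c) for c in
            chain.from_iterable(combinations(PAIRS, k) for k in range(3))]
H = {r for r in programs if ni(r)}

assert subset_closed(H)
# Hence refinement by trace-set inclusion preserves membership in H.
R = frozenset({((0, 0), (1, 0)), ((1, 1), (0, 1))})
Q = frozenset({((0, 0), (1, 0))})
assert Q <= R and R in H and Q in H
```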

Another form of chain looks like