Chapter 3 - Definitions

Consider the kinship dataset shown below. The situation here is the same as that described in Chapter 2. Art is the parent of Bob and Bea. Bob is the parent of Cal and Cam. Bea is the parent of Cat and Coe.

Suppose now that we wanted to express information about the grandparent relation as well as the parent relation. As illustrated in the preceding chapter, we can do this by adding facts to our dataset. In this case, we would add the facts shown below. Art is the grandparent of Cal and Cam and Cat and Coe.

Unfortunately, doing things this way is wasteful. The grandparent relation can be defined in terms of the parent relation, and so storing grandparent data as well as parent data is redundant.

A better alternative is to write rules to encode such definitions and to use these rules to compute the relations defined by these rules when needed. For example, in the case above, rather than adding grandparent facts to our dataset, we can write the following rule, safe in the knowledge that we can use the rule to compute our grandparent data.

In what follows, we distinguish two different types of relations - base relations and view relations. We encode information about base relations by writing facts in a dataset, and we define view relations by writing rules in a ruleset. In our example, parent is a base relation, and grandparent is a view relation.

Given a dataset defining our base relations and a ruleset defining our view relations, we can use automated reasoning tools to derive facts about our view relations. For example, given the preceding facts about the parent relation and our rule defining the grandparent relation, we can compute the facts about the grandparent relation.

Using rules to define view relations has multiple advantages over encoding those relations in the form of datasets. First of all, as we have just seen, there is economy: if view relations are defined in terms of rules, we do not need to store as many facts in our datasets. Second, there is less chance of things getting out of sync, e.g. if we change the parent relation and forget to change the grandparent relation. Third, view definitions work for any number of objects; they even work for applications with infinitely many objects (e.g. the integers) without requiring infinite storage.

3.2 Syntax

The language of view definitions includes the language of datasets but provides some additional features that make it more expressive, viz. variables and rules. Variables allow us to write fill-in-the-blanks queries. Query rules allow us to express compound queries, notably negations (to say that a condition is false), conjunctions (to say that several conditions are all true), and disjunctions (to say that at least one of several conditions is true).

In our query language, a variable is either a lone underscore or a string of letters, digits, and underscores beginning with an upper case letter. For example, _, X23, X_23, and Somebody are all variables.

An atomic sentence, or atom, is analogous to a factoid in a dataset except that the arguments may include variables as well as symbols. For example, if p is a binary predicate and a is a symbol and Y is a variable, then p(a,Y)is an atomic sentence.

A literal is either an atom or a negation of an atom. A simple atom is called a positive literal, The negation of an atom is called a negative literal. In what follows, we write negative literals using the negation sign ~. For example, if p(a,b) is an atom, then ~p(a,b) denotes the negation of this atom. Both are literals.

A rule is an expression consisting of a distinguished atom, called the head and a conjunction of zero or more literals, called the body. The literals in the body are called subgoals. In what follows, we write rules as in the example shown below. Here, r(X,Y) is the head; p(X,Y) & ~q(Y) is the body; and p(X,Y) and ~q(Y) are subgoals.

In what follows, we write rules as in the example shown below. Here, r(a,b) is the head; p(a,b) & ~q(b) is the body; and p(a,b) and ~q(b) are subgoals.

As we shall see in the next section, a query rule is something like a reverse implication - it is a statement that the head of the rule (i.e. the overall goal) is true whenever the subgoals are true. For example, the rule above states that goal(a,b) is true if p(a,b) is true and q(b) is not true.

The expressive power of query rules is greatly enhanced through the use of variables. Consider, for example, the rule shown below. This is a more general version of the rule shown above. Instead of applying to just the specific objects a and b it applies to all objects. In this case, the rule states that r is true of any object X and any object Y if p is true of X and Y and q is not true of Y.

One noteworthy feature of the syntax of view definitions is that we can define multiple relations in a single set of rules. For example, the following rules define both the f relation and the m relation in terms of p and q.

A second feature is that we can use view relations in defining other view relations. For example, in the following rule, we use the view relation f in our definition of g.

A third feature is that views can be used in their own definitions, thus allowing us to define relations recursively. For example, the following rules define r to be the transitive closure of p.

Unfortunately, as described thus far, our language allows for rulesets with some unpleasant properties. To avoid these problems, it is good practice to comply with some syntactic restrictions on our datasets and rulesets, viz. compatibility, semipositivity, and safety (which we describe later).

A ruleset is compatible with a dataset if and only if (1) all symbols shared between the dataset and the ruleset are of the same type (symbol or predicate), (2) all predicates have the same arity, and (3) none of the predicates in the dataset appear in the heads of any rules in the ruleset.

A semipositive rule is one in which negations apply only to base relations, i.e. there are no subgoals with negated views. A semipositive ruleset is on in which all rules are semipositive.

A rule is safe if and only if every variable that appears in the head or in any negative literal in the body also appears in at least one positive literal in the body.

The rule shown below is safe. Every variable in the head and every variable in the negative subgoal appears in a positive subgoal in the body. Note that it is okay for the body to contain variables that do not appear in the head.

By contrast, the two rules shown below are not safe. The first rule is not safe because the variable Z appears in the head but does not appear in any positive subgoal. The second rule is not safe because the variable Z appears in a negative subgoal but not in any positive subgoal.

Some view definition languages allow rulesets that violate these restrictions. This provides greater expressiveness. However, the meaning of rules in such languages is much ore complicated than in our language, and the cost of using such rules is potentially much greater. Fortunately, for our introductory look at Computational Law, we can get by with a more restricted language, and we are spared unnecessary theoretical complexity and practical inefficiency.

3.3 Semantics

The semantics of a ruleset in our language can be formalized by defining the result of applying the view definitions in the ruleset to the facts in a compatible dataset. We use the word extension to refer to the set of all facts that can be "deduced" in this way.

An instance of an expression (atom, literal, or rule) is one in which all variables have been consistently replaced by ground terms. For example, if we have a language with object constants a and b, then r(a) :- p(a,a), r(a) :- p(a,b), r(b) :- p(b,a), and r(b) :- p(b,b) are all instances of r(X) :- p(X,Y).

Given this notion, we define the result of the application of a single rule to a dataset as follows. Given a rule r and a dataset Δ, let v(r,Δ) be the set of all ψ such that (1) ψ is the head of an arbitrary instance of r, (2) every positive subgoal in the instance is a member of Δ, and (3) no negative subgoal in the instance is a member of Δ.

Using this notion, we define the result of repeatedly applying a semipositive ruleset Ω to a dataset Δ as follows. Consider the sequence of datasets defined recursively as follows. Γ₀ = Δ, and Γ_n+1 = ∪v(r,Γ₀∪...∪Γ_n) for all r in Ω. The closure of Ω on Δ is the union of the datasets in this sequence, i.e. C(Ω,Δ) = ∪Γ_i.

To illustrate our definition, let's start with a dataset describing a small directed graph. In the sentences below, we use the edge predicate to record the arcs of one particular graph.

Now, let's write some rules defining various relations on the nodes in our graph. Here, the relation p is true of nodes with an outgoing arc. The relation q is true of two nodes if and only if there is an edge from the first to the second or an edge from the second to the first. The relation r is true of two nodes if and only if there is an edge from the first to the second and an edge from the second to the first. The relation s is the transitive closure of the edge relation.

We start the computation by initializing our dataset to the edge facts listed above.

Looking at the p rule and matching its subgoals to the data in our dataset in all possible ways, we see that we can add the following facts. In this case, every node in our graph has an outgoing edge, so there is one p fact for each node.

Looking at the q rules and matching their subgoals to the data in our dataset in all possible ways, we see that we can add the following facts. In this case, we end up with the symmetric closure of the original graph.

Looking at the r rule and matching the subgoals to the data in our dataset in all possible ways, we see that we can add the following facts.

Finally, looking at the first rule for s and matching its subgoals to the data in our dataset in all possible ways, we see that we can add the following facts.

However, we are not quite done. With the facts just added, we can use the second rule to derive the following additional data.

Having done this, we can use the s rule again and can derive the following fact.

At this point, none of the rules when applied to this collection of data produces any results that are not already in the set, and so the process terminates. The resulting collection of 25 facts is the extension of this ruleset applied to this dataset.

3.4 Predefined Concepts

In practical view defnition languages, it is common to predefine useful concepts. These typically include arithmetic functions (such as plus, times, max, min), string functions (such as concatenation), equality and inequality, aggregates (such as countofall), and so forth.

In our language, equality and inequality are expressed using the relations same and distinct. The sentence same(σ,τ) is true iff σ and τ are identical. The sentence distinct(σ,τ) is true if and only if σ and τ are different.

The evaluate relation is used to represent equations involving predefined functions. For example, we would write evaluate(plus(times(3,3),times(2,3),1),16) to represent the equation 3^2+2x3+1=16. If height is a binary predicate relating a figure and its height and if width is a binary predicate relating a figure and its width, we can define the area of the object as shown below. The area of X is A if the height of X is H and the width of X is W and A is the result of multiplying H and W.

In languages that provide such predefined concepts, there are usually syntactic restrictions on their use. For example, if a query contains a subgoal with a comparison relation (such as same and distinct), then every variable that occurs in that subgoal must occur in at least one positive literal in the body and that occurrence must precede the subgoal with the comparison relation. If a query uses evaluate in a subgoal, then any variable that occurs in the first argument of that subgoal must occur in at least one positive literal in the body and that occurrence must precede the subgoal with the arithmetic relation. Details are typically found in the documentation of systems that supply such built-in concepts

In many languages, it is also common to include predefined aggregate operators, such as setofall and countofall.

Aggregate operators are typically represented as relations with special syntax. For example the following rule uses the countofall operator to request the number of a person's children. N is the number of children of X if and only if N is the count of all Y such that X is the parent of Y.

As with special relations, there are syntactic restrictions on their use. In particular, aggregate subgoals must be safe in that all variables in the second argument must be included in the first argument or must be used within positive subgoals of the rule containing the aggregate.

3.5 Example - Kinship

To illustrate the use of rules in defining views, consider once again the world of kinship relations. Starting with some base relations, we can define various interesting view relations.

For example, the first sentence below defines the father relation in terms of parent and male. The second sentence defines mother in terms of parent and female.

The rule below defines the grandparent relation in terms of the parent relation. A person X is the grandparent of a person Z if X is the parent of a person Y and Y is the parent of Z. The variable Y here is a thread variable that connects the first subgoal to the second but does not itself appear in the head of the rule.

Note that the same relation can appear in the head of more than one rule. For example, the person relation is true of a person Y if there is an X such that X is the parent of Y or if Y is the parent of some person Z. Note that in this case the conditions are disjunctive (at least one must be true), whereas the conditions in the grandfather case are conjunctive (both must be true).

A person X is an ancestor of a person Z if X is the parent of Z or if there is a person Y such that X is a parent of Y and Y is an ancestor of Z. This example shows that it is possible for a relation to appear in its own definition. (See below for a discussion of stratification for a restriction on this capability.)

A person is a parent if and only if the person has children, and a childless person is one who has no children. We can define these properties with the rules shown below. The first rule says that isparent is true of X if X is the parent of some person Y. The second rule states that a person X is childless if X is a person and and X has zero children.

Note the use of the relation isparent in defining childless. It is tempting to write the childless rule as childless(X) :- person(X) & ~parent(X,Y). However, this would be wrong. This would define X to be childless if X is a person and there is some Y such that X is ~parent(X,Y) is true. But we really want to say that ~parent(X,Y) holds for all Y. Using the aggregate countofall allows us to express this quantification condition correctly.

3.6 Example - Blocks World

Once again, consider the Blocks World introduced in Chapter 2. The Blocks World scene described earlier is repeated below.

As before, we adopt a vocabulary with five symbols to denote the five blocks in the scene - a, b, c, d, and e. We use the unary predicate block to state that an object is a block. We use the binary predicate on to express the fact that one block is directly on another. We use above to say that a block is somewhere above another block. We use the unary predicate cluttered to a block has other blocks on top of it, and we use the unary predicate clear to say that a block has nothing on top of it. We use the unary predicate supported to say that a block is resting on another block, and we use the unary predicate table to say that a block is resting on the table.

Given this vocabulary, we can describe the scene in Figure 1 by writing ground atomic sentences that state which relations hold of which objects or groups of objects. Let's start with block. There are five blocks in this case, named a, b, c, d, and e.

Some of these blocks are on top of each other, and some are not. The following sentences capture the relationships in Figure 1.

We can do the same for the other relations. However, there is an easier way. Each of the remaining relations can be defined in terms of block and on. These definitions together with our facts about the block and on relations logically entail every other ground relational sentence or its negation. Hence, given these definitions, we do not need to write out any additional data.

A block satisfies the cluttered relation if and only if there is a block resting on it. A block satisfies the clear relation if and only if there is nothing on it.

A block satisfies the supported relation if and only if it is resting on some block. A block satisfies the table relation if and only if it is not resting on some block.

Three blocks satisfy the stack relation if and only if the first is on the second and the second is on the third.

The above relation is a bit tricky to define correctly. One block is above another block if and only if the first block is on the second block or it it is on another block that is above the second block. Given a complete definition for the on relation, these two rules determine a unique above relation.

One advantage to defining relations in terms of other relations is economy. If we record on information for every object and encode the relationship between the on relation and these other relations, there is no need to record any ground relational sentences for those relations.

Another advantage is that these general sentences apply to Blocks World scenes other than the one pictured here. It is possible to create a Blocks World scene in which none of the on sentences we have listed is true, but these general definitions are still correct.

Exercises

Exercise 3.1: Say whether each of the following expressions is a syntactically legal view definition.

Exercise 3.2: What is v(r,Δ) where r is r(X,Y) :- p(X,Y) & p(Y,X) and Δ is the dataset shown below?

Exercise 3.3: What is C(Ω,Δ) where Ω is {r(X,Z) :- p(X,Z), r(X,Z) :- r(X,Y) & r(Y,Z)} and Δ is the dataset shown below?

Exercise 3.4: Two people are siblings if and only if they share a common parent. Write rules defining the binary sibling relation in terms of the parent relation. (Hint: you will need the built-in relation distinct to get the definition of sibling correct.)

Exercise 3.5: Write rules defining the binary uncle relation and the binary aunt relation in terms of parent and male and female.