In this section we give a translation from valid PROV instances to
structures, and show that a valid PROV instance has a model. We
call this property weak completeness.
First, without loss of generality, we assume that all times
specified in activity or event formulas in are ground values.
If not, set each variable in such a position to some dummy value.
This is justified by the following fact:
6.2.1 Sets
The sets of structure are:
In the definitions of , , and
we use the notation to indicate
that must have type in according to the typing
constraints. For example, for entities, this means that the set
contains all identifiers appearing in the ,
, or formulas, as well as all tose
appearing in the appropriate positions of other formulas, as
specified in the typing constraints.
In the definitions of , , , and we
write , , and respectively to indicate
additional activities, generations and usages added for imprecise
derivations or entities.
In addition, to define the set of , we introduce an
equivalence relation on as follows:
The fact that this is an equivalence relation follows from the
fact that is in normal form, since the constraints on
ensure that it is an equivalence relation. Recall
that given an equivalence relation on some set , the
equivalence class of is the set . The quotient of by an equivalence
relation on is the set of equivalence classes, . Now we
define the set of as the quotient of -equivalence classes of .
Observe that since is normalized and valid, entities and
activities are disjoint, the influences are disjoint from entities,
activities, and agents,
and the different subsets of events and influences are pairwise
disjoint, as required.
6.2.2 Functions
First, we consider the functions associated with .
Above, we introduce a fresh attribute name , not already in
use in , along with a fresh value and for each entity we
add a value to . This
construction ensures that if an entity is a specialization of another
in then the specialization relationship will hold in . We
also define the set of all events involved in as the set of events
immediately involved in or any specialization of . Similarly,
the values of attributes of are those immediately declared for
along with those of any that specializes. We also introduce dummy
generation and invalidation events for each entity , along with
activities to perform them.
Similarly, for , we
employ an auxiliary function that collects the set of all
events in which one of the entities constituting the thing participated.
The functions , and mapping activities to
their start and end times are defined as follows:
The start and end times are arbitrary (say, some zero value) for activities with no
formula declaring the times. The above definitions of and
ignore any start times asserted in or
formulas. If both and
statements are present, then they must
match, but PROV-CONSTRAINTS does not require that the times of
multiple start or end events match for an activity with no
statement.
For other besides and , the
associated sets of are defined to be empty. (An that
happens to be an or will have the set of events
defined above for the appropriate kind of object. Note that since
and are disjoint, this definition is unambiguous.)
The function mapping to their is defined
as follows:
This definition is deterministic because the sets of identifiers
of different are disjoint, and the associated times are unique.
The functions giving the interpretations of the
different identified influences are as follows:
Note that since is normalized and valid, by the uniqueness
constraints these functions are all well-defined. In the case for
imprecise derivations, we generate additional activities, generations
and usages linking to .
The definition of the function is more involved, and
is as follows:
This definition ensures that by construction
contains all of the other associated relationships. For any specific
, however, most of the above sets will be empty, and the final
line will often be redundant. It is not always redundant, because it
is possible to assert an unspecified influence in .
It is straightforward to verify (by their definitions) that the
event sets associated with entities and activities satisfy the
side-conditions in Component 9.
Finally, the collection membership function is
defined as follows:
6.2.5 Main results
The main results of this section are that if a valid PROV instance
has a model that
satisfies all of the inferences and constraints. Thus, a form of
completeness holds: every valid PROV instance has a model.
Suppose is a valid PROV instance. Then there exists a PROV structure such that .
First, we consider the case where itself is a valid,
normalized PROV instance , with no existential variables, and let
be the corresponding structure. Then is a PROV structure,
satisfying all of the axioms (and hence all of the inferences and
constraints) stated above.
Moreover, , as can be verified on a
case-by-case basis for each type of formula by considering its
semantics and the definition of the construction of . Most
cases are straightforward; we consider the cases of
and since they are among the most
interesting.
- Suppose . We wish to show that
. Since there are no
existential variables in , we know that . Moreover, according to the
equivalence relation defined above, and so , so we can conclude
that .
- Suppose . We wish to show that
. Again, clearly
, and since satisfies all inferences,
we know that so clearly
as argued above. Next,
because and all that are specializations of are also
specializations of .
Furthermore, for each ,
for the same reason. Finally, by construction and so the
inclusion is strict for the special attribute . Thus, we
have verified all of the conditions necessary to conclude .
Next, we show how to handle a normalized, valid contains
existential variables . Choose fresh constants of
appropriate types for the existential variables and define . Then by the above argument.
Moreover, . So is itself the desired
model.
Finally, to handle the case where is an arbitrary valid instance, we need to show that if is not in normal
form, and normalizes to some such that , then . We can prove
this by induction on the length of the sequence of normalization
steps. The base case, when , is established already.
Suppose normalizes in steps and we can perform one
normalization step on it to obtain , which normalizes to
in steps. By induction, we know that . For
each possible normalization step, we must show that if then .
First consider inference steps. These add information, that
is, . Hence it is immediate that
since every formula in is in , and all formulas of
are satisfied in .
Next consider uniqueness constraint steps, which may involve
merging formulas. That is,
and , where is a unifying substitution making for each . Since ,
we must have for some , and therefore
we must also have that and . We can extend to a valuation such that
where . Also,
and . Moreover, since is a unifier, we also have . Finally, since we can always remove attributes from an
atomic formula without damaging its satisfiability, we can
conclude that . To conclude, we have
shown , that is, , as
desired.