PROV-DM allows for multiple descriptions of entities (and in general any identifiable object) to be expressed.
Let us consider two statements about the same entity, which we have taken from two different contexts. A working draft published by the w3:Consortium:
entity(tr:WD-prov-dm-20111215, [ prov:type='pr:RecsWD' ])
The second version of a document edited by some authors:
entity(tr:WD-prov-dm-20111215, [ prov:type="document", ex:version="2" ])
Both statements are about the same entity, identified by tr:WD-prov-dm-20111215, but they contain different attributes, each describing the situation or partial state of this entity according to the context in which the statement occurs.
Two different statements about the same entity cannot co-exist in a PROV instance, as formalized in the constraint unique-entity.
In some cases, there may be a requirement for two different statements concerning the same entity to be included in the same provenance instance. To satisfy the constraint unique-entity, we can adopt a different identifier for one of them, and relate the two statements with the alternateOf relation.
We now reconsider the same two statements about the same entity, but change the identifier of one of them:
entity(tr:WD-prov-dm-20111215, [ prov:type='pr:RecsWD' ])
entity(ex:alternate-20111215, [ prov:type="document", ex:version="2" ])
alternateOf(tr:WD-prov-dm-20111215,ex:alternate-20111215)
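The re-identification step can be sketched in Python. This is only an illustration: the tuple representation of statements and the helper name `separate` are assumptions of the sketch, not part of PROV-DM.

```python
# Sketch only: an entity statement is modelled as (identifier, attributes),
# with attributes held in a frozenset of (name, value) pairs. This encoding
# and the helper below are hypothetical, not defined by PROV-DM.

def separate(first, second, new_id):
    """Re-identify `second` and relate it to `first` with alternateOf.

    Both statements must share an identifier on input. The result is the
    list of entity statements (now with distinct identifiers) and the list
    of alternateOf relations linking them.
    """
    if first[0] != second[0]:
        raise ValueError("statements do not share an identifier")
    entities = [first, (new_id, second[1])]
    relations = [("alternateOf", first[0], new_id)]
    return entities, relations


working_draft = ("tr:WD-prov-dm-20111215",
                 frozenset({("prov:type", "pr:RecsWD")}))
document_v2 = ("tr:WD-prov-dm-20111215",
               frozenset({("prov:type", "document"), ("ex:version", "2")}))

entities, relations = separate(working_draft, document_v2,
                               "ex:alternate-20111215")
```

After the call, each identifier carries a single entity statement, and the alternateOf relation records that both identifiers refer to the same thing.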
The union of two accounts is another account, formed by the union of the statements they respectively contain. We note that the resulting union may or may not violate some constraints.
How to reconcile such accounts is beyond the scope of this specification.
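As a rough illustration, the union of accounts can be sketched as a set union over statements, reusing the two entity statements discussed earlier. Modelling an account as a Python set, and the function names below, are assumptions made for the sketch.

```python
# Sketch only: an account is modelled as a set of statement tuples.
# The encoding and the function names are hypothetical.

def union(account_a, account_b):
    """Union of two accounts: every statement found in either of them."""
    return account_a | account_b


def satisfies_unique_entity(account):
    """True if no identifier carries two different entity statements."""
    variants = {}
    for statement in account:
        if statement[0] == "entity":
            _, ident, attrs = statement
            variants.setdefault(ident, set()).add(attrs)
    return all(len(v) == 1 for v in variants.values())


publisher_view = {
    ("entity", "tr:WD-prov-dm-20111215",
     frozenset({("prov:type", "pr:RecsWD")})),
}
author_view = {
    ("entity", "tr:WD-prov-dm-20111215",
     frozenset({("prov:type", "document"), ("ex:version", "2")})),
}

merged = union(publisher_view, author_view)
```

Each account satisfies the check on its own, while `merged` does not, which is one way a union can violate a constraint that its parts respect.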
In the following statements, a workflow execution a0 consists of two sub-workflow executions a1 and a2. Sub-workflow execution a2 generates entity e, and so does a0.
activity(a0, [prov:type="workflow execution"])
activity(a1, [prov:type="workflow execution"])
activity(a2, [prov:type="workflow execution"])
wasInformedBy(a2,a1)
wasGeneratedBy(e,a0)
wasGeneratedBy(e,a2)
So, we have two different generations for entity e. Such an example is permitted in PROV-DM if the two activities denoted by a0 and a2 are a single thing happening in the world but described from different perspectives.
While this example is permitted in PROV-DM, it does not make the inter-relation between activities explicit, and it mixes statements expressed from different perspectives. While this may be acceptable in some specific applications, it becomes challenging for interoperability. Indeed, PROV-DM does not offer any relation describing the structure of activities. Such instances are said not to be structurally well-formed.
Structurally well-formed provenance can be obtained by partitioning the generations into different accounts. This makes it clear that these generations provide alternative descriptions of the same real-world generation event, rather than describing two distinct generation events for the same entity.
The same example is now revisited, with the following statements that are structurally well-formed. Two accounts are introduced, and there is a single generation for entity e per account.
In a first account, entitled "summary", we find:
activity(a0,t1,t2,[prov:type="workflow execution"])
wasGeneratedBy(e,a0,-)
In a second account, entitled "detail", we find:
activity(a1,t1,t3,[prov:type="workflow execution"])
activity(a2,t3,t2,[prov:type="workflow execution"])
wasInformedBy(a2,a1)
wasGeneratedBy(e,a2,-)
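The effect of the partitioning can be checked with a small Python sketch that counts the generating activities of each entity. It is an informal approximation of the idea rather than the formal constraint; the tuple encoding and function names are assumptions, and times and attributes are omitted for brevity.

```python
from collections import defaultdict

# Sketch only: statements are encoded as tuples whose first element names
# the PROV-DM statement kind. The encoding is hypothetical.

def generating_activities(statements):
    """Map each entity to the set of activities stated to generate it."""
    generators = defaultdict(set)
    for kind, *args in statements:
        if kind == "wasGeneratedBy":
            entity, activity = args[0], args[1]
            generators[entity].add(activity)
    return dict(generators)


def has_unique_generations(statements):
    """True if every generated entity has exactly one generating activity."""
    return all(len(acts) == 1
               for acts in generating_activities(statements).values())


summary = [
    ("activity", "a0"),
    ("wasGeneratedBy", "e", "a0"),
]
detail = [
    ("activity", "a1"),
    ("activity", "a2"),
    ("wasInformedBy", "a2", "a1"),
    ("wasGeneratedBy", "e", "a2"),
]
# The earlier, unpartitioned example corresponds to mixing both accounts.
mixed = summary + detail
```

Under this check, `summary` and `detail` each have a single generation for e, whereas `mixed` has two (by a0 and by a2).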
Structurally well-formed provenance satisfies some constraints, which force the structure of statements to be exposed by means of accounts. With these constraints satisfied, further inferences can be made about structurally well-formed statements. The uniqueness of generations in accounts is formulated as follows.