Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. PROV-DM is the conceptual data model that forms a basis for the W3C provenance (PROV) family of specifications. PROV-DM distinguishes core structures, forming the essence of provenance information, from extended structures catering for more specific uses of provenance. PROV-DM is organized in six components, respectively dealing with: (1) entities and activities, and the time at which they were created, used, or ended; (2) derivations of entities from entities; (3) agents bearing responsibility for entities that were generated and activities that happened; (4) a notion of bundle, a mechanism to support provenance of provenance; (5) properties to link entities that refer to the same thing; and, (6) collections forming a logical structure for its members.
This document introduces the provenance concepts found in PROV and defines PROV-DM types and relations. The PROV data model is domain-agnostic, but is equipped with extensibility points allowing domain-specific information to be included.
Two further documents complete the specification of PROV-DM. First, a companion document specifies the set of constraints that provenance should follow. Second, a separate document describes a provenance notation for expressing instances of provenance for human consumption; this notation is used in examples in this document.
This is the fifth public release of the PROV-DM document. This is a Last Call Working Draft. The design is not expected to change significantly going forward, and now is the key time for external review.
This specification identifies one feature at risk: Mention (Section 5.5.3) might be removed from PROV if implementation experience reveals problems with supporting this construct.
The attribute prov:value is an OPTIONAL attribute of entity. The value associated with the attribute prov:value MUST be a PROV-DM Value. The attribute prov:value MAY occur at most once in a set of attribute-value pairs.
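The cardinality rule above can be sketched as a small check. This is a minimal illustration in Python, not part of PROV-DM: the function name check_value_attribute and the representation of attribute-value pairs as a list of tuples are assumptions made for this example, and the check does not verify that the value is a PROV-DM Value.

```python
from collections import Counter

PROV_VALUE = "prov:value"


def check_value_attribute(attributes):
    """Return True if prov:value occurs at most once.

    `attributes` is a list of (name, value) tuples; this representation
    is an assumption of this sketch, not something PROV-DM prescribes.
    """
    counts = Counter(name for name, _ in attributes)
    # prov:value is OPTIONAL, so zero occurrences is also acceptable.
    return counts[PROV_VALUE] <= 1


# Hypothetical attribute sets used only for illustration.
one_value = [("prov:value", "abcd"), ("prov:type", "string")]
no_value = [("prov:type", "string")]
two_values = [("prov:value", 4), ("prov:value", 5)]

print(check_value_attribute(one_value))
print(check_value_attribute(no_value))
print(check_value_attribute(two_values))
```

Under these assumptions, the first two attribute sets satisfy the rule and the third does not, because it carries prov:value twice.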
The following example illustrates the provenance of the number 4 obtained by an activity that computed the length of an input string "abcd". The input and the output are expressed as entities ex:in and ex:out, respectively. They each have a prov:value attribute associated with the corresponding value.
entity(ex:in, [ prov:value="abcd" ])
entity(ex:out, [ prov:value=4 ])
activity(ex:len, [ prov:type="string-length" ])
used(ex:len, ex:in)
wasGeneratedBy(ex:out, ex:len)
wasDerivedFrom(ex:out, ex:in)
Two different entities MAY have the same value for the attribute prov:value. For instance, two entities with the same prov:value may be generated by two different activities, as illustrated by the following example.
The earlier example illustrates an entity with the value 4. This example shows that another entity with the same value may be computed differently (by an addition).
entity(ex:in1, [ prov:value=3 ])
entity(ex:in2, [ prov:value=1 ])
entity(ex:out2, [ prov:value=4 ]) // ex:out2 also has value 4
activity(ex:add1, [ prov:type="addition" ])
used(ex:add1, ex:in1)
used(ex:add1, ex:in2)
wasGeneratedBy(ex:out2, ex:add1)
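The point that equal values do not imply the same entity can be sketched as follows. This is an illustrative Python model only: the dictionaries entities and generated_by and the helper entities_with_value are hypothetical names introduced here, populated with the identifiers from the two examples in this section.

```python
# Entities keyed by identifier, each with its attribute-value pairs.
# Only the two output entities from the examples are modelled.
entities = {
    "ex:out": {"prov:value": 4},
    "ex:out2": {"prov:value": 4},
}

# Generation relations: entity identifier -> generating activity.
generated_by = {
    "ex:out": "ex:len",
    "ex:out2": "ex:add1",
}


def entities_with_value(entity_table, value):
    """Return the sorted identifiers of entities whose prov:value equals `value`."""
    return sorted(
        identifier
        for identifier, attributes in entity_table.items()
        if attributes.get("prov:value") == value
    )


same_value = entities_with_value(entities, 4)
print(same_value)
print([generated_by[identifier] for identifier in same_value])
```

In this model, both ex:out and ex:out2 carry the value 4, yet they remain distinct entities with distinct generating activities (ex:len and ex:add1).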