PROV-DM, the PROV data model, is a data model for provenance that describes the entities, people and activities involved in producing a piece of data or thing. PROV-DM is structured in six components, dealing with: (1) entities and activities, and the time at which they were created, used, or ended; (2) agents bearing responsibility for entities that were generated and activities that happened; (3) derivations of entities from entities; (4) properties to link entities that refer to the same thing; (5) collections forming a logical structure for its members; (6) a simple annotation mechanism.
This document introduces the provenance concepts found in PROV and defines PROV-DM types and relations. The PROV data model is domain-agnostic, but is equipped with extensibility points allowing domain-specific information to be included.
Two further documents complete the specification of PROV-DM. First, a companion document specifies the set of constraints that provenance descriptions should follow. Second, a separate document describes a provenance notation for expressing instances of provenance for human consumption; this notation is used in examples in this document.
This is the fourth public release of the PROV-DM document. Following feedback, the Working Group has decided to reorganize this document substantially, separating the data model from its constraints and the notation used to illustrate it. The PROV-DM release is synchronized with the release of the PROV-O, PROV-PRIMER, PROV-N, and PROV-CONSTRAINTS documents. We are now clarifying the entry path to the PROV family of specifications.
The following example contains the description of an activity a1 (a discussion), which was started at a specific time, and was triggered by an email message e1.
entity(e1, [ prov:type="email message" ])
activity(a1, [ prov:type="Discuss" ])
wasStartedBy(a1, e1, -, 2011-11-16T16:05:00)
Furthermore, if the message is also an input to the activity, this can be described as follows:
used(a1, e1, -)
Alternatively, one can also describe the activity that generated the email message.
activity(a0, [ prov:type="Write" ])
wasGeneratedBy(e1, a0)
wasStartedBy(a1, e1, a0, 2011-11-16T16:05:00)
In the following example, a race is started by a bang, and responsibility for this trigger is attributed to an agent ex:Bob.
activity(ex:foot_race)
wasStartedBy(ex:foot_race, ex:bang, -, 2012-03-09T08:05:08-05:00)
entity(ex:bang)
agent(ex:Bob)
wasAttributedTo(ex:bang, ex:Bob)
In this example, filling fuel was started as a consequence of observing the low fuel. The trigger entity is unspecified; it could, for instance, have been the low fuel warning light, the fuel tank indicator needle position, or the engine not running properly.
activity(ex:filling-fuel)
activity(ex:observing-low-fuel)
agent(ex:driver, [ prov:type="prov:Person" %% xsd:QName ])
wasAssociatedWith(ex:filling-fuel, ex:driver)
wasAssociatedWith(ex:observing-low-fuel, ex:driver)
wasStartedBy(ex:filling-fuel, -, ex:observing-low-fuel, -)
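The start examples above differ only in which positions are filled in: the trigger entity, the starter activity, and the time may each be replaced by the marker `-` when unspecified. The following Python sketch illustrates that convention. It is not part of PROV-DM and is not a conforming serializer; the function name `was_started_by` and its parameter names are hypothetical, chosen for this illustration only.

```python
from typing import Optional

# Marker used in the examples for an unspecified positional argument.
MARKER = "-"


def was_started_by(activity: str,
                   trigger: Optional[str] = None,
                   starter: Optional[str] = None,
                   time: Optional[str] = None) -> str:
    """Render a start statement in the notation used by the examples above.

    Any of trigger, starter, or time left as None is written as the marker.
    """
    optional_args = [MARKER if arg is None else arg
                     for arg in (trigger, starter, time)]
    return "wasStartedBy(" + ", ".join([activity] + optional_args) + ")"


# Trigger and time known, starter activity unspecified.
s1 = was_started_by("a1", trigger="e1", time="2011-11-16T16:05:00")

# Starter activity known, trigger entity and time unspecified.
s2 = was_started_by("ex:filling-fuel", starter="ex:observing-low-fuel")

print(s1)  # wasStartedBy(a1, e1, -, 2011-11-16T16:05:00)
print(s2)  # wasStartedBy(ex:filling-fuel, -, ex:observing-low-fuel, -)
```

Keeping every position explicit, rather than dropping unspecified arguments, means that the role of each argument can be read from its position alone.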
The relations wasStartedBy and used are orthogonal, and thus need to be expressed independently, according to the situation being described.
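One way to picture this independence is a store that records the two relations separately and never derives one from the other. The Python sketch below is illustrative only; the class `Descriptions` and its method names are hypothetical and do not correspond to any PROV-DM construct or existing library.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class Descriptions:
    """Hypothetical container holding two independent sets of statements."""
    # Pairs (activity, trigger entity) from wasStartedBy statements.
    started_by: Set[Tuple[str, str]] = field(default_factory=set)
    # Pairs (activity, entity) from used statements.
    used: Set[Tuple[str, str]] = field(default_factory=set)

    def is_trigger(self, activity: str, entity: str) -> bool:
        return (activity, entity) in self.started_by

    def is_input(self, activity: str, entity: str) -> bool:
        # Answered from used statements only; a start is never treated as a usage.
        return (activity, entity) in self.used


d = Descriptions()

# The email message e1 both triggers the discussion a1 and is an input to it,
# so both statements are recorded.
d.started_by.add(("a1", "e1"))
d.used.add(("a1", "e1"))

# The bang triggers the race; no usage is described for it.
d.started_by.add(("ex:foot_race", "ex:bang"))

print(d.is_trigger("a1", "e1"), d.is_input("a1", "e1"))
print(d.is_trigger("ex:foot_race", "ex:bang"), d.is_input("ex:foot_race", "ex:bang"))
```

In this sketch, the email message is reported as both trigger and input because both statements were recorded, whereas the bang is reported only as a trigger.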
WG membership to be listed here.