The PROV Ontology (also PROV-O) encodes the PROV Data Model [[!PROV-DM]] in the OWL2 Web Ontology Language (OWL2). The PROV ontology consists of a set of classes, properties, and restrictions that can be used to represent provenance information. It can also be specialized to create domain-specific provenance ontologies that model the provenance information specific to different applications. The PROV ontology supports a set of entailments based on OWL2 formal semantics and provenance-specific inference rules. The PROV ontology is available for download as a separate OWL2 document.

Introduction

PROV Ontology (also PROV-O) defines the normative modeling of the PROV Data Model [[PROV-DM]] using the W3C OWL2 Web Ontology Language. This specification describes the set of classes, properties, and restrictions that constitute the PROV ontology and that were introduced in the PROV Data Model [[PROV-DM]]. It provides the foundation for implementing provenance applications in different domains that use the PROV ontology for representing, exchanging, and integrating provenance information. Together with the PROV Access and Query [[PROV-PAQ]] and PROV Data Model [[PROV-DM]], this document forms a framework for provenance information interchange and management in domain-specific Web-based applications.

The PROV ontology classes and properties are defined such that they can be used directly to represent provenance information and can also be specialized for modeling application-specific provenance details in a variety of domains. Thus, the PROV ontology is expected both to be directly usable in applications and to serve as a reference model for the creation of domain-specific provenance ontologies, thereby facilitating interoperable provenance modeling. This document uses an example provenance scenario introduced in the PROV Data Model [[PROV-DM]] to demonstrate the use of PROV-O classes and properties to model provenance information.

Finally, this document describes the formal semantics of the PROV ontology using the OWL2 semantics, [[!OWL2-DIRECT-SEMANTICS]], [[!OWL2-RDF-BASED-SEMANTICS]], and a set of provenance-specific inference rules. These are expected to let provenance implementations automatically check the consistency of provenance information represented using the PROV ontology and make implicit provenance knowledge explicit.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [[!RFC2119]].

Guide to this Document

This document is intended to provide an understanding of the PROV ontology and how it can be used by different applications to represent their provenance information. The intended audience of this document includes users who are new to provenance modeling as well as experienced users who would like their provenance model to be compatible with the PROV ontology to facilitate standardization. This document assumes a basic understanding of the W3C RDF(S) and OWL2 specifications. Readers are referred to the OWL2 and RDF(S) documentation, starting with the [[!OWL2-PRIMER]] and [[!RDF-PRIMER]], for further details about the OWL2 and RDF(S) specifications respectively.

Section 2 describes the mapping of the PROV Data Model [[PROV-DM]] to the PROV ontology. Section 3 introduces the classes and properties of the PROV ontology. Section 4 describes the approach used to specialize the PROV ontology to create a domain-specific ontology for an example provenance scenario introduced in the PROV Data Model [[PROV-DM]]. Section 5 describes the set of provenance entailments supported by the PROV ontology.

PROV Ontology

The PROV Data Model [[PROV-DM]] introduces a minimal set of concepts to represent provenance information in a variety of application domains. This document maps the PROV Data Model to the PROV Ontology using the OWL2 ontology language, which facilitates a fixed interpretation and use of the PROV Data Model concepts based on the formal semantics of OWL2 [[!OWL2-DIRECT-SEMANTICS]] [[!OWL2-RDF-BASED-SEMANTICS]].

The PROV Ontology can be used directly in a domain application, though many domain applications may require specialization of PROV-O classes and properties for representing domain-specific provenance information. We briefly introduce some of the OWL2 modeling terms that will be used to describe the PROV ontology. An OWL2 instance is an individual object in a domain of discourse, for example a person named Alice or a car, and a set of individuals sharing a set of common characteristics is called a class. Person and Car are examples of classes representing the set of individual persons and cars respectively. OWL2 object properties are used to link individuals or classes to one another, and can themselves be arranged in a property hierarchy. For example, the object property "hasOwner" can be used to link a car to a person. OWL2 datatype properties are used to link individuals or classes to data values, including XML Schema datatypes [[!XMLSCHEMA-2]].

The PROV Data Model document [[PROV-DM]] introduces an example provenance scenario describing the creation of a crime statistics file stored on a shared file system and edited by journalists Alice, Bob, Charles, David, and Edith. This scenario is used as a running example in this document to describe the PROV ontology classes and properties, the specialization mechanism, and the entailments supported by the PROV ontology.

Mapping the PROV-DM terms to PROV Ontology

The PROV Data Model [[PROV-DM]] uses an Abstract Syntax Notation (ASN) to describe the set of provenance terms that are used to construct the PROV ontology. There are a number of differences between the PROV-DM ASN and the Semantic Web RDF, RDFS and OWL2 technologies; hence the approach used to model the provenance terms in the PROV ontology differs, partially or significantly, from the PROV-DM approach.

For example, the notion of "expressions" used in the PROV-DM maps to RDF triple assertions in PROV-O. Similarly, the PROV-DM discusses the use of "Qualifier" to assert additional information about provenance terms. Following general knowledge representation practice, and OWL2 ontology practice specifically, the PROV ontology specializes a given provenance term to create either a sub class or a sub property to represent "additionally" qualified terms. Throughout this document, we explicitly state the difference, if any, between the PROV-DM term and the PROV ontology term.

In addition, RDF is strictly monotonic and "...it cannot express closed-world assumptions, local default preferences, and several other commonly used non-monotonic constructs." [[RDF-MT]], but the PROV-DM seems to introduce the notion of non-monotonic assertions through the "Account" construct [[PROV-DM]]. For example, the Account description in PROV-DM states that "It provides a scoping mechanism for expression identifiers and for some constraints (such as generation-unicity and derivation-use)."

OWL2 Syntax Used in this Document

This document uses the RDF/XML syntax, which is the only syntax that all OWL2 tools are required to support [[!OWL2-PRIMER]], to represent the PROV ontology. Provenance assertions using PROV-O can use any of the RDF syntaxes defined in the RDF specification [[!RDF-PRIMER]].

Namespace and OWL2 version

The corresponding OWL2 version of this PROV Ontology is available at [[PROV-Ontology-Namespace]] and as ProvenanceOntology.owl. The namespace for the PROV ontology and all terms defined in this document is http://www.w3.org/ns/prov-o/ [[PROV-Ontology-Namespace]]; in this document it is denoted by the prefix prov.

It has been suggested that [[PROV-DM]] and PROV-O should instead use the namespace http://www.w3.org/ns/prov/ for terms that are common to both models. This is ISSUE-90.

PROV Ontology: Classes and Properties

We now introduce the classes and properties that constitute the PROV ontology. We first give a textual description of each ontology term, followed by OWL2 syntax representing the ontology term and an example use of the class in the provenance scenario.

Classes

The PROV ontology consists of classes that can be organized into a hierarchical structure using the rdfs:subClassOf property.

[Figure: Class hierarchy of the PROV ontology]

Note: CamelBack notation is used for class names

Entity

Class Description

Entity is defined as follows: "An Entity represents an identifiable characterized thing." [[PROV-DM]]

OWL syntax
 prov:Entity rdfs:subClassOf owl:Thing. 
		  
Example

Examples of instances of class Entity from the provenance scenario are the files with identifiers e1 and e2. The RDF/XML syntax for asserting that e1 is an instance of Entity is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#e1">
				    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
				</rdf:Description>
			

Additional assertions can be made about Entity instances to describe further attributes of the entities. Following a common knowledge representation approach, the Entity class can be specialized to create multiple sub classes, using the rdfs:subClassOf property, representing distinct categories of entities using additional characterizing attributes (as defined in the [[PROV-DM]]). The additional attributes SHOULD use an appropriate namespace, and the new sub classes MAY be introduced by application-specific provenance ontologies.

Example
                <rdf:Description rdf:about="http://www.example.com/crimeFile#e2">
                  <rdf:type rdf:resource="http://www.example.com/crime#CrimeFile"/>
                </rdf:Description>
                <rdf:Description rdf:about="http://www.example.com/crime#CrimeFile">
                  <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
                </rdf:Description>
			

ProcessExecution

Class Description

ProcessExecution is defined to be "an identifiable activity, which performs a piece of work." [[PROV-DM]]

OWL syntax
prov:ProcessExecution rdfs:subClassOf owl:Thing.
Example

Example instances of the class ProcessExecution (from the provenance scenario) are "file creation" (pe0) and "file editing" (pe2). The RDF/XML syntax for asserting that pe2 is an instance of ProcessExecution is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
				    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
				</rdf:Description>
			
pe2 is an instance of class :Emailing, which is defined to be a sub-class of class prov:ProcessExecution in the CrimeFile ontology. Hence, standard RDFS entailment allows us to infer that pe2 is also an instance of prov:ProcessExecution.
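The RDFS entailment mentioned above can be illustrated with a minimal Python sketch. It is not part of the PROV ontology: the triple representation, the function name infer_types, and the cf: prefix are assumptions made for illustration, and only the rdfs:subClassOf rules (rdfs9 and rdfs11) are covered.

```python
# Triples are (subject, predicate, object) tuples; the prefixed names are
# illustrative stand-ins for full URIs (the cf: prefix is an assumption).
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Return the rdf:type triples entailed by rdfs:subClassOf but not asserted."""
    triples = set(triples)

    # Index the asserted class hierarchy: class -> direct superclasses.
    superclasses = {}
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            superclasses.setdefault(s, set()).add(o)

    inferred = set()
    for s, p, o in triples:
        if p != RDF_TYPE:
            continue
        # Walk up the hierarchy from the asserted class, guarding against cycles.
        pending = list(superclasses.get(o, ()))
        seen = set()
        while pending:
            cls = pending.pop()
            if cls in seen:
                continue
            seen.add(cls)
            inferred.add((s, RDF_TYPE, cls))
            pending.extend(superclasses.get(cls, ()))
    return inferred - triples


graph = {
    ("cf:pe2", RDF_TYPE, "cf:Emailing"),
    ("cf:Emailing", SUBCLASS_OF, "prov:ProcessExecution"),
}
entailed = infer_types(graph)
```

Under these assumptions, entailed contains the single triple stating that cf:pe2 has type prov:ProcessExecution. An RDFS or OWL2 reasoner would normally perform this step; the sketch only makes the rule explicit.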

Agent

Class Description

Agent is defined to be a "characterized entity capable of activity" [[PROV-DM]]

OWL syntax
prov:Agent rdfs:subClassOf prov:Entity.
Example

Examples of instances of class Agent from the provenance scenario are Alice and Edith. The RDF/XML syntax for asserting that Alice is an instance of Agent is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#Alice">
				    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Agent"/>
				</rdf:Description>
			
As in the example for Entity, both Alice and Edith are instances of class Journalist, which is defined to be a sub-class of class Agent in the CrimeFile ontology. Hence, standard RDFS entailment allows us to infer that both Alice and Edith are also instances of Agent.

TemporalEntity

Class Description

TemporalEntity represents temporal information about entities in the provenance model. This class has been re-used from the OWL Time ontology [[!OWL-TIME]]. The PROV ontology also models the two sub classes of TemporalEntity, namely Instant and Interval.

The Instant class represents "point-like" temporal information that has "no interior points" [[!OWL-TIME]]. The Interval class represents temporal information that has a non-zero duration [[!OWL-TIME]].

OWL syntax
time:TemporalEntity rdfs:subClassOf owl:Thing.
Example

Examples of instances of class TemporalEntity from the provenance scenario are t and t+1. t+1 is associated with the instance of ProcessExecution pe2. The instances of TemporalEntity are linked to instances of the Entity or ProcessExecution classes by the hadTemporalValue property that is described later in this document.

The RDF/XML syntax for asserting that t+1 is an instance of class TemporalEntity and that t+1 is associated with pe2 is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
				  <prov:hadTemporalValue>
				    <rdf:Description rdf:about="http://www.example.com/crimeFile#t+1">
				      <rdf:type rdf:resource="http://www.w3.org/2006/time#TemporalEntity"/>
				    </rdf:Description>
				  </prov:hadTemporalValue>
				</rdf:Description>
			

ProvenanceContainer

Class Description

ProvenanceContainer is defined to be an aggregation of provenance assertions. A provenance container should have a URI associated with it. The ProvenanceContainer class can also be used to model the PROV-DM concept of Account.

OWL syntax
prov:ProvenanceContainer rdfs:subClassOf owl:Thing.

An example of an instance of class ProvenanceContainer is an RDF graph containing a set of assertions describing the provenance of a car, such as its manufacturer, date of manufacture, and place of manufacture.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#ProvenanceContainer1">
				    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProvenanceContainer"/>
				    <cf:contains rdf:resource="http://www.example.com/crimeFile#Statement1"/>
				    <cf:contains rdf:resource="http://www.example.com/crimeFile#Statement2"/>
				    <cf:assertedBy rdf:resource="http://www.example.com/crimeFile#Alice"/>
				</rdf:Description>				
			
According to the definitions of ProvenanceContainer and Account, both contain a set of provenance assertions and have an identifier. Hence, ProvenanceContainer class can also be used to create instances of accounts.
Scope and Identifiers. This is ISSUE-81.
Modeling ProvenanceContainer and Account as RDF Graph

If an RDF graph contains a set of RDF assertions, then (a) if an explicit asserter is associated with the RDF graph, it corresponds to the term "Account" in PROV-DM, and (b) if an asserter is not associated with the RDF graph, it corresponds to the term "ProvenanceContainer" in PROV-DM.
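The distinction drawn above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cf:assertedBy predicate is borrowed from the ProvenanceContainer example, while the function name classify_graph and the tuple-based triple representation are hypothetical.

```python
# Predicate borrowed from the ProvenanceContainer example; the cf: prefix and
# the tuple representation of triples are illustrative assumptions.
ASSERTED_BY = "cf:assertedBy"


def classify_graph(graph_id, triples):
    """Map an RDF graph to the PROV-DM term it corresponds to.

    A graph with an explicit asserter corresponds to "Account";
    a graph without one corresponds to "ProvenanceContainer".
    """
    has_asserter = any(
        s == graph_id and p == ASSERTED_BY for s, p, _ in triples
    )
    return "Account" if has_asserter else "ProvenanceContainer"


with_asserter = [
    ("cf:ProvenanceContainer1", "cf:contains", "cf:Statement1"),
    ("cf:ProvenanceContainer1", ASSERTED_BY, "cf:Alice"),
]
without_asserter = [
    ("cf:ProvenanceContainer1", "cf:contains", "cf:Statement1"),
]
kind_a = classify_graph("cf:ProvenanceContainer1", with_asserter)
kind_b = classify_graph("cf:ProvenanceContainer1", without_asserter)
```

Here kind_a is "Account" and kind_b is "ProvenanceContainer", mirroring cases (a) and (b).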

Location

Class Description

Location is defined to be "an identifiable geographic place (ISO 19112)." [[PROV-DM]]

OWL syntax
prov:Location rdfs:subClassOf owl:Thing.

An example of an instance of class Location from the provenance scenario is the location of the crime file in the shared directory /share, with file path /share/crime.txt. The RDF/XML syntax for asserting that the location of the crime file is the shared directory is given below.

            <cf:hasLocation>
                <rdf:Description rdf:about="http://www.example.com/crimeFile#sharedDirectoryLocation1">
                    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Location"/>
                    <cf:hasFilePath rdf:datatype="http://www.w3.org/2001/XMLSchema#string">/share/crime.txt</cf:hasFilePath>
                </rdf:Description>
            </cf:hasLocation>
		  
Need to clarify whether "geographic" includes "geospatial"?

QualifiedInvolvement

Class Description

The QualifiedInvolvement class represents an n-ary property to capture qualifying information related to use, generation, control, and participation.

OWL syntax
prov:QualifiedInvolvement rdfs:subClassOf owl:Thing.

Usage

Class Description

The Usage class represents an n-ary property to capture qualifying information related to the use of Entity by ProcessExecution.

OWL syntax
prov:Usage rdfs:subClassOf prov:QualifiedInvolvement.
Example

An example instance of class Usage from the provenance scenario ??? is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#u1">
				    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Usage"/>
				    <prov:hadQualifiedEntity rdf:resource="http://www.example.com/crimeFile#Bob"/>				    
				</rdf:Description>
			  

Participation

Class Description

The Participation class represents an n-ary property to capture qualifying information related to the participation of Entity in ProcessExecution.

OWL syntax
prov:Participation rdfs:subClassOf prov:QualifiedInvolvement.
Example

An example instance of class Participation from the provenance scenario ??? is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#p1">
				    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Participation"/>
				    <prov:hadQualifiedEntity rdf:resource="http://www.example.com/crimeFile#Bob"/>				    
				</rdf:Description>
			  

Control

Class Description

The Control class represents an n-ary property to capture qualifying information related to the control of ProcessExecution by Agent.

OWL syntax
prov:Control rdfs:subClassOf prov:QualifiedInvolvement.
Example

An example instance of class Control from the provenance scenario ??? is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#c1">
				    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Control"/>
				    <prov:hadQualifiedEntity rdf:resource="http://www.example.com/crimeFile#Bob"/>				    
				</rdf:Description>
			    

Generation

Class Description

The Generation class represents an n-ary property to capture qualifying information related to the generation of Entity by ProcessExecution.

OWL syntax
prov:Generation rdfs:subClassOf prov:QualifiedInvolvement.
Example

An example instance of class Generation from the provenance scenario ??? is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#g1">
					    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Generation"/>
					    <prov:hadQualifiedEntity rdf:resource="http://www.example.com/crimeFile#Bob"/>				    
					</rdf:Description>
				  

Recipe

Class Description

Recipe represents the specification of a ProcessExecution. The PROV ontology does not define the different types of recipes that can be created by provenance applications in different domains.

OWL syntax
prov:Recipe rdfs:subClassOf owl:Thing.
Example

An example of recipe from the provenance scenario may be the editing protocol followed by the journalists to edit a news report.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#news_editing">
					    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
					    <prov:hadRecipe rdf:resource="http://www.example.com/crimeFile#NewsReportEditingProtocol"/>				    
					</rdf:Description>
				

Role

Class Description

The Role class models additional information about the Entity or ProcessExecution class with respect to the QualifiedInvolvement class [[PROV-DM]].

OWL syntax
prov:Role rdfs:subClassOf owl:Thing.
Example

Examples of instances of class Role from the provenance scenario are author (for Alice) and save (for pe1). The RDF/XML syntax for asserting that Alice played the role of author in the usage u1 (an instance of class Usage) of file e1 in the activity of adding content is given below.

			<rdf:Description rdf:about="http://www.example.com/crimeFile#u1">
			    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Usage"/>
			    <prov:hadRole rdf:resource="http://www.example.com/crimeFile#author"/>
			</rdf:Description>
		
It is not clear how two roles can be modeled using the QualifiedInvolvement class-based approach, where an Entity plays the role of "author" while the ProcessExecution plays the role of "save" (from the provenance scenario).

Temporary Section for Classes

Temporary section for terms not part of "core" ontology.

Time

Class Description

Time is a subclass of time:Instant from [[!OWL-TIME]], which requires that the time is defined using the time:inXSDDateTime property. Used with startedAt and other subproperties of hadTemporalValue, this class ensures compatibility with the xsd:dateTime literal expressions in the [[PROV-DM]] ASN and other serialisations.

Object Properties

The PROV ontology has the following object properties.

Note: Names of properties start with a verb in lower case followed by word(s) starting with upper case

wasGeneratedBy

The wasGeneratedBy property links the Entity class with the ProcessExecution class.

Note: No arity constraints are assumed between Entity and ProcessExecution

wasGeneratedBy links Entity to ProcessExecution
Example

An example of the wasGeneratedBy property from the provenance scenario is e1 wasGeneratedBy pe0. The RDF/XML syntax for asserting this information is given below.

                <rdf:Description rdf:about="http://www.example.com/crimeFile#e1">
                    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
                    <prov:wasGeneratedBy>
                        <rdf:Description rdf:about="http://www.example.com/crimeFile#pe0">
                            <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
                        </rdf:Description>
                    </prov:wasGeneratedBy>
                </rdf:Description>    
			

wasRevisionOf

The wasRevisionOf property links two instances of the Entity class, where one instance is a revision of the other and an Agent has an explicit role in asserting this information.

Example

An example of the wasRevisionOf property from the provenance scenario is e3 wasRevisionOf e2. The RDF/XML syntax for asserting this information is given below.

                <rdf:Description rdf:about="http://www.example.com/crimeFile#e3">
                    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
                    <prov:wasRevisionOf>
                        <rdf:Description rdf:about="http://www.example.com/crimeFile#e2">
                            <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
                        </rdf:Description>
                    </prov:wasRevisionOf>
                </rdf:Description>    
			
Can instances of Agent be reasoning agents that infer that one Entity instance is a revision of another Entity instance and then assert this information? In other words, is assertion after inference supported by this property?

wasDerivedFrom

The wasDerivedFrom property links two instances of the Entity class, where "some characterized entity is transformed from, created from, or affected by another characterized entity." [[PROV-DM]]

wasDerivedFrom links Entity to Entity
Example

An example of the wasDerivedFrom property from the provenance scenario is e3 wasDerivedFrom e2. The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#e3">
					    <prov:wasDerivedFrom rdf:resource="http://www.example.com/crimeFile#e2"/>
					</rdf:Description>	
				
Should derivation have a time? Which time? This is ISSUE-43.
Should we specifically mention derivation of agents? This is ISSUE-42.

wasEventuallyDerivedFrom

This object property is used to link two instances of the Entity class that "...are not directly used and generated respectively" by a single instance of the ProcessExecution class [[PROV-DM]].

wasEventuallyDerivedFrom links Entity to Entity
Example

An example of the wasEventuallyDerivedFrom property from the provenance scenario is e5 wasEventuallyDerivedFrom e2. The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#e5">
					    <prov:wasEventuallyDerivedFrom rdf:resource="http://www.example.com/crimeFile#e2"/>
					</rdf:Description>	
				
Is the current definition of wasEventuallyDerivedFrom inconsistent with the definition of wasDerivedFrom? This is ISSUE-122 and ISSUE-126.

dependedOn

The dependedOn property links two instances of the Entity class to model the derivation of one instance from another. This is a transitive property; in other words, if an Entity instance a1 dependedOn a2 and a2 dependedOn a3, then a1 dependedOn a3 is also true.

dependedOn links Entity to Entity
Example

An example of the dependedOn property from the provenance scenario is e5 dependedOn e2. The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#e5">
					    <prov:dependedOn rdf:resource="http://www.example.com/crimeFile#e2"/>
					</rdf:Description>	
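The transitivity of dependedOn can be illustrated with a minimal Python sketch. It is an assumption-laden illustration rather than part of the ontology: the function name transitive_closure and the representation of assertions as (subject, object) pairs are hypothetical, and an OWL2 reasoner would normally derive these statements from the property being declared transitive.

```python
def transitive_closure(pairs):
    """Return all (x, z) pairs entailed by treating the relation as transitive."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Join every pair (a, b) with every pair (b, d) until nothing new appears.
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted statements: a1 dependedOn a2, a2 dependedOn a3.
asserted = {("a1", "a2"), ("a2", "a3")}
depended_on = transitive_closure(asserted)
```

With these two asserted pairs, depended_on additionally contains ("a1", "a3"), matching the inference that a1 dependedOn a3.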
				
Is dependedOn a parent property of wasDerivedFrom? This is ISSUE-125.

used

The used property links the ProcessExecution class to the Entity class, where the Entity instance is "consumed" by a ProcessExecution instance.

Note: No arity constraints are assumed between Entity and ProcessExecution

used links ProcessExecution to Entity
Example

An example of the used property from the provenance scenario is pe2 used e2. The RDF/XML syntax for asserting this is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
				  	<prov:used rdf:resource="http://www.example.com/crimeFile#e2"/>
				  </rdf:Description>	
			

hadParticipant

The hadParticipant property links the ProcessExecution class to the Entity class, where the ProcessExecution used the Entity or the Entity wasGeneratedBy the ProcessExecution.

Note: No arity constraints are assumed between Entity and ProcessExecution

hadParticipant links ProcessExecution to Entity
Example

An example of the hadParticipant property from the provenance scenario is pe2 hadParticipant e2. The RDF/XML syntax for asserting this is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
				  	<prov:hadParticipant rdf:resource="http://www.example.com/crimeFile#e2"/>
				</rdf:Description>	
			
Suggested definition for participation. This is ISSUE-49.
The current definition of hadParticipant does not account for the involvement of an Entity in a ProcessExecution where it was neither "used" nor "generated". For example, a witness in a criminal activity.

wasComplementOf

The wasComplementOf property links two sets of assertions about Entity instances, where "it is relationship between two characterized entities asserted to have compatible characterization over some continuous time interval." [[PROV-DM]]

wasComplementOf links Entity to Entity
Should the wasComplementOf property link two instances of ProvenanceContainer (or Account) classes since they are two classes modeling a set of (one or more) provenance assertions?

wasControlledBy

The wasControlledBy property links the ProcessExecution class to the Agent class, where control represents the involvement of the Agent in modifying the characteristics of the instance of the ProcessExecution class [[PROV-DM]].

wasControlledBy links ProcessExecution to Agent
Example

An example of the wasControlledBy property from the provenance scenario is FileAppending (ProcessExecution) wasControlledBy Bob. The RDF/XML syntax for asserting this is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
				  <prov:wasControlledBy>
				    <rdf:Description rdf:about="http://www.example.com/crimeFile#Bob">
				      <rdf:type rdf:resource="http://www.example.com/crime#Journalist"/>
				    </rdf:Description>
				  </prov:wasControlledBy>
				</rdf:Description>	
			

hadRecipe

This property links the ProcessExecution class to the Recipe class, which describes the execution characteristics of the instance of the ProcessExecution class. The recipe might or might not have been followed exactly by the ProcessExecution.

hadRecipe links ProcessExecution to Recipe
Example

An example of the hadRecipe property in the (extended) provenance scenario is that pe1 (an instance of the ProcessExecution class) followed some file appending instructions (instructions1). The RDF/XML syntax for asserting this is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
				  	<prov:hadRecipe rdf:resource="http://www.example.com/crimeFile#instructions1"/>
				</rdf:Description>	
			

wasInformedBy

This object property links two instances of the ProcessExecution class. It is used to express the information that a given process execution used an entity that was generated by another process execution.

wasInformedBy links ProcessExecution to ProcessExecution
Example

An example of the wasInformedBy property from the provenance scenario is pe4 wasInformedBy pe3. The RDF/XML syntax for asserting this is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#pe4">
				  	<prov:wasInformedBy rdf:resource="http://www.example.com/crimeFile#pe3"/>
				</rdf:Description>	
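The reading of wasInformedBy given in this section (one process execution used an entity that another process execution generated) can be sketched in Python as follows. The function name infer_was_informed_by, the pair-based representation, and the entity identifier e4 are hypothetical and chosen only for illustration; the provenance scenario is not claimed to contain exactly these assertions.

```python
def infer_was_informed_by(used, was_generated_by):
    """Derive (consumer, producer) pairs of process executions.

    used:             set of (process_execution, entity) pairs
    was_generated_by: set of (entity, process_execution) pairs
    """
    # Index each entity by the process execution(s) asserted to have generated it.
    generators = {}
    for entity, producer in was_generated_by:
        generators.setdefault(entity, set()).add(producer)

    # A consumer is informed by every producer of an entity it used.
    return {
        (consumer, producer)
        for consumer, entity in used
        for producer in generators.get(entity, ())
    }


# Hypothetical assertions: pe4 used e4, and e4 wasGeneratedBy pe3.
used = {("pe4", "e4")}
was_generated_by = {("e4", "pe3")}
informed = infer_was_informed_by(used, was_generated_by)
```

Under these assumed assertions, informed contains the single pair ("pe4", "pe3"), corresponding to pe4 wasInformedBy pe3.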
			

wasScheduledAfter

This property links two instances of the ProcessExecution class to specify the order of their executions. Specifically, it is used to specify that a given process execution starts after the end of another process execution.

wasScheduledAfter links ProcessExecution to ProcessExecution
Example

An example of the wasScheduledAfter property from the provenance scenario is pe4 wasScheduledAfter pe3. The RDF/XML syntax for asserting this is given below.

				<rdf:Description rdf:about="http://www.example.com/crimeFile#pe4">
				  	<prov:wasScheduledAfter rdf:resource="http://www.example.com/crimeFile#pe3"/>
				</rdf:Description>	
			

hadTemporalValue

This object property links an instance of ProcessExecution or Entity with a time:TemporalEntity from [[!OWL-TIME]], thereby allowing a time value to be associated with instances of the two classes and their subclasses.

hadTemporalValue links ProcessExecution or Entity to time:TemporalEntity
Example

An example of the hadTemporalValue property from the provenance scenario is the association of the time value t+3 with the ProcessExecution instance pe3. The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#pe3">
					  	<prov:hadTemporalValue rdf:resource="http://www.example.com/crimeFile#t+3"/>
					</rdf:Description>	
				

wasAttributedTo

The wasAttributedTo property links an instance of the Entity class to an instance of the Agent class.

wasAttributedTo links Entity to Agent
Example

An example of the wasAttributedTo property, as an addition to the provenance scenario, is the attribution of e3 to David for editing the file (e3 wasAttributedTo David). The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#e3">
						<prov:wasAttributedTo rdf:resource="http://www.example.com/crimeFile#David"/>
					</rdf:Description>	
				

wasQuoteOf

The wasQuoteOf property links an instance of the Entity class to an instance of the Agent class.

wasQuoteOf links Entity to Agent
Example

An example of the wasQuoteOf property, as an addition to the provenance scenario, is e2 quoting Alice, recorded by Bob (e2 wasQuoteOf Alice). The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#e2">
						<prov:wasQuoteOf rdf:resource="http://www.example.com/crimeFile#Alice"/>
					</rdf:Description>	
				

wasSummaryOf

The wasSummaryOf property links two instances of the Entity class.

wasSummaryOf links Entity to Entity
Example

An example of the wasSummaryOf property, as an addition to the provenance scenario, is e3 summarizing some additional statistics (e3 wasSummaryOf statistics). The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#e3">
						<prov:wasSummaryOf rdf:resource="http://www.example.com/crimeFile#statistics"/>
					</rdf:Description>	
				

hadOriginalSource

The hadOriginalSource property links two instances of the Entity class. This property is defined to be a specialization of the wasEventuallyDerivedFrom property.

hadOriginalSource links Entity to Entity
Example

An example of the hadOriginalSource property from the provenance scenario is e6 hadOriginalSource e1. The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#e6">
						<prov:hadOriginalSource rdf:resource="http://www.example.com/crimeFile#e1"/>
					</rdf:Description>	
				

hadQualifiedUsage

The hadQualifiedUsage property links the ProcessExecution class with the Usage class.

Example

An example of the hadQualifiedUsage property from the provenance scenario is pe1 hadQualifiedUsage u1, where the hadRole describes the usage of e1 as a "load". The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
						<prov:hadQualifiedUsage rdf:resource="http://www.example.com/crimeFile#u1"/>
					</rdf:Description>	
				

hadQualifiedParticipation

The hadQualifiedParticipation property links the ProcessExecution class with the Participation class.

Example

An example of the hadQualifiedParticipation property from the provenance scenario is pe1 hadQualifiedParticipation p1, where the hadRole describes the participation of Alice as an "author" in pe1. The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
						<prov:hadQualifiedParticipation rdf:resource="http://www.example.com/crimeFile#p1"/>
					</rdf:Description>	
				

hadQualifiedControl

The hadQualifiedControl property links the ProcessExecution class with the Control class.

Example

An example of the hadQualifiedControl property from the provenance scenario is pe0 hadQualifiedControl c1, where the hadRole describes the control of pe0 by Alice as "creator". The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#pe0">
						<prov:hadQualifiedControl rdf:resource="http://www.example.com/crimeFile#c1"/>
					</rdf:Description>	
				

hadQualifiedGeneration

The hadQualifiedGeneration property links the ProcessExecution class with the Generation class.

Example

An example of the hadQualifiedGeneration property from the provenance scenario is e1 hadQualifiedGeneration g1, where the hadRole describes the generation of e1 by "save". The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#e1">
						<prov:hadQualifiedGeneration rdf:resource="http://www.example.com/crimeFile#g1"/>
					</rdf:Description>	
				

hadQualifiedEntity

The hadQualifiedEntity property links the QualifiedInvolvement class with the Entity class.

Example

An example of the hadQualifiedEntity property from the provenance scenario is u2 hadQualifiedEntity e2, where the hadRole describes the usage of e2 as an "attachment". The RDF/XML syntax for asserting this is given below.

					<rdf:Description rdf:about="http://www.example.com/crimeFile#u2">
						<prov:hadQualifiedEntity rdf:resource="http://www.example.com/crimeFile#e2"/>
					</rdf:Description>	
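
The fragments above can be read together as one description of a qualified usage. The following is a minimal sketch, assuming that hadRole is available in the prov namespace and takes a resource as its value; the role IRI http://www.example.com/crime#load is hypothetical and is used only for illustration.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:prov="http://www.w3.org/ns/prov-o/">

  <!-- The process execution pe1 is linked to the qualified usage u1. -->
  <rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
    <prov:hadQualifiedUsage rdf:resource="http://www.example.com/crimeFile#u1"/>
  </rdf:Description>

  <!-- u1 names the entity that was used and the role it played.
       Assumption: the role IRI below is hypothetical. -->
  <rdf:Description rdf:about="http://www.example.com/crimeFile#u1">
    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Usage"/>
    <prov:hadQualifiedEntity rdf:resource="http://www.example.com/crimeFile#e1"/>
    <prov:hadRole rdf:resource="http://www.example.com/crime#load"/>
  </rdf:Description>
</rdf:RDF>
```

Read this way, the Usage instance carries the details (entity and role) that a plain used statement between pe1 and e1 cannot express.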
				

Holding Section for Properties

Temporary section for terms not part of "core" ontology.

followed

The followed property links two instances of ProcessExecution to model the ordering of the ProcessExecution instances.

followed links ProcessExecution to ProcessExecution
Example

An example of the followed property from the provenance scenario is pe4 followed pe1. The RDF/XML syntax for asserting this is given below.

							<rdf:Description rdf:about="http://www.example.com/crimeFile#pe4">
								<prov:followed rdf:resource="http://www.example.com/crimeFile#pe1"/>
							</rdf:Description>	
						

hadTemporalValue

This property can be considered an abstract property, specialised by startedAt, endedAt, wasGeneratedAt and assumedRoleAt, where the time MUST be specified as a time:Instant or, more specifically, MAY be specified using the PROV-O subclass Time, which mandates the use of the time:inXSDDateTime data property.
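
One way to read "abstract property" is as a common super-property of the four time properties. The following is a minimal sketch under that assumption; the rdfs:subPropertyOf axioms are illustrative and are not copied from the ontology document.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

  <!-- Assumption: each time property specializes hadTemporalValue. -->
  <rdf:Description rdf:about="http://www.w3.org/ns/prov-o/startedAt">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/hadTemporalValue"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.w3.org/ns/prov-o/endedAt">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/hadTemporalValue"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.w3.org/ns/prov-o/wasGeneratedAt">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/hadTemporalValue"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.w3.org/ns/prov-o/assumedRoleAt">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/hadTemporalValue"/>
  </rdf:Description>
</rdf:RDF>
```

Under this reading, and given RDFS entailment, a query for hadTemporalValue would also return times asserted with any of the four specific properties.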

wasAssumedBy

The wasAssumedBy property links an EntityInRole to the Entity that assumes the role, forming a placeholder EntityInRole for use in relations such as used and wasGeneratedBy. wasAssumedBy is a required, functional property of EntityInRole, so an EntityInRole is assumed by one and only one Entity. wasAssumedBy is a subproperty of wasComplementOf.
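
The constraints stated above could be expressed in OWL2 roughly as follows. This is a sketch rather than the normative axioms: the cardinality restriction is one possible encoding of "required", chosen here for illustration.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <owl:ObjectProperty rdf:about="http://www.w3.org/ns/prov-o/wasAssumedBy">
    <!-- Functional: an EntityInRole is assumed by at most one Entity. -->
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/wasComplementOf"/>
  </owl:ObjectProperty>

  <!-- Required: every EntityInRole has exactly one wasAssumedBy value
       (assumed encoding using a cardinality restriction). -->
  <owl:Class rdf:about="http://www.w3.org/ns/prov-o/EntityInRole">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="http://www.w3.org/ns/prov-o/wasAssumedBy"/>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:cardinality>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
```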

startedAt

This object property defines the time when a ProcessExecution started. The time is specified as a time:Instant [[!OWL-TIME]], which MAY be a Time subclass by specifying the time using a time:inXSDDateTime data property.

startedAt links ProcessExecution to Instant
		                    <rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
		                      <prov:startedAt rdf:parseType="Resource">
		                          <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Time"/>
		                          <time:inXSDDateTime>2011-10-20T16:26:45Z</time:inXSDDateTime>
		                      </prov:startedAt>
		                    </rdf:Description>	
		                

endedAt

This object property defines the time when a ProcessExecution ended. The time is specified as a time:Instant [[!OWL-TIME]], which MAY be a Time subclass by specifying the time using a time:inXSDDateTime data property.

endedAt links ProcessExecution to Instant
		                    <rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
		                      <prov:endedAt rdf:parseType="Resource">
		                          <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Time"/>
		                          <time:inXSDDateTime>2011-11-21T18:36:52Z</time:inXSDDateTime>
		                      </prov:endedAt>
		                    </rdf:Description>	
		                

wasGeneratedAt

This object property defines the time when an Entity was generated (as specified using wasGeneratedBy), meaning the instant when the entity first existed (and could be used by other process executions). The time is specified as a time:Instant [[!OWL-TIME]], which MAY be a Time subclass by specifying the time using a time:inXSDDateTime data property.

wasGeneratedAt links Entity to Instant

Note that by constraint

		                    <rdf:Description rdf:about="http://www.example.com/crimeFile#e2">
		                      <prov:wasGeneratedAt rdf:parseType="Resource">
		                          <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Time"/>
		                          <time:inXSDDateTime>2011-10-20T17:14:12Z</time:inXSDDateTime>
		                      </prov:wasGeneratedAt>
		                    </rdf:Description>	
		                

assumedRole

This object property defines which Role has been assumed in an EntityInRole. This property is applied in relations such as used and wasGeneratedBy.

assumedRole links EntityInRole to Role

The definition and interpretation of the Role is outside the scope for PROV-O. The Role class is a placeholder that can be extended and specialized.

		                    <rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
		                      <prov:used>
		                        <rdf:Description rdf:about="http://www.example.com/crimeFile#BobAsAuthor">
		                            <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/EntityInRole"/>
		                            <prov:wasAssumedBy rdf:resource="http://www.example.com/crimeFile#Bob"/>
		                            <prov:assumedRole rdf:resource="http://www.example.com/crime#author"/>
		                        </rdf:Description>
		                      </prov:used>
		                    </rdf:Description>	
		                

The example above corresponds to the PROV-ASN assertion: used(pe1, bob, qualifier(role="author"))

It has been suggested that roles should be represented as classes, allowing hierarchies and composition of roles. OWL2 punning would allow both :entityInRole rdf:type :ExampleRole and :entityInRole prov:assumedRole :ExampleRole.
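
The punning suggestion could look like the following sketch, in which the same IRI is used both as a class and as an individual. The role IRI http://www.example.com/crime#Author is hypothetical, and this illustrates the suggestion only; it is not an adopted design.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:prov="http://www.w3.org/ns/prov-o/">

  <!-- The role is declared as a class (hypothetical IRI). -->
  <owl:Class rdf:about="http://www.example.com/crime#Author"/>

  <rdf:Description rdf:about="http://www.example.com/crimeFile#BobAsAuthor">
    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/EntityInRole"/>
    <!-- Here the IRI is used as a class. -->
    <rdf:type rdf:resource="http://www.example.com/crime#Author"/>
    <!-- Here the same IRI is used as an individual (punning). -->
    <prov:assumedRole rdf:resource="http://www.example.com/crime#Author"/>
    <prov:wasAssumedBy rdf:resource="http://www.example.com/crimeFile#Bob"/>
  </rdf:Description>
</rdf:RDF>
```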

assumedRoleAt

This object property defines the first time an Entity assumed a role, i.e. when the EntityInRole linked to that Entity by wasAssumedBy was active. This is intended to be used together with a used statement to define the instant when an entity was first used. The time is specified as a time:Instant [[!OWL-TIME]], which MAY be a Time subclass by specifying the time using a time:inXSDDateTime data property.

assumedRoleAt links EntityInRole to Instant

According to the constraint generation-unicity from [[PROV-DM]] an entity can only be generated once by a single process execution. This ontology further assumes that all assertions about that generation must have the same start time.

		                    <rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
		                      <prov:used rdf:parseType="Resource">
		                          <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/EntityInRole"/>
		                          <prov:assumedRoleAt rdf:parseType="Resource">
		                            <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Time"/>
		                            <time:inXSDDateTime>2011-10-20T17:14:12Z</time:inXSDDateTime>
		                         </prov:assumedRoleAt>
		                      </prov:used>
		                    </rdf:Description>	
		                

Characteristics of Object Properties

The table below summarizes the characteristics of the object properties that are defined in the OWL schema.

Property          | Functional | Inverse functional | Transitive | Symmetric | Asymmetric | Reflexive | Irreflexive
wasControlledBy   | No         | No                 | ?          | No        | Yes        | No        | Yes
wasDerivedFrom    | No         | No                 | Yes        | No        | Yes        | No        | Yes
hadParticipant    | No         | No                 | ?          | No        | Yes        | No        | Yes
wasGeneratedBy    | Yes        | No                 | ?          | No        | Yes        | No        | Yes
used              | No         | No                 | ?          | No        | Yes        | No        | Yes
wasInformedBy     | No         | No                 | No         | No        | No         | No        | No
wasScheduledAfter | No         | No                 | Yes        | No        | Yes        | No        | Yes

Some of these characteristics may be subject to discussion. In particular, regarding the object properties wasControlledBy, wasGeneratedBy and used, we did not specify whether they are transitive or not. One may argue that, given that an agent can be a process execution, a process execution, e.g. pe1, can be controlled by an agent pe2, which happens to be a process execution that is controlled by an agent ag, and that, therefore, ag (indirectly) controls pe1. The same argument can be applied to wasGeneratedBy and used. That said, we are not convinced that these properties should be declared as transitive. In fact, we are more inclined towards specifying that they are not.
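
In OWL2, each "Yes" in the table corresponds to typing the property with the matching OWL property class. The following sketch encodes the wasGeneratedBy row; it is given only to illustrate the encoding, is not copied from the ontology file, and leaves transitivity (marked "?" in the table) unstated.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- wasGeneratedBy: functional, asymmetric and irreflexive. -->
  <owl:ObjectProperty rdf:about="http://www.w3.org/ns/prov-o/wasGeneratedBy">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#AsymmetricProperty"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#IrreflexiveProperty"/>
  </owl:ObjectProperty>
</rdf:RDF>
```

With these axioms, an OWL2 reasoner would report an inconsistency for a graph that asserts both e1 wasGeneratedBy pe1 and pe1 wasGeneratedBy e1, since the property is asymmetric.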

Annotation Properties

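The PROV ontology uses the OWL2 annotation properties to describe additional information about the PROV ontology classes, properties, individuals, and axioms. OWL2 defines nine annotation properties that are part of the OWL2 structural specification (see the OWL2 Syntax document for additional details [[!OWL2-SYNTAX]]): rdfs:label, rdfs:comment, rdfs:seeAlso, rdfs:isDefinedBy, owl:deprecated, owl:versionInfo, owl:priorVersion, owl:backwardCompatibleWith, and owl:incompatibleWith.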

Additional annotation properties can be defined by provenance ontologies, but unlike the OWL2 annotation properties, these custom annotation properties may not be interpreted in a standard manner across different provenance applications.
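
As an illustration, the sketch below annotates the cf:Journalist class from the crime file scenario with two standard OWL2 annotation properties and with one custom annotation property. The custom property cf:reviewStatus and its value are hypothetical and are introduced only for this example.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:cf="http://www.example.com/crime#">

  <!-- Standard annotation properties, understood by OWL2 tools. -->
  <owl:Class rdf:about="http://www.example.com/crime#Journalist">
    <rdfs:label xml:lang="en">Journalist</rdfs:label>
    <rdfs:comment xml:lang="en">An agent who creates, edits, or shares the crime file.</rdfs:comment>
  </owl:Class>

  <!-- A custom annotation property (hypothetical); its interpretation
       is specific to the application that defines it. -->
  <owl:AnnotationProperty rdf:about="http://www.example.com/crime#reviewStatus"/>
  <rdf:Description rdf:about="http://www.example.com/crime#Journalist">
    <cf:reviewStatus>draft</cf:reviewStatus>
  </rdf:Description>
</rdf:RDF>
```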

Is there a need to define standard provenance-specific annotation properties?

Overview of the ontology

The following diagram illustrates the complete PROV ontology.

Classes and properties of the PROV ontology

Specializing Provenance Ontology for Domain-specific Provenance Applications

The PROV Ontology is conceived as a reference ontology that can be extended by various domain-specific applications to model the required set of provenance terms. The PROV Ontology classes and properties can be specialized using the following two RDFS properties: rdfs:subClassOf, for specializing classes, and rdfs:subPropertyOf, for specializing properties.

To illustrate the specialization mechanism, the PROV Ontology is extended to create an ontology schema for the provenance scenario describing the creation of the crime statistics file.

Modeling the Crime File Scenario

The example scenario can be encoded as a Resource Description Framework (RDF) graph. For example, the RDF/XML below describes the provenance of Entity e2.

                <?xml version="1.0"?>
                <rdf:RDF
                    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
                    xmlns:prov="http://www.w3.org/ns/prov-o/"
                    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
                    xmlns:cf="http://www.example.com/crime#"
                    xmlns:time="http://www.w3.org/2006/time#">

                    <rdf:Description rdf:about="http://www.example.com/crimeFile#e2">
                      <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
                      <rdf:type rdf:resource="http://www.example.com/crime#CrimeFile"/>
                      <prov:wasGeneratedBy>
                      <rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
                          <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
                          <rdf:type rdf:resource="http://www.example.com/crime#FileAppending"/>
                          <prov:wasControlledBy>
                            <rdf:Description rdf:about="http://www.example.com/crimeFile#Bob">
                                <rdf:type rdf:resource="http://www.example.com/crime#Journalist"/>
                            </rdf:Description>
                          </prov:wasControlledBy>                        
                          <prov:startedAt>
                            <rdf:Description rdf:about="http://www.example.com/crimeFile#t1">
                              <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Time"/>
                              <time:inXSDDateTime>2011-10-20T16:26:45Z</time:inXSDDateTime>
                            </rdf:Description>
                          </prov:startedAt>
                          <prov:endedAt>
                            <rdf:Description rdf:about="http://www.example.com/crimeFile#t3">
                              <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Time"/>
                              <time:inXSDDateTime>2011-11-21T18:36:52Z</time:inXSDDateTime>
                            </rdf:Description>
                          </prov:endedAt>
                      </rdf:Description>
                      </prov:wasGeneratedBy>
                      <prov:wasGeneratedAt>
                        <rdf:Description rdf:about="http://www.example.com/crimeFile#t2">
                          <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Time"/>
                          <time:inXSDDateTime>2011-10-20T17:14:12Z</time:inXSDDateTime>
                        </rdf:Description>	
                      </prov:wasGeneratedAt>
                      <prov:wasDerivedFrom rdf:resource="http://www.example.com/crimeFile#e1"/>
                      <cf:hasLocation>
                          <rdf:Description rdf:about="http://www.example.com/crimeFile#sharedDirectoryLocation1">
                              <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Location"/>
                              <cf:hasFilePath rdf:datatype="http://www.w3.org/2001/XMLSchema#string">/share/crime.txt</cf:hasFilePath>
                          </rdf:Description>
                      </cf:hasLocation> 
                      <cf:hasFileContent rdf:datatype="http://www.w3.org/2001/XMLSchema#string">There 
                        was a lot of crime in London last month.</cf:hasFileContent>                 
                     </rdf:Description>
                     <rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
                         <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
                         <prov:used rdf:resource="http://www.example.com/crimeFile#e2"/>
                     </rdf:Description>
                  </rdf:RDF>
			

Specialization of PROV Ontology Classes

The following new classes were created in the CrimeFile Ontology by extending the PROV ontology classes:

cf:Journalist

The cf:Journalist class is a specialization of the PROV ontology Agent class and models all individuals that participate in creating, editing, and sharing the crime file. The following RDF/XML code illustrates how cf:Journalist is asserted to be a specialization of prov:Agent.

			  <rdf:Description rdf:about="http://www.example.com/crime#Journalist">
			    <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/Agent"/>
			  </rdf:Description>
			
cf:CrimeFile

The cf:CrimeFile class is a specialization of the PROV ontology Entity class and models the file describing the crime statistics in the provenance scenario, including the multiple versions of the file. The following RDF/XML code illustrates how cf:CrimeFile is asserted to be a specialization of prov:Entity.

			  <rdf:Description rdf:about="http://www.example.com/crime#CrimeFile">
			    <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
			  </rdf:Description>
			
cf:FileCreation, cf:FileEditing, cf:FileAppending, cf:Emailing, cf:SpellChecking

The classes cf:FileCreation, cf:FileEditing, cf:FileAppending, cf:Emailing, and cf:SpellChecking are specializations of the PROV ontology ProcessExecution class and model the different activities in the provenance scenario. The following RDF/XML code illustrates the specialization of prov:ProcessExecution to define the class cf:FileCreation (the other classes can be defined similarly by using the subClassOf property).

			  <rdf:Description rdf:about="http://www.example.com/crime#FileCreation">
			    <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
			  </rdf:Description>
			

The following diagram illustrates the above class specializations:

New classes (:SpellChecking, :FileEditing, :FileCreation, :FileAppending, :Emailing, :Journalist) extend the classes in the PROV Ontology (prov:Entity, prov:Agent, prov:ProcessExecution).
Example extension of PROV ontology in order to describe the crime file scenario

Specialization of PROV Ontology Properties

The following new object property was created in the CrimeFile Ontology by extending the PROV ontology object property:

cf:hadFilePath

The property cf:hadFilePath is a specialization of the PROV ontology hadLocation object property and links the class CrimeFile to the FileDirectory class. The following RDF/XML code illustrates the use of rdfs:subPropertyOf to create the hadFilePath property.

			  <rdf:Description rdf:about="http://www.example.com/crime#hadFilePath">
			    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/hadLocation"/>
			  </rdf:Description>
		  

The following diagram illustrates the above property specialization:

ext:FileCreation, ext:FileAppending, ext:FileEditing, ext:Emailing, ext:SpellChecking extend prov:ProcessExecution; ext:Journalist extends prov:Agent; ext:CrimeFile extends prov:Entity; ext:hadFilePath extends prov:hadLocation and has range prov:Location.
Example extension of PROV ontology in order to describe the crime file scenario

Modeling an Example Scientific Workflow Scenario

This section describes an example of extending the PROV ontology to create a provenance ontology for scientific workflows.

Scientific workflow systems allow the specification of a pipeline of processes which are linked from outputs to inputs. Such workflow definitions are typically created in a graphical user interface or interactive web application, and can then be enacted using particular inputs or parameters. Scientists in fields like bioinformatics, chemistry and physics use such workflows to perform repeated analysis by connecting together a disparate set of domain-specific tools and services.

Capturing the provenance of executions in such a workflow system will typically include details of each of the process executions, such as its inputs and outputs, start and stop time, and should ultimately be able to describe the complete data lineage through the workflow for any returned output data.

This example is not attempting to be a complete or general ontology for asserting workflow provenance, but highlights how a particular application like a workflow system can express its domain specific attributes based on the PROV ontology.

New classes wf:WorkflowEngine, wf:Process, wf:ValueAtPort, wf:FileValue, and wf:Value extend prov:Agent, prov:ProcessExecution, prov:EntityInRole. New properties wf:wasLaunchedBy, wf:ranInWorkflowEngine, wf:wasSubProcessExecutionOf, wf:wasReadFrom, wf:sawValue extend prov:wasControlledBy, prov:wasDerivedFrom.
Example extension of PROV ontology in order to describe workflow provenance

Workflow extensions to PROV classes

In order to describe workflow executions following the model above, the PROV ontology is extended with workflow-specific subclasses described below:

wf:Process
A subclass of prov:ProcessExecution to signify an execution of a process which wf:wasDefinedBy a wf:ProcessDefinition, e.g. a workflow or a process in a workflow. A workflow process can also act as a prov:Agent when controlling nested process executions.
wf:WorkflowEngine
A subclass of prov:Agent to indicate that a workflow process was controlled by a workflow engine.
wf:Value
A subclass of prov:Entity, representing a value appearing in the workflow execution; it will typically be used or generated by wf:Process executions. The actual value can be provided with a wf:value property.
wf:ValueAtPort
A subclass of wf:Value and prov:EntityInRole, indicating a value while in the role of being used or generated by a wf:Process at a particular wf:Port.
wf:FileValue
A wf:Value which has been read from a file. As a prov:Entity this represents an entity with both attributes wf:value and wf:filename fixed; that is, the entity describes the point when the given file contained the content. As the file might be read a while before the wf:Value is used by a wf:Process, at which point the file content might have changed, those values are declared as being derived from this file value using the wf:wasReadFrom property.
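
The subclass relations described in this list can be sketched in RDF/XML as follows, using the same rdfs:subClassOf pattern as in the crime file example. The wf namespace IRI is the one used in the workflow examples of this section; the axioms are a reading of the descriptions above and are not taken from a published ontology file.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#Process">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#WorkflowEngine">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/Agent"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#Value">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
  </rdf:Description>
  <!-- ValueAtPort specializes both wf:Value and prov:EntityInRole. -->
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#ValueAtPort">
    <rdfs:subClassOf rdf:resource="http://www.example.com/scientific-workflow#Value"/>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/EntityInRole"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#FileValue">
    <rdfs:subClassOf rdf:resource="http://www.example.com/scientific-workflow#Value"/>
  </rdf:Description>
</rdf:RDF>
```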

Workflow extensions to PROV properties

While in most cases subclassing will provide the additional expressiveness the application needs, this example ontology also extends the PROV ontology with more specific subproperties.

wf:wasDefinedBy
This sub-property of prov:hadRecipe links a wf:Process to the defining wf:ProcessDefinition. Thus, if there are multiple executions of the same workflow definition, each of the separate wf:Processes will link to the same definition.
wf:ranInWorkflowEngine
This subproperty of prov:wasControlledBy links a wf:Process to the wf:WorkflowEngine it was executed in. The engine instance might contain additional details such as which version of the workflow system was used.
wf:wasLaunchedBy
This second subproperty of prov:wasControlledBy links a wf:Process to a prov:Agent, indicating which person asked to execute the given wf:ProcessDefinition in the specified wf:WorkflowEngine.
wf:wasSubProcessExecutionOf
This subproperty of prov:wasControlledBy links a wf:Process to another wf:Process, indicating that this is a child execution.
Should there be a general way to state subprocesses? -Stian
wf:wasReadFrom

This subproperty of prov:wasDerivedFrom links a wf:Value to the wf:FileValue it was read from, typically when used as a workflow input. As described for wf:FileValue this distinction is done because at the time the workflow input is used in the workflow, the file input might be different and thus should not be described as an attribute of that wf:Value.

This property hints at a "Read file" process execution which is not described. This is therefore an example of how the provenance asserter is limiting the scope of its provenance. The engine knows that the file was read, but is not able or willing to provide any deeper assertions, because its primary scope is at the level of executing workflow definitions.

wf:sawValue
A subproperty of prov:wasAssumedBy which indicates that a wf:Value was wf:seenAtPort within a wf:ValueAtPort. This ValueAtPort is a complement of the pointed-at Value because one can consider this entity to have the same attributes, but in addition the wf:seenAtPort property is fixed.
wf:seenAtPort
A subproperty of prov:assumedRole (not yet defined in PROV ontology) indicating which wf:Port a wf:ValueAtPort was seen at. Thus one can see at which output port a value was generated, or at which input port(s) it was used. As a functional property this requires a different wf:ValueAtPort for each use and generation of a value. The wf:ValueAtPort is linked to the wf:Value using prov:wasComplementOf.
Need prov:assumedRole in ontology -Stian
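
The subproperty relations listed above can be sketched with rdfs:subPropertyOf in the same way as cf:hadFilePath in the crime file example. The axioms below are a reading of the descriptions in this list, not a published ontology file; the port property is written as wf:seenAtPort, the name used in the workflow run example.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#wasDefinedBy">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/hadRecipe"/>
  </rdf:Description>

  <!-- Three specializations of prov:wasControlledBy. -->
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#ranInWorkflowEngine">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/wasControlledBy"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#wasLaunchedBy">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/wasControlledBy"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#wasSubProcessExecutionOf">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/wasControlledBy"/>
  </rdf:Description>

  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#wasReadFrom">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/wasDerivedFrom"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#sawValue">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/wasAssumedBy"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.com/scientific-workflow#seenAtPort">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/ns/prov-o/assumedRole"/>
  </rdf:Description>
</rdf:RDF>
```

Because of these axioms, an application that only understands PROV terms can still interpret, for example, a wf:wasLaunchedBy statement as a prov:wasControlledBy statement under RDFS entailment.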

Workflow structure

This ontology includes a simple definition language for describing the overall workflow structure. This is not meant as a general workflow definition language, but allows us to describe process executions, use and generation with relation to particular sections of the workflow definition.

wf:ProcessDefinition
A definition of how to execute a process. It will typically refer to a command or service which will be called. Each process definition also declares its ports using wf:definesInput and wf:definesOutput.
wf:Port
A port can be considered as a parameter or return value for a process. These are typically given names which are unique within a process definition. A value is either provided to an input port before execution, or produced from an output port after execution.
wf:linksTo
Ports are connected using links. A link from an output port to an input port means that the value received on that output will be forwarded to the input of the next process. Note that in this simplified ontology links can also go from Input to Input and from Output to Output; these are used to connect workflow ports to processor ports.
wf:Input
An input port for a process will receive a value which will be used by the execution. In a dataflow driven workflow model, a process will execute as soon as all its defined input ports have been provided with values.
wf:Output
A process execution might return multiple outputs, for instance a table and a diagram. Each of these is declared as an output port for that process definition.
wf:definesSubProcess

Scientific workflows can be composed of nested workflows which can be shared and reused as components. Some workflow systems also allow various execution settings on the nested workflow, like looping or parallelisation.

In this case a process definition will use wf:definesSubProcess to indicate its constituent parts, and there will be additional wf:linksTo from the input ports of this process definition to the input ports of some of its nested sub processes, and vice versa for the outputs. The top-level workflow is always such a process definition.

Example workflow

An example workflow with input, three processes, and two outputs.

This example workflow defines a workflow input named input, three processes (String_constant, Concatenate_two_strings and sha1), and two workflow outputs (combined and sha1). When executed, it runs from top to bottom: the provided input is first concatenated with the string constant; the result is returned on the combined output and is also provided to the sha1 process, whose output is given to the other workflow port.

Using the definition ontology above this workflow can be expressed in RDF/XML as:

<rdf:RDF xml:base="http://www.example.com/workflow1#"
    xmlns:impl="http://company.example.org/engine-implementation#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:wf="http://www.example.com/scientific-workflow#">

    <wf:ProcessDefinition rdf:about="#workflow">
        <rdf:type rdf:resource="http://company.example.org/engine-implementation#Workflow"/>
        <wf:definesInput>
            <wf:Input rdf:about="#inName">
                <wf:linksTo rdf:resource="#catIn2" />
            </wf:Input>
        </wf:definesInput>
        <wf:definesOutput rdf:resource="#combined" />
        <wf:definesOutput rdf:resource="#sha1" />
        <wf:definesSubProcess>
            <impl:Constant rdf:about="#String_constant">
                <impl:constant>Hello, </impl:constant>
                <wf:definesOutput>
                    <wf:Output rdf:about="#constantValue">
                        <wf:linksTo rdf:resource="#catIn1"/>
                    </wf:Output>
                </wf:definesOutput>
            </impl:Constant>
        </wf:definesSubProcess>
        <wf:definesSubProcess>
            <impl:Command rdf:about="#cat">
                <impl:command>cat</impl:command>
                <wf:definesInput rdf:resource="#catIn1" />
                <wf:definesInput rdf:resource="#catIn2" />
                <wf:definesOutput>
                    <wf:Output rdf:about="#catOut">
                        <wf:linksTo rdf:resource="#shaIn"/>
                    </wf:Output>
                </wf:definesOutput>
            </impl:Command>
        </wf:definesSubProcess>
        <wf:definesSubProcess>
            <impl:Command rdf:about="#shasum">
                <impl:command>shasum</impl:command>
                <wf:definesInput rdf:resource="#shaIn" />
                <wf:definesOutput>
                    <wf:Output rdf:about="#shaOut">
                        <wf:linksTo rdf:resource="#sha1"/>
                    </wf:Output>
                </wf:definesOutput>
            </impl:Command>
        </wf:definesSubProcess>
    </wf:ProcessDefinition>
</rdf:RDF>            
        

Example workflow run

This example shows how using the workflow extensions together with PROV can provide the provenance of executing the workflow defined above.

<rdf:RDF 
    xmlns:cnt="http://www.w3.org/2011/content#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:prov="http://www.w3.org/ns/prov-o/"
    xmlns:time="http://www.w3.org/2006/time#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:wf="http://www.example.com/scientific-workflow#"
    xmlns:run="http://www.example.com/run1#"
    xml:base="http://www.example.com/run1#"
    >

    <prov:Agent rdf:about="#aUser">
        <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
        <foaf:name>Stian Soiland-Reyes</foaf:name>
    </prov:Agent>

    <wf:WorkflowEngine rdf:about="#workflowEngine" />

    <wf:FileValue rdf:about="#inputFile">
        <wf:file>/tmp/myinput.txt</wf:file>
        <wf:value>
            <cnt:ContentAsText>
                <cnt:characterEncoding>UTF-8</cnt:characterEncoding>
                <cnt:chars>Steve</cnt:chars>
            </cnt:ContentAsText>
        </wf:value>
    </wf:FileValue>

    <wf:Value rdf:about="#input">
        <wf:wasReadFrom rdf:resource="#inputFile"/>
        <wf:value>
            <cnt:ContentAsText>
                <cnt:characterEncoding>UTF-8</cnt:characterEncoding>
                <cnt:chars>Steve</cnt:chars>
            </cnt:ContentAsText>
        </wf:value>
    </wf:Value>

    <wf:Process rdf:about="#workflowRun">
        <prov:used>
            <wf:ValueAtPort>
                <wf:sawValue rdf:resource="#input"/>
                <wf:seenAtPort rdf:resource="http://www.example.com/workflow1#inName"/>
                <prov:assumedRoleAt>
                    <prov:Time>
                        <time:inXSDDateTime>2011-10-21T09:21:31Z</time:inXSDDateTime>
                    </prov:Time>
                </prov:assumedRoleAt>
            </wf:ValueAtPort>
        </prov:used>
        <wf:ranInWorkflowEngine rdf:resource="#workflowEngine"/>
        <wf:wasLaunchedBy rdf:resource="#aUser"/>
        <wf:wasDefinedBy rdf:resource="http://www.example.com/workflow1#workflow"/>
        <prov:startedAt>
            <prov:Time>
                <time:inXSDDateTime>2011-10-21T09:20:15Z</time:inXSDDateTime>
            </prov:Time>
        </prov:startedAt>
        <prov:endedAt>
            <prov:Time>
                <time:inXSDDateTime>2011-10-21T09:23:32Z</time:inXSDDateTime>
            </prov:Time>
        </prov:endedAt>
    </wf:Process>

    <wf:Process rdf:about="#constant">
        <wf:wasSubProcessExecutionOf rdf:resource="#workflowRun"/>
        <wf:wasDefinedBy
        rdf:resource="http://www.example.com/workflow1#String_constant"/>
        <prov:startedAt>
            <prov:Time rdf:about="#t0">
                <time:inXSDDateTime>2011-10-21T09:20:15Z</time:inXSDDateTime>
            </prov:Time>
        </prov:startedAt>
        <prov:endedAt rdf:resource="#t0" />
    </wf:Process>

    <wf:Value rdf:about="#hello">
        <prov:wasGeneratedBy rdf:resource="#constant"/>
        <prov:wasGeneratedAt rdf:resource="#t0"/>
        <prov:endedAt rdf:resource="#t0" />
        <wf:value>
            <cnt:ContentAsText>
                <cnt:chars>Hello, </cnt:chars>
            </cnt:ContentAsText>
        </wf:value>
    </wf:Value>

    <wf:ValueAtPort rdf:about="#helloValue">
        <prov:wasGeneratedBy rdf:resource="#constant"/>
        <wf:value>
            <cnt:ContentAsText>
                <cnt:chars>Hello, </cnt:chars>
            </cnt:ContentAsText>
        </wf:value>
        <wf:sawValue rdf:resource="#hello"/>
    </wf:ValueAtPort>

    <wf:Process rdf:about="#combine">
        <prov:used>
          <wf:ValueAtPort>
            <wf:sawValue rdf:resource="#hello"/>
            <wf:seenAtPort rdf:resource="http://www.example.com/workflow1#catIn1"/>
            <prov:assumedRoleAt>
                <prov:Time>
                    <time:inDateTimeXSD>2011-10-21T09:20:21Z</time:inDateTimeXSD>
                </prov:Time>
            </prov:assumedRoleAt>
          </wf:ValueAtPort>
        </prov:used>
        <prov:used>
          <wf:ValueAtPort>
            <wf:sawValue rdf:resource="#input"/>
            <wf:seenAtPort rdf:resource="http://www.example.com/workflow1#catIn2"/>
            <prov:assumedRoleAt>
                <prov:Time>
                    <time:inDateTimeXSD>2011-10-21T09:20:23Z</time:inDateTimeXSD>
                </prov:Time>
            </prov:assumedRoleAt>
          </wf:ValueAtPort>
        </prov:used>
        <wf:wasSubProcessExecutionOf rdf:resource="#workflowRun"/>
        <wf:wasDefinedBy rdf:resource="http://www.example.com/workflow1#cat"/>
        <prov:startedAt>
            <prov:Time>
                <time:inDateTimeXSD>2011-10-21T09:20:20Z</time:inDateTimeXSD>
            </prov:Time>
        </prov:startedAt>
        <prov:endedAt>
            <prov:Time>
                <time:inDateTimeXSD>2011-10-21T09:20:25Z</time:inDateTimeXSD>
            </prov:Time>
        </prov:endedAt>
    </wf:Process>

    <wf:Value rdf:about="#combined">
        <prov:wasGeneratedBy rdf:resource="#combine"/>
        <wf:value>
            <cnt:ContentAsText>
                <cnt:chars>Hello, Steve</cnt:chars>
            </cnt:ContentAsText>
        </wf:value>
    </wf:Value>

    <wf:Process rdf:about="#shasum">
        <prov:used rdf:resource="#combined"/>
        <wf:wasSubProcessExecutionOf rdf:resource="#workflowRun"/>
        <wf:wasDefinedBy rdf:resource="http://www.example.com/workflow1#shasum"/>
        <prov:startedAt>
            <prov:Time>
                <time:inDateTimeXSD>2011-10-21T09:20:30Z</time:inDateTimeXSD>
            </prov:Time>
        </prov:startedAt>
        <prov:endedAt>
            <prov:Time>
                <time:inDateTimeXSD>2011-10-21T09:21:00Z</time:inDateTimeXSD>
            </prov:Time>
        </prov:endedAt>
    </wf:Process>

    <wf:Value rdf:about="#sha1">
        <wf:value>
            <cnt:ContentAsText>
                <cnt:characterEncoding>UTF-8</cnt:characterEncoding>
                <cnt:chars>a33d1fb1658d4fbf017de59ab67437a3eb5ff50d</cnt:chars>
            </cnt:ContentAsText>
        </wf:value>
    </wf:Value>

    <wf:ValueAtPort rdf:about="#sha1OutputFromShasum">
        <prov:wasGeneratedBy rdf:resource="#shasum"/>
        <wf:value>
            <cnt:ContentAsText>
                <cnt:characterEncoding>UTF-8</cnt:characterEncoding>
                <cnt:chars>a33d1fb1658d4fbf017de59ab67437a3eb5ff50d</cnt:chars>
            </cnt:ContentAsText>
        </wf:value>
        <wf:sawValue rdf:resource="#sha1"/>
        <wf:wasSeenAt rdf:resource="http://www.example.com/workflow1#shaOut"/>
    </wf:ValueAtPort>

    <wf:ValueAtPort rdf:about="#sha1OutputFromWorkflow">
        <prov:wasGeneratedBy rdf:resource="#workflowRun"/>
        <wf:value>
            <cnt:ContentAsText>
                <cnt:characterEncoding>UTF-8</cnt:characterEncoding>
                <cnt:chars>a33d1fb1658d4fbf017de59ab67437a3eb5ff50d</cnt:chars>
            </cnt:ContentAsText>
        </wf:value>
        <wf:sawValue rdf:resource="#sha1"/>
        <wf:wasSeenAt rdf:resource="http://www.example.com/workflow1#sha1"/>
    </wf:ValueAtPort>

</rdf:RDF>

            
Example available as RDF/XML and Turtle

Note that, for brevity, the example above does not show the classes and properties inferred from the PROV ontology. For interoperability, applications should also express such inferred statements in their serialisations, so that the provenance can be read without OWL2 inferencing and without the customized ontologies. See workflow-inferred.rdf for the complete example showing both domain-specific and PROV ontology terms used side by side.

Formal Semantics of the PROV Ontology

The PROV ontology uses OWL2 as the ontology language; hence it supports a set of entailments based on the standard RDF semantics [[!RDF-MT]] and the OWL2 semantics ([[!OWL2-DIRECT-SEMANTICS]], [[!OWL2-RDF-BASED-SEMANTICS]]). In this section, we describe this set of semantics as applied to the PROV ontology, along with a set of provenance-specific constraints introduced in the PROV-DM [[PROV-DM]]. It is intended that provenance applications can leverage this normative description of the formal semantics of the PROV ontology to support:

RDF Semantics for PROV Ontology

We briefly summarize the essential features of the RDF Semantics and refer to the RDF Semantics [[!RDF-MT]] for the normative specification. The RDF Semantics uses model theory, with a notion of interpretation I defined over the RDF (rdf-interpretation) or RDFS (rdfs-interpretation) vocabulary, to specify the formal semantics of an RDF or RDFS graph [[!RDF-MT]]. The rdf-interpretation is an interpretation that satisfies a set of constraints called "RDF semantic conditions" and a set of "RDF axiomatic triples" (see Section 3.1 of RDF Semantics [[!RDF-MT]]). The rdfs-interpretation is defined over the additional terms in the RDFS vocabulary, including rdfs:domain, rdfs:range, rdfs:Class, rdfs:subClassOf, and rdfs:subPropertyOf. An rdfs-interpretation satisfies a set of constraints called "RDFS semantic conditions" and "RDFS axiomatic triples" (see Section 4.1 of RDF Semantics [[!RDF-MT]]).

The rdfs-interpretation supports the following entailment rules that are applicable to the PROV ontology (we do not discuss the simple RDF entailments):

Rule 1

If a PROV ontology class X is defined to be the domain of a PROV property, then an individual asserted as the "subject" of that property in an RDF triple is an instance of the class X (from the rdfs2 rule defined in RDF Semantics).

Rule 2

Similar to Rule 1, if a PROV ontology class Y is defined to be the range of a PROV object property, then an individual asserted as the "object" of that property in an RDF triple is an instance of the class Y (from the rdfs3 rule defined in RDF Semantics).
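
Rules 1 and 2 can be illustrated with a minimal sketch that represents triples as plain tuples. The domain and range declared for prov:wasGeneratedBy below, and the ex: identifiers, are assumptions made only for this illustration; the normative declarations are those in the PROV ontology itself.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


def infer_domain_range(triples):
    """Return the rdf:type triples entailed by rdfs:domain and rdfs:range."""
    domain = {}
    range_ = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domain.setdefault(s, set()).add(o)
        elif p == RDFS_RANGE:
            range_.setdefault(s, set()).add(o)

    inferred = set()
    for s, p, o in triples:
        # Rule 1: the subject is an instance of each declared domain class.
        for cls in domain.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
        # Rule 2: the object is an instance of each declared range class.
        for cls in range_.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))
    return inferred


# Assumed declarations and one asserted triple (illustrative only).
graph = {
    ("prov:wasGeneratedBy", RDFS_DOMAIN, "prov:Entity"),
    ("prov:wasGeneratedBy", RDFS_RANGE, "prov:ProcessExecution"),
    ("ex:combined", "prov:wasGeneratedBy", "ex:combine"),
}
inferred = infer_domain_range(graph)
```

Here inferred contains two statements: ex:combined is typed as prov:Entity and ex:combine as prov:ProcessExecution.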

Rule 3

Both rdfs:subClassOf and rdfs:subPropertyOf are transitive properties; hence provenance assertions, in the form of RDF triples, that use a specialized subclass or subproperty can be inferred to hold for the parent class or parent property. For example, in the provenance scenario, although Alice and Bob are asserted to be individuals of the class Journalist, we can infer that they are also individuals of the PROV ontology classes Agent and Entity. Given,

    <rdf:Description rdf:about="http://www.example.com/crimeFile#Alice">
        <rdf:type rdf:resource="http://www.example.com/crime#Journalist"/>
    </rdf:Description>
		  

and

    <rdf:Description rdf:about="http://www.example.com/crime#Journalist">
        <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/Agent"/>
    </rdf:Description>
    <rdf:Description rdf:about="http://www.w3.org/ns/prov-o/Agent">
        <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
    </rdf:Description>
		  

we can infer that

    <rdf:Description rdf:about="http://www.example.com/crimeFile#Alice">
        <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Agent"/>
    </rdf:Description>
		  

and

    <rdf:Description rdf:about="http://www.example.com/crimeFile#Alice">
        <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
    </rdf:Description>
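
The inference above can be reproduced with a small sketch that propagates rdf:type statements along rdfs:subClassOf until nothing new is derived. The function name and the data layout (pairs in Python sets) are illustrative choices, not part of the PROV ontology.

```python
ALICE = "http://www.example.com/crimeFile#Alice"
JOURNALIST = "http://www.example.com/crime#Journalist"
AGENT = "http://www.w3.org/ns/prov-o/Agent"
ENTITY = "http://www.w3.org/ns/prov-o/Entity"


def infer_types(types, sub_class_of):
    """Close (individual, class) pairs under (subclass, superclass) pairs."""
    closure = set(types)
    changed = True
    while changed:
        changed = False
        for individual, cls in list(closure):
            for sub, sup in sub_class_of:
                if sub == cls and (individual, sup) not in closure:
                    closure.add((individual, sup))
                    changed = True
    return closure


asserted_types = {(ALICE, JOURNALIST)}
sub_class_of = {(JOURNALIST, AGENT), (AGENT, ENTITY)}
all_types = infer_types(asserted_types, sub_class_of)
```

After the call, all_types holds the asserted Journalist typing of Alice together with the inferred Agent and Entity typings.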
		  

OWL2 Semantics for PROV Ontology

In addition to the RDF Semantics, the OWL2 semantics described in [[!OWL2-DIRECT-SEMANTICS]] and [[!OWL2-RDF-BASED-SEMANTICS]] are also applicable to the PROV ontology. We consider the OWL2 RDF-Based Semantics (since it is a semantic superset of the OWL2 Direct Semantics) and specifically the extension of the D-interpretation, which satisfies the constraints for the rdf-interpretation and the rdfs-interpretation (as defined in the previous section), for graphs with blank nodes, and for the interpretation defined for RDF datatypes (see Section 5.1 in RDF Semantics [[!RDF-MT]]). The OWL2 RDF-based semantics introduces the notion of "facets" to constrain datatypes, both the rdf:XMLLiteral defined in the RDF Semantics [[!RDF-MT]] and the datatypes defined in the OWL2 Structural Specification [[!OWL2-SYNTAX]]. The OWL2 RDF-based interpretation, also called a D-interpretation with facets, is a D-interpretation that also satisfies the semantic conditions of the OWL2 RDF-based semantics (see Section 5 in OWL2 RDF-Based Semantics [[!OWL2-RDF-BASED-SEMANTICS]]).

Provenance-specific Entailments Supported by PROV Ontology

The PROV-DM [[PROV-DM]] introduces a set of specific constraints applicable to the PROV ontology. The following constraints will be supported by the PROV ontology and by any provenance application that uses it.

Provenance constraint on ProcessExecution

The PROV-DM describes a constraint on ordering of time (or event) associated with a ProcessExecution.

"From a process execution expression, one can infer that the start event precedes the end event of the represented activity." This is ISSUE-121
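
As a minimal sketch, an application could check this ordering on the xsd:dateTime values used in the workflow example. The function name is hypothetical, and the sketch assumes that "precedes" admits equal instants, since the #constant process in the example starts and ends at the same time #t0.

```python
from datetime import datetime

# Format of the UTC xsd:dateTime literals used in the workflow example.
XSD_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def start_precedes_end(started_at, ended_at):
    """Return True if the start instant is not later than the end instant."""
    start = datetime.strptime(started_at, XSD_FORMAT)
    end = datetime.strptime(ended_at, XSD_FORMAT)
    return start <= end


# Start and end times of #workflowRun from the example.
workflow_ok = start_precedes_end("2011-10-21T09:20:15Z", "2011-10-21T09:23:32Z")
```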

Provenance constraint on wasGeneratedBy (generation-affects-attributes)

The PROV-DM describes a constraint on wasGeneratedBy that associates the values of attributes of an Entity with the ProcessExecution that generated the Entity.

"Given a process execution pe, entity e, role r, and optional time t, if the assertion wasGeneratedBy(e,pe,r) or wasGeneratedBy(e,pe,r,t) holds, the values of some of e's attributes are determined by the activity denoted by pe and the entities used by pe. Only some (possibly none) of the attributes values may be determined since, in an open world, not all used entities may have been asserted." This is ISSUE-122 and ISSUE-105

Provenance constraint on wasGeneratedBy (generation-pe-ordering)

The second constraint on wasGeneratedBy imposes an ordering between the event of generation of an Entity instance and the start and end times (or events) of the PE instance.

Without an explicit association of TemporalEntity with the Entity instance and PE instance, it is not possible to state or enforce this constraint in the PROV ontology schema and the corresponding RDF dataset.

Provenance constraint on wasGeneratedBy (generation-unicity)

The PROV-DM describes a constraint on wasGeneratedBy that asserts that given an account, only one PE instance can be associated to an Entity instance by the property wasGeneratedBy.

"Given an entity expression denoted by e, two process execution expressions denoted by pe1 and pe2, and two qualifiers q1 and q2, if the expressions wasGeneratedBy(e,pe1,q1) and wasGeneratedBy(e,pe2,q2) exist in the scope of a given account, then pe1=pe2 and q1=q2." This is ISSUE-105
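
One way to detect violations of this constraint is sketched below. The tuple layout (account, entity, process execution) and all identifiers are assumptions made for illustration, and qualifiers are left out for brevity.

```python
from collections import defaultdict


def unicity_violations(generations):
    """Map (account, entity) to its generators when there is more than one."""
    generators = defaultdict(set)
    for account, entity, process_execution in generations:
        generators[(account, entity)].add(process_execution)
    return {key: pes for key, pes in generators.items() if len(pes) > 1}


# Hypothetical wasGeneratedBy assertions in two accounts.
assertions = [
    ("account1", "e", "pe1"),
    ("account1", "e", "pe2"),  # second generator in the same account
    ("account2", "e", "pe2"),  # a different account, so no conflict
]
violations = unicity_violations(assertions)
```

Only the pair (account1, e) is reported, because the constraint is scoped to a single account.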

Provenance constraint on Used (use-attributes)

A constraint is defined for the Used relation in PROV-DM that requires an attribute-value pair to hold for an Entity instance linked to a ProcessExecution instance by the relation Used.

"Given a process execution expression identified by pe, an entity expression identified by e, a qualifier q, and optional time t, if assertion used(pe,e,q) or used(pe,e,q,t) holds, then the existence of an attribute-value pair in the entity expression identified by e is a pre-condition for the termination of the activity represented by the process execution expression identified by pe." This is ISSUE-124

Provenance constraint on Used (use-pe-ordering)

The PROV-DM describes a constraint for the Used relation that requires an Entity instance e (linked to a ProcessExecution instance pe by the Used relation) to be "used" before pe terminates, and the "generation" of e to precede the "use" of e.

"Given a process execution expression identified by pe, an entity expression identified by e, a qualifier q, and optional time t, if assertion used(pe,e,q) or used(pe,e,q,t) holds, then the use of the thing represented by entity expression identified by e precedes the end time contained in the process execution expression identified by pe and follows its beginning. Furthermore, the generation of the thing denoted by entity expression identified by e always precedes its use." This is ISSUE-124
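
The ordering can be checked directly when all four instants are known. The sketch below uses the times recorded for #hello and #combine in the workflow example; the function names are hypothetical, and equal instants are assumed to satisfy "precedes".

```python
from datetime import datetime

XSD_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse(timestamp):
    """Parse a UTC xsd:dateTime literal such as 2011-10-21T09:20:15Z."""
    return datetime.strptime(timestamp, XSD_FORMAT)


def use_is_ordered(generated_at, started_at, used_at, ended_at):
    """Check generation <= use and start <= use <= end."""
    generated, started, used, ended = map(
        parse, (generated_at, started_at, used_at, ended_at)
    )
    return generated <= used and started <= used <= ended


# #hello was generated at #t0 and used by #combine at port catIn1.
hello_use_ok = use_is_ordered(
    generated_at="2011-10-21T09:20:15Z",
    started_at="2011-10-21T09:20:20Z",
    used_at="2011-10-21T09:20:21Z",
    ended_at="2011-10-21T09:20:25Z",
)
```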

Provenance constraint on wasDerivedFrom (derivation-attributes)

The PROV-DM describes a constraint for asserting the wasDerivedFrom property between two Entity instances if some attributes of one Entity instance are partially or fully determined by attribute values of the other Entity instance.

"Given a process execution expression denoted by pe, entity expressions denoted by e1 and e2, qualifiers q1 and q2, the assertion wasDerivedFrom(e2,e1,pe,q2,q1) or wasDerivedFrom(e2,e1) holds if and only if the values of some attributes of the entity expression identified by e2 are partly or fully determined by the values of some attributes of the entity expression identified by e1." This is ISSUE-125

Provenance constraint on wasDerivedFrom (derivation-use-generation-ordering)

The PROV-DM describes a constraint that if the wasDerivedFrom property is asserted between two Entity instances e1 and e2, that is wasDerivedFrom(e2, e1), then the time instant t1 at which a PE instance "used" e1 is less than the time instant t2 associated with the "generation" of e2.

Without an explicit association of TemporalEntity with the Entity instance and PE instance, it is not possible to state or enforce this constraint in the PROV ontology schema and the corresponding RDF dataset.

Provenance constraint on wasDerivedFrom (derivation-events)

The PROV-DM describes a constraint that if wasDerivedFrom property is asserted between two Entity instances e1 and e2, that is wasDerivedFrom (e2, e1, pe), then wasGeneratedBy(e2, pe) and used(pe, e1) can also be asserted.

Since the above constraint defined in PROV-DM does not define how pe is linked to the derivation of e2 from e1, this constraint can be supported in the "opposite" direction in PROV-O. In other words, given e2 was generated at time instant t2 by pe and pe used e1 at time instant t1 and t1 is less than t2, then we can assert that wasDerivedFrom(e2, e1).
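
The "opposite" direction described above can be sketched as a join between generation and use records. The tuple layouts are illustrative, and the generation time given for #combined is an assumption (the example does not record it; the end time of #combine is used here).

```python
def infer_derivations(generated, used):
    """Return (e2, e1) pairs where pe generated e2 after it used e1.

    generated holds (entity, pe, time) and used holds (pe, entity, time).
    Times are UTC xsd:dateTime strings in one fixed format, so comparing
    the strings compares the instants.
    """
    return {
        (e2, e1)
        for e2, pe_generating, t2 in generated
        for pe_using, e1, t1 in used
        if pe_generating == pe_using and t1 < t2
    }


# Assumed generation time for #combined (not stated in the example).
generated = {("#combined", "#combine", "2011-10-21T09:20:25Z")}
# Use times of #hello and #input by #combine, taken from the example.
used = {
    ("#combine", "#hello", "2011-10-21T09:20:21Z"),
    ("#combine", "#input", "2011-10-21T09:20:23Z"),
}
derivations = infer_derivations(generated, used)
```

Under these assumptions, #combined is inferred to be derived from both #hello and #input.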

Provenance constraint on wasDerivedFrom (derivation-events)

The PROV-DM describes a constraint that if wasDerivedFrom property is asserted between two Entity instances e1 and e2, then there exists some PE instance such that wasGeneratedBy(e2, pe) and used(pe, e1) can also be asserted.

This constraint is a re-statement of the generic Semantic Web "open-world assumption". Hence, it is not mapped to PROV ontology.

Provenance constraint on wasDerivedFrom (derivation-use)

The PROV-DM describes a constraint that if wasDerivedFrom property is asserted between two Entity instances e1 and e2, and wasGeneratedBy(e2, pe) is also asserted then Used(pe, e1) can also be asserted.

This will be asserted as a rule.
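
A minimal sketch of such a rule, assuming derivations are given as (e2, e1) pairs and generations as (entity, pe) pairs, is shown below. The function name and identifiers are hypothetical, and this is not the rule syntax that the ontology will eventually adopt.

```python
def infer_used(derived_from, generated_by):
    """Infer (pe, e1) pairs from wasDerivedFrom(e2, e1) and wasGeneratedBy(e2, pe)."""
    return {
        (pe, e1)
        for e2, e1 in derived_from
        for entity, pe in generated_by
        if entity == e2
    }


derived_from = {("e2", "e1")}
generated_by = {("e2", "pe"), ("e3", "pe_other")}
used = infer_used(derived_from, generated_by)
```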

Provenance constraint on wasEventuallyDerivedFrom (derivation-generation-generation-ordering)

The PROV-DM describes a constraint that if wasEventuallyDerivedFrom property is asserted between two Entity instances e1 and e2, then generation of e1 occurred before generation of e2.

Without an explicit association of TemporalEntity (or event) with the Entity instance and PE instance, it is not possible to state or enforce this constraint in the PROV ontology schema and the corresponding RDF dataset.

Provenance constraint on wasEventuallyDerivedFrom (derivation-linked-independent)

The PROV-DM describes a constraint that if wasDerivedFrom property is asserted between two Entity instances e1 and e2, then wasEventuallyDerivedFrom property can also be asserted between the two Entity instances.

Is this an equivalence constraint or can we assert a subPropertyOf property between wasDerivedFrom and wasEventuallyDerivedFrom?

Provenance constraint on wasComplementOf (wasComplementOf-necessary-cond)

The PROV-DM describes a constraint that wasComplementOf property holds between two entities over a temporal intersection of the two entities.

Without an explicit association of time value, this constraint cannot be stated or enforced in PROV ontology.

Provenance constraint on hadParticipant (participant)

The PROV-DM describes a constraint that the hadParticipant property holds between an instance of Entity and an instance of ProcessExecution if the two instances are linked by "used", "wasControlledBy", or "wasComplementOf".

"Given two identifiers pe and e, respectively identifying a process execution expression and an entity expression, the expression hadParticipant(pe,e) holds if and only if: *used(pe,e) holds, or *wasControlledBy(pe,e) holds, or *wasComplementOf(e1,e) holds for some entity expression identified by e1, and hadParticipant(pe,e1) holds some process execution expression identified by pe." This is ISSUE-127
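
Because the third condition refers back to hadParticipant itself, the relation can be computed as a fixpoint. The sketch below is one possible reading of the definition; the function name, the pair layouts, and the identifiers are assumptions for illustration.

```python
def had_participant(used, controlled_by, complement_of):
    """Compute hadParticipant as a set of (pe, entity) pairs.

    used and controlled_by hold (pe, entity) pairs; complement_of holds
    (e1, e) pairs standing for wasComplementOf(e1, e).
    """
    result = set(used) | set(controlled_by)
    changed = True
    while changed:
        changed = False
        for e1, e in complement_of:
            for pe, participant in list(result):
                # hadParticipant(pe, e1) and wasComplementOf(e1, e)
                # together give hadParticipant(pe, e).
                if participant == e1 and (pe, e) not in result:
                    result.add((pe, e))
                    changed = True
    return result


participants = had_participant(
    used={("pe1", "e1")},
    controlled_by={("pe1", "agent1")},
    complement_of={("e1", "e2"), ("e2", "e3")},
)
```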

Acknowledgements

The Provenance Working Group Members.