Copyright © 2012 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
PROV-DM, the PROV conceptual data model, is a data model for provenance that describes the entities, people and activities involved in producing a piece of data or thing. PROV-DM distinguishes core structures, forming the essence of provenance descriptions, from extended structures catering for more advanced uses of provenance. PROV-DM is organized in six components, respectively dealing with: (1) entities and activities, and the time at which they were created, used, or ended; (2) agents bearing responsibility for entities that were generated and activities that happened; (3) derivations of entities from entities; (4) properties to link entities that refer to the same thing; (5) a notion of bundle, a mechanism to support provenance of provenance; and, (6) collections forming a logical structure for its members.
To provide examples of the PROV data model, the PROV notation (PROV-N) is introduced: aimed at human consumption, PROV-N allows serializations of PROV instances to be created in a compact manner. PROV-N facilitates the mapping of the PROV data model to concrete syntax, and is used as the basis for a formal semantics of PROV. The purpose of this document is to define the PROV-N notation.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is the fifth public release of the PROV-DM document. Publication as Last Call working draft means that the Working Group believes that it has satisfied the relevant technical requirements outlined in its charter on this document. The design is not expected to change significantly, going forward, and now is the key time for external review, before the implementation phase.
The PROV Working group seeks public feedback on this Working Draft. The end date of the Last Call review period is TBD, and we would appreciate comments by that date to public-prov-comments@w3.org
This is the first public release of the PROV-N document. Following feedback, the Working Group has decided to reorganize the PROV-DM document substantially, separating the data model from its constraints and from the notation used to illustrate it. The PROV-N release is synchronized with the release of the PROV-DM, PROV-O, PROV-PRIMER, and PROV-CONSTRAINTS documents. This document was published by the Provenance Working Group as an Editor's Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-prov-wg@w3.org (subscribe, archives). All feedback is welcome.
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Provenance is defined as a record that describes the people, institutions, entities, and activities, involved in producing, influencing, or delivering a piece of data or a thing in the world. Two companion specifications respectively define PROV-DM, a data model for provenance, allowing provenance descriptions to be expressed [PROV-DM] and a set of constraints that provenance descriptions are expected to satisfy [PROV-CONSTRAINTS].
This document introduces the PROV-N grammar along with examples of its usage.
Its target audience is twofold:
This document defines a grammar using the Extended Backus-Naur Form (EBNF) notation. Its productions correspond to PROV data model types and relations. It is structured as follows.
Section 2 provides the design rationale for the PROV Notation and general considerations about the PROV-N grammar.
Section 3 presents the grammar of all expressions of the language grouped according to the PROV data model components.
Section 4 defines the grammar of toplevel bundles, a house-keeping construct of PROV-N capable of packaging up PROV-N expressions and namespace declarations.
Section 5 defines the media type for the PROV-N notation.
The key words "must", "must not", "required", "shall", "shall not", "should", "should not", "recommended", "may", and "optional" in this document are to be interpreted as described in [RFC2119].
The following namespaces prefixes are used throughout this document.
prov | http://www.w3.org/ns/prov# | The PROV namespace (see Section 4.7.1) |
xsd | http://www.w3.org/2000/10/XMLSchema# | XML Schema Namespace [XMLSCHEMA-2] |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | The RDF namespace [RDF-CONCEPTS] |
(others) | (various) | All other namespace prefixes are used in examples only. In particular, URIs starting with "http://example.com" represent some application-dependent URI [URI] |
PROV-N adopts a functional-style syntax consisting of a predicate name and an ordered list of terms. All PROV data model relations involve two primary elements, the subject and the object, in this order. Furthermore, some expressions also admit additional elements that further characterize them.
wasDerivedFrom(e2, e1)
wasDerivedFrom(e2, e1, a, g2, u1)
activity(a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
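The functional-style shape of these expressions (a predicate name followed by an ordered list of terms) can be illustrated with a small Python sketch. The helper name split_expression is hypothetical and not part of PROV-N; the sketch assumes terms are separated by top-level commas, that bracketed attribute lists stay together as one term, and that string literals contain no escaped quotes.

```python
def split_expression(text):
    """Split 'name(t1, t2, ...)' into (name, [terms]).

    Commas inside square brackets or double-quoted strings do not split terms.
    """
    text = text.strip()
    open_paren = text.index("(")
    if not text.endswith(")"):
        raise ValueError("expression must end with ')'")
    name = text[:open_paren].strip()
    body = text[open_paren + 1:-1]
    terms, current, depth, in_string = [], [], 0, False
    for ch in body:
        if ch == '"':
            in_string = not in_string
        if not in_string:
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
            elif ch == "," and depth == 0:
                # A top-level comma closes the current term.
                terms.append("".join(current).strip())
                current = []
                continue
        current.append(ch)
    last = "".join(current).strip()
    if last or terms:
        terms.append(last)
    return name, terms
```

Because the order of terms carries meaning, the sketch returns a list and does not attempt to name the positions.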
The grammar is specified using the Extended Backus-Naur Form (EBNF) notation.
Within the term on the right-hand side of a rule, the following terms are used to match strings of one or more characters:
[2] | expression | ::= | ( entityExpression | activityExpression | generationExpression | usageExpression | startExpression | endExpression | invalidationExpression | communicationExpression | agentExpression | associationExpression | attributionExpression | delegationExpression | derivationExpression | tracedToExpression | alternateExpression | specializationExpression | contextualizationExpression | insertionExpression | removalExpression | membershipExpression ) |
A PROV-N document consists of a collection of expressions, wrapped in a toplevel bundle with some namespace declarations, such that the text for an element matches the corresponding expression production of the grammar.
wasDerivedFrom(e2, e1, a, g2, u1)
wasDerivedFrom(e2, e1)
In a derivation expression, the activity, generation, and usage are optional terms. They are specified in the first derivation, but not in the second.
activity(a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
activity(a1)
The start and end times of an activity are optional. They are specified in the first expression, but not in the second.
The general rule for optionals is that, if none of the optionals are used in the expression, then they are simply omitted, resulting in a simpler expression as in the examples above.
However, it may be the case that only some of the optional terms are omitted. Because the position of the terms in the expression matters, an additional marker must be used to indicate that a particular term is not available. The symbol '-' is used for this purpose.
In the first expression below, all optionals are specified. In the second and third, only one optional is specified, forcing the use of the marker for the missing terms.
wasDerivedFrom(e2, e1, a, g2, u1)
wasDerivedFrom(e2, e1, -, -, u1)
wasDerivedFrom(e2, e1, a, -, -)
activity(a1)
activity(a1, -, -)
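The rule for optional terms can be sketched in Python as follows. This is an illustration, not a normative serializer: the function names are hypothetical, and the sketch assumes that the optional terms form an all-or-nothing group, so that as soon as one optional is given, every missing one is written as the '-' marker.

```python
def positional_terms(required, optionals):
    """Return the ordered term list for an expression.

    If no optional is given, the optionals are simply omitted.
    Otherwise every missing optional is written as the '-' marker,
    so that the remaining terms keep their positions.
    """
    if all(value is None for value in optionals):
        return list(required)
    return list(required) + ["-" if value is None else value for value in optionals]


def derivation(derived, source, activity=None, generation=None, usage=None):
    """Serialize a derivation without identifier or attributes."""
    terms = positional_terms([derived, source], [activity, generation, usage])
    return "wasDerivedFrom(" + ", ".join(terms) + ")"
```

For instance, derivation("e2", "e1", usage="u1") pads the two missing positions before u1 with markers.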
Most expressions defined in the grammar include the use of two terms: an identifier and a set of attribute-value pairs, delimited by square brackets. Identifiers are optional except for Entities, Activities, and Agents. Identifiers are always the first term in any expression. By convention, optional identifiers are separated using a semi-colon ';'. This makes it possible to completely omit an optional identifier with no ambiguity arising. Also, if the set of attribute-value pairs is present, it is always the last term in any expression.
Derivation has an optional identifier. In the first expression, the identifier is not available, while it is explicit in the second. The third example shows that one can optionally indicate the missing identifier using the '-' marker.
wasDerivedFrom(e2, e1)
wasDerivedFrom(d; e2, e1)
wasDerivedFrom(-; e2, e1)
The first and second activities have no attributes: the first omits the attribute list and the second gives an empty one. The third activity has two attributes.
activity(ex:a10)
activity(ex:a10, [])
activity(ex:a10, [ex:param1="a", ex:param2="b"])
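The two conventions above (an optional identifier first, followed by ';', and the attribute list last) can be sketched in Python. The function render is a hypothetical helper for illustration only; it assumes attribute values are plain strings without embedded double quotes.

```python
def render(name, terms, identifier=None, attributes=None):
    """Serialize an expression with an optional identifier and attribute list.

    identifier: None to omit it, or a string (possibly the '-' marker).
    attributes: None to omit the list, or a dict (an empty dict gives '[]').
    """
    head = f"{identifier}; " if identifier is not None else ""
    body = ", ".join(terms)
    if attributes is not None:
        pairs = ", ".join(f'{key}="{value}"' for key, value in attributes.items())
        body += f", [{pairs}]"
    return f"{name}({head}{body})"
```

Keeping the identifier behind a distinct separator is what lets a reader tell wasDerivedFrom(d; e2, e1) apart from an expression whose first term is an entity.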
This section introduces grammar productions for each expression, followed by small examples of expressions illustrating the grammar. Strings conforming to the grammar are valid expressions in the PROV-N language.
[3] | entityExpression | ::= | "entity" "(" identifier optionalAttributeValuePairs ")" |
[4] | optionalAttributeValuePairs | ::= | ( "," "[" attributeValuePairs "]" )? |
[5] | attributeValuePairs | ::= | ( | attributeValuePair ( "," attributeValuePair )* ) |
[6] | attributeValuePair | ::= | attribute "=" literal |
The following table summarizes how each constituent of a PROV-DM Entity maps to a non-terminal.
entity(tr:WD-prov-dm-20111215, [ prov:type="document" ])
Here tr:WD-prov-dm-20111215 is the entity identifier, and [ prov:type="document" ] groups the optional attributes, only one in this example, with their values.
entity(tr:WD-prov-dm-20111215)
Here, the optional attributes are absent.
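As an illustration of how an entity expression with an optional attribute list might be recognized, here is a minimal Python sketch based on regular expressions. It is not a conforming parser: the patterns for identifiers and literals are simplified assumptions (a qualified name, and a double-quoted string without escapes), and the names parse_entity and parse_attributes are hypothetical.

```python
import re

# Simplified stand-ins (assumptions): a qualified name such as
# tr:WD-prov-dm-20111215, and a double-quoted literal without escaped quotes.
IDENT = r"[A-Za-z_][\w.\-]*(?::[\w.\-]+)?"
PAIR = re.compile(rf'\s*({IDENT})\s*=\s*"([^"]*)"\s*(,|$)')
ENTITY = re.compile(rf"^\s*entity\s*\(\s*({IDENT})\s*(?:,\s*\[(.*)\]\s*)?\)\s*$")


def parse_attributes(raw):
    """Parse 'k1="v1", k2="v2"' into a list of (attribute, value) pairs."""
    attributes, position = [], 0
    if not raw.strip():
        return attributes  # an empty list '[]' is accepted
    while True:
        match = PAIR.match(raw, position)
        if match is None:
            raise ValueError(f"malformed attribute list: {raw!r}")
        attributes.append((match.group(1), match.group(2)))
        if match.group(3) != ",":
            return attributes
        position = match.end()


def parse_entity(text):
    """Return (identifier, attributes) for an entity expression."""
    match = ENTITY.match(text)
    if match is None:
        raise ValueError(f"not an entity expression: {text!r}")
    raw = match.group(2)
    return match.group(1), parse_attributes(raw) if raw is not None else []
```

A list of pairs is used instead of a dictionary so that repeated attributes would be preserved in order.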
[7] | activityExpression | ::= | "activity" "(" identifier ( "," timeOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
[8] | timeOrMarker | ::= | ( time | "-" ) |
The following table summarizes how each constituent of a PROV-DM Activity maps to a non-terminal.
Activity | Non-Terminal |
id | identifier |
startTime | timeOrMarker |
endTime | timeOrMarker |
attributes | optionalAttributeValuePairs |
activity(ex:a10, 2011-11-16T16:00:00, 2011-11-16T16:00:01, [prov:type="createFile"])
Here ex:a10 is the activity identifier, 2011-11-16T16:00:00 and 2011-11-16T16:00:01 are the optional start and end times for the activity, and [prov:type="createFile"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
activity(ex:a10)
activity(ex:a10, -, -)
activity(ex:a10, -, -, [prov:type="edit"])
activity(ex:a10, -, 2011-11-16T16:00:00)
activity(ex:a10, 2011-11-16T16:00:00, -)
activity(ex:a10, 2011-11-16T16:00:00, -, [prov:type="createFile"])
activity(ex:a10, [prov:type="edit"])
[9] | generationExpression | ::= | "wasGeneratedBy" "(" optionalIdentifier eIdentifier ( "," aIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
[10] | optionalIdentifier | ::= | ( identifierOrMarker ";" )? |
[11] | identifierOrMarker | ::= | ( identifier | "-" ) |
The following table summarizes how each constituent of a PROV-DM Generation maps to a non-terminal.
wasGeneratedBy(ex:g1; tr:WD-prov-dm-20111215, ex:edit1, 2011-11-16T16:00:00, [ex:fct="save"])
Here ex:g1 is the optional generation identifier, tr:WD-prov-dm-20111215 is the identifier of the entity being generated, ex:edit1 is the optional identifier of the generating activity, 2011-11-16T16:00:00 is the optional generation time, and [ex:fct="save"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasGeneratedBy(tr:WD-prov-dm-20111215, ex:edit1, -)
wasGeneratedBy(tr:WD-prov-dm-20111215, ex:edit1, 2011-11-16T16:00:00)
wasGeneratedBy(e2, a1, -, [ex:fct="save"])
wasGeneratedBy(e2, -, -, [ex:fct="save"])
wasGeneratedBy(ex:g1; tr:WD-prov-dm-20111215, ex:edit1, -)
wasGeneratedBy(ex:g1; e)
Even though the production generationExpression allows for expressions wasGeneratedBy(e2, -, -) and wasGeneratedBy(-; e2, -, -), these expressions are not valid in PROV-N, since at least one of id, activity, time, and attributes must be present.
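The additional validity condition on generation expressions can be sketched as a simple check applied after parsing. The function name is hypothetical, and the sketch makes two assumptions that the text does not spell out: the '-' marker counts as absent, and an empty attribute list also counts as absent.

```python
def generation_is_valid(identifier=None, activity=None, time=None, attributes=None):
    """Check that at least one of id, activity, time, and attributes is present.

    None and the '-' marker are both treated as 'not present'; so is an
    empty attribute collection (an assumption made for this sketch).
    """
    def present(value):
        if value is None or value == "-":
            return False
        if isinstance(value, (list, dict)) and not value:
            return False
        return True

    return any(present(value) for value in (identifier, activity, time, attributes))
```

With this check, wasGeneratedBy(e2, -, -) and wasGeneratedBy(-; e2, -, -) are both rejected, while wasGeneratedBy(ex:g1; e) is accepted because its identifier is present.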
[12] | usageExpression | ::= | "used" "(" optionalIdentifier aIdentifier ( "," eIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Usage maps to a non-terminal.
Usage | Non-Terminal |
id | optionalIdentifier |
activity | aIdentifier |
entity | eIdentifierOrMarker |
time | timeOrMarker |
attributes | optionalAttributeValuePairs |
used(ex:u1; ex:act2, ar3:0111, 2011-11-16T16:00:00, [ex:fct="load"])
Here ex:u1 is the optional usage identifier, ex:act2 is the identifier of the using activity, ar3:0111 is the identifier of the entity being used, 2011-11-16T16:00:00 is the optional usage time, and [ex:fct="load"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
used(ex:act2)
used(ex:act2, ar3:0111, -)
used(ex:act2, ar3:0111, 2011-11-16T16:00:00)
used(a1, e1, -, [ex:fct="load"])
used(ex:u1; ex:act2, ar3:0111, -)
Even though the production usageExpression
allows for expressions used(a2, -, -) and used(-; e2, -, -), these expressions are not valid in PROV-N, since at least one of id, entity, time, and attributes must be present.
[16] | communicationExpression | ::= | "wasInformedBy" "(" optionalIdentifier aIdentifier "," aIdentifier optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Communication maps to a non-terminal.
Communication | Non-Terminal |
id | optionalIdentifier |
informed | aIdentifier |
informant | aIdentifier |
attributes | optionalAttributeValuePairs |
wasInformedBy(ex:inf1; ex:a1, ex:a2, [ex:param1="a", ex:param2="b"])
Here ex:inf1 is the optional communication identifier, ex:a1 is the identifier of the informed activity, ex:a2 is the identifier of the informant activity, and [ex:param1="a", ex:param2="b"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasInformedBy(ex:a1, ex:a2)
wasInformedBy(ex:a1, ex:a2, [ex:param1="a", ex:param2="b"])
wasInformedBy(i; ex:a1, ex:a2)
wasInformedBy(i; ex:a1, ex:a2, [ex:param1="a", ex:param2="b"])
[13] | startExpression | ::= | "wasStartedBy" "(" optionalIdentifier aIdentifier ( "," eIdentifierOrMarker "," aIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Start maps to a non-terminal.
wasStartedBy(s; ex:act2, ex:trigger, ex:act1, 2011-11-16T16:00:00, [ex:param="a"])
Here s is the optional start identifier, ex:act2 is the identifier of the started activity, ex:trigger is the optional identifier for the entity that triggered the activity start, ex:act1 is the optional identifier for the activity that generated the (possibly unspecified) entity ex:trigger, 2011-11-16T16:00:00 is the optional start time, and [ex:param="a"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasStartedBy(ex:act2, -, ex:act1, -)
wasStartedBy(ex:act2, -, ex:act1, 2011-11-16T16:00:00)
wasStartedBy(ex:act2, -, -, 2011-11-16T16:00:00)
wasStartedBy(ex:act2, -, -, -, [ex:param="a"])
wasStartedBy(s; ex:act2, e, ex:act1, 2011-11-16T16:00:00)
Note: Even though the production startExpression allows for expressions wasStartedBy(a2, -, -, -) and wasStartedBy(-; a2, -, -, -), these expressions are not valid in PROV-N, since at least one of id, trigger, starter, time, and attributes must be present.
[14] | endExpression | ::= | "wasEndedBy" "(" optionalIdentifier aIdentifier ( "," eIdentifierOrMarker "," aIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM End maps to a non-terminal.
wasEndedBy(s; ex:act2, ex:trigger, ex:act3, 2011-11-16T16:00:00, [ex:param="a"])
Here s is the optional end identifier, ex:act2 is the identifier of the ended activity, ex:trigger is the optional identifier of the entity that triggered the activity end, ex:act3 is the optional identifier for the activity that generated the (possibly unspecified) entity ex:trigger, 2011-11-16T16:00:00 is the optional end time, and [ex:param="a"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasEndedBy(ex:act2, ex:trigger, -, -)
wasEndedBy(ex:act2, ex:trigger, -, 2011-11-16T16:00:00)
wasEndedBy(ex:act2, -, -, 2011-11-16T16:00:00)
wasEndedBy(ex:act2, -, -, 2011-11-16T16:00:00, [ex:param="a"])
wasEndedBy(e; ex:act2)
wasEndedBy(e; ex:act2, ex:trigger, -, -)
Note: Even though the production endExpression allows for expressions wasEndedBy(a2, -, -, -) and wasEndedBy(-; a2, -, -, -), these expressions are not valid in PROV-N, since at least one of id, trigger, ender, time, and attributes must be present.
[15] | invalidationExpression | ::= | "wasInvalidatedBy" "(" optionalIdentifier eIdentifier ( "," aIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Invalidation maps to a non-terminal.
wasInvalidatedBy(ex:i1; tr:WD-prov-dm-20111215, ex:edit1, 2011-11-16T16:00:00, [ex:fct="save"])
Here ex:i1 is the optional invalidation identifier, tr:WD-prov-dm-20111215 is the identifier of the entity being invalidated, ex:edit1 is the optional identifier of the invalidating activity, 2011-11-16T16:00:00 is the optional invalidation time, and [ex:fct="save"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasInvalidatedBy(tr:WD-prov-dm-20111215, ex:edit1, -)
wasInvalidatedBy(tr:WD-prov-dm-20111215, ex:edit1, 2011-11-16T16:00:00)
wasInvalidatedBy(e2, a1, -, [ex:fct="save"])
wasInvalidatedBy(e2, -, -, [ex:fct="save"])
wasInvalidatedBy(ex:i1; tr:WD-prov-dm-20111215, ex:edit1, -)
Even though the production invalidationExpression allows for expressions wasInvalidatedBy(e2, -, -) and wasInvalidatedBy(-; e2, -, -), these expressions are not valid in PROV-N, since at least one of id, activity, time, and attributes must be present.
[21] | derivationExpression | ::= | "wasDerivedFrom" "(" optionalIdentifier eIdentifier "," eIdentifier ( "," aIdentifierOrMarker "," gIdentifierOrMarker "," uIdentifierOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Derivation maps to a non-terminal.
wasDerivedFrom(d; e2, e1, a, g2, u1, [ex:comment="a righteous derivation"])
Here d is the optional derivation identifier, e2 is the identifier for the entity being derived, e1 is the identifier of the entity from which e2 is derived, a is the optional identifier of the activity which used/generated the entities, g2 is the optional identifier of the generation, u1 is the optional identifier of the usage, and [ex:comment="a righteous derivation"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasDerivedFrom(e2, e1)
wasDerivedFrom(e2, e1, a, g2, u1)
wasDerivedFrom(e2, e1, -, g2, u1)
wasDerivedFrom(e2, e1, a, -, u1)
wasDerivedFrom(e2, e1, a, g2, -)
wasDerivedFrom(e2, e1, a, -, -)
wasDerivedFrom(e2, e1, -, -, u1)
wasDerivedFrom(e2, e1, -, -, -)
wasDerivedFrom(d; e2, e1, a, g2, u1)
wasDerivedFrom(-; e2, e1, a, g2, u1)
wasDerivedFrom(d; e2, e1, a, g2, u1, [prov:type="prov:WasRevisionOf", ex:comment="a righteous derivation"])
Here, the derivation from Example 17 is extended with a prov:type attribute and value prov:WasRevisionOf.
wasDerivedFrom(quoteId1; ex:blockQuote, ex:blog, ex:act1, ex:g, ex:u, [ prov:type='prov:WasQuotedFrom' ])
Here, the derivation is provided with a prov:type attribute and value prov:WasQuotedFrom.
wasDerivedFrom(src1; ex:e1, ex:e2, ex:act, ex:g, ex:u, [ prov:type='prov:HadOriginalSource' ])
Here, the derivation is provided with a prov:type attribute and value prov:HadOriginalSource.
[17] | agentExpression | ::= | "agent" "(" identifier optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Agent maps to a non-terminal.
Agent | Non-Terminal |
id | identifier |
attributes | optionalAttributeValuePairs |
agent(ag4, [ prov:type="prov:Person", ex:name="David" ])
Here ag4 is the agent identifier, and [ prov:type="prov:Person", ex:name="David" ] are optional attributes.
In the next example, the optional attributes are omitted.
agent(ag4)
[18] | attributionExpression | ::= | "wasAttributedTo" "(" optionalIdentifier eIdentifier "," agIdentifier optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Attribution maps to a non-terminal.
Attribution | Non-Terminal |
id | optionalIdentifier |
entity | eIdentifier |
agent | agIdentifier |
attributes | optionalAttributeValuePairs |
wasAttributedTo(id; e, ag, [ex:license='cc:attributionURL' ])
Here id is the optional attribution identifier, e is an entity identifier, ag is the identifier of the agent to whom the entity is ascribed, and [ex:license='cc:attributionURL' ] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasAttributedTo(e, ag)
wasAttributedTo(e, ag, [ex:license='cc:attributionURL' ])
[19] | associationExpression | ::= | "wasAssociatedWith" "(" optionalIdentifier aIdentifier "," agIdentifierOrMarker ( "," eIdentifierOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Association maps to a non-terminal.
Association | Non-Terminal |
id | optionalIdentifier |
activity | aIdentifier |
agent | agIdentifierOrMarker |
plan | eIdentifierOrMarker |
attributes | optionalAttributeValuePairs |
wasAssociatedWith(ex:agas; ex:a1, ex:ag1, ex:e1, [ex:param1="a", ex:param2="b"])
Here ex:agas is the optional association identifier, ex:a1 is an activity identifier, ex:ag1 is the optional identifier of the agent associated with the activity, ex:e1 is the optional identifier of the plan used by the agent in the context of the activity, and [ex:param1="a", ex:param2="b"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasAssociatedWith(ex:a1, -, ex:e1)
wasAssociatedWith(ex:a1, ex:ag1)
wasAssociatedWith(ex:a1, ex:ag1, ex:e1)
wasAssociatedWith(ex:a1, ex:ag1, ex:e1, [ex:param1="a", ex:param2="b"])
wasAssociatedWith(a; ex:a1, -, ex:e1)
Note: The production associationExpression allows for expressions wasAssociatedWith(a, -, -) and wasAssociatedWith(-; a, -, -). However, these expressions are not valid in PROV-N, since at least one of id, agent, plan, and attributes must be present.
[20] | delegationExpression | ::= | "actedOnBehalfOf" "(" optionalIdentifier agIdentifier "," agIdentifier ( "," aIdentifierOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Delegation maps to a non-terminal.
Delegation | Non-Terminal |
id | optionalIdentifier |
delegate | agIdentifier |
responsible | agIdentifier |
activity | aIdentifierOrMarker |
attributes | optionalAttributeValuePairs |
actedOnBehalfOf(del1; ag2, ag1, a, [prov:type="contract"])
Here del1 is the optional delegation identifier, ag2 is the identifier for the delegate agent, ag1 is the identifier of the responsible agent, a is the optional identifier of the activity for which the delegation link holds, and [prov:type="contract"] are optional attributes.
The remaining examples show cases where some of the optionals are omitted.
actedOnBehalfOf(ag1, ag2)
actedOnBehalfOf(ag1, ag2, a)
actedOnBehalfOf(ag1, ag2, -, [prov:type="delegation"])
actedOnBehalfOf(ag2, ag3, a, [prov:type="contract"])
actedOnBehalfOf(r; ag2, ag3, a, [prov:type="contract"])
[22] | tracedToExpression | ::= | "tracedTo" "(" optionalIdentifier eIdentifier "," eIdentifier optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Trace maps to a non-terminal.
Trace | Non-Terminal |
id | optionalIdentifier |
entity | eIdentifier |
ancestor | eIdentifier |
attributes | optionalAttributeValuePairs |
tracedTo(id; e2, e1, [ex:param="a"])
Here id is the optional trace identifier, e2 is an entity identifier, e1 is the identifier for an ancestor entity that e2 depends on, and [ex:param="a"] is the optional set of attributes.
The remaining examples show cases where some of the optionals are omitted.
tracedTo(e2, e1)
tracedTo(e2, e1, [ex:param="a"])
tracedTo(id; e2, e1)
[33] | namedBundle | ::= | "bundle" identifier (namespaceDeclarations)? (expression)* "endBundle" |
bundle ex:author-view
agent(ex:Paolo, [ prov:type='prov:Person' ])
agent(ex:Simon, [ prov:type='prov:Person' ])
...
endBundle
Here ex:author-view is the name of the bundle.
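The packaging role of a named bundle can be sketched in Python as follows. The helper named_bundle is hypothetical; it covers only the bundle keyword, the identifier, the expressions and the closing endBundle keyword, and leaves out namespace declarations, whose syntax is not shown in this section. The two-space indentation is a readability choice, not something the grammar requires.

```python
def named_bundle(identifier, expressions):
    """Wrap already-serialized expressions between 'bundle <id>' and 'endBundle'."""
    lines = [f"bundle {identifier}"]
    lines.extend(f"  {expression}" for expression in expressions)
    lines.append("endBundle")
    return "\n".join(lines)


example = named_bundle(
    "ex:author-view",
    [
        "agent(ex:Paolo, [ prov:type='prov:Person' ])",
        "agent(ex:Simon, [ prov:type='prov:Person' ])",
    ],
)
```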
[23] | alternateExpression | ::= | "alternateOf" "(" eIdentifier "," eIdentifier ")" |
The following table summarizes how each constituent of a PROV-DM Alternate maps to a non-terminal.
Alternate | Non-Terminal |
alternate1 | eIdentifier |
alternate2 | eIdentifier |
alternateOf(tr:WD-prov-dm-20111215, ex:alternate-20111215)
Here tr:WD-prov-dm-20111215 is an alternate of ex:alternate-20111215.
[24] | specializationExpression | ::= | "specializationOf" "(" eIdentifier "," eIdentifier ")" |
The following table summarizes how each constituent of a PROV-DM Specialization maps to a non-terminal.
Specialization | Non-Terminal |
specificEntity | eIdentifier |
generalEntity | eIdentifier |
specializationOf(tr:WD-prov-dm-20111215, tr:prov-dm)
Here tr:WD-prov-dm-20111215 is a specialization of tr:prov-dm.
[25] | contextualizationExpression | ::= | "contextualizationOf" "(" identifier "," identifier "," bIdentifier ")" |
The following table summarizes how each constituent of a PROV-DM Contextualization maps to a non-terminal.
Contextualization | Non-Terminal |
local | eIdentifier |
contextualized | eIdentifier |
context | bIdentifier |
contextualizationOf(ex:report1_as_in_b1, ex:report1, ex:b1)
Here ex:report1_as_in_b1 is an entity identifier, ex:report1 is an entity identifier, and ex:b1 is the identifier for a bundle.
[26] | insertionExpression | ::= | "derivedByInsertionFrom" "(" optionalIdentifier dIdentifier "," dIdentifier "," keyEntitySet optionalAttributeValuePairs ")" |
[27] | keyEntitySet | ::= | "{" "(" literal "," identifier ")" ( "," "(" literal "," identifier ")" )* "}" |
The following table summarizes how each constituent of a PROV-DM Insertion maps to a non-terminal.
Insertion | Non-Terminal |
id | optionalIdentifier |
after | cIdentifier |
before | cIdentifier |
key-entity-set | keyEntitySet |
attributes | optionalAttributeValuePairs |
derivedByInsertionFrom(id; c1, c, {("k1", v1), ("k2", v2)}, [])
Here id is the optional insertion identifier, c1 is the identifier for the collection after the insertion, c is the identifier for the collection before the insertion, {("k1", v1), ("k2", v2)} is the set of key-value pairs that have been inserted in c, and [] is the optional (empty) set of attributes.
The remaining examples show cases where some of the optionals are omitted.
derivedByInsertionFrom(c1, c, {("k1", v1), ("k2", v2)})
derivedByInsertionFrom(c1, c, {("k1", v1)})
derivedByInsertionFrom(c1, c, {("k1", v1), ("k2", v2)}, [])
[29] | removalExpression | ::= | "derivedByRemovalFrom" "(" optionalIdentifier dIdentifier "," dIdentifier "," keySet optionalAttributeValuePairs ")" |
[30] | keySet | ::= | "{" literal ( "," literal )* "}" |
The following table summarizes how each constituent of a PROV-DM Removal maps to a non-terminal.
Removal | Non-Terminal |
id | optionalIdentifier |
after | cIdentifier |
before | cIdentifier |
key-set | keySet |
attributes | optionalAttributeValuePairs |
derivedByRemovalFrom(id; c3, c, {"k1", "k3"}, [])
Here id is the optional removal identifier, c3 is the identifier for the collection after the removal, c is the identifier for the collection before the removal, {"k1", "k3"} is the set of keys that have been removed from c, and [] is the optional (empty) set of attributes.
The remaining examples show cases where some of the optionals are omitted.
derivedByRemovalFrom(c3, c1, {"k1", "k3"})
derivedByRemovalFrom(c3, c1, {"k1"})
derivedByRemovalFrom(c3, c1, {"k1", "k3"}, [])
[31] | membershipExpression | ::= | "memberOf" "(" optionalIdentifier dIdentifier "," keyEntitySet complete optionalAttributeValuePairs ")" |
[32] | complete | ::= | ( "," ( "true" | "false" | "-" ) )? |
[28] | entitySet | ::= | "{" (eIdentifier)* "}" |
The following table summarizes how each constituent of a PROV-DM Membership maps to a non-terminal.
Collection Membership | Non-Terminal |
id | optionalIdentifier |
collection | cIdentifier |
entity-set | entitySet |
complete | complete |
attributes | optionalAttributeValuePairs |
Dictionary Membership | Non-Terminal |
id | optionalIdentifier |
dictionary | dIdentifier |
key-entity-set | keyEntitySet |
complete | complete |
attributes | optionalAttributeValuePairs |
memberOf(mId, c, {e1, e2, e3}, []) // Collection membership
memberOf(mId, c, {("k4", v4), ("k5", v5)}, []) // Dictionary membership
Here mId is the optional membership identifier, c is the identifier for the collection whose membership is stated, {("k4", v4), ("k5", v5)} is the set of key-value pairs that are members of c, and [] is the optional (empty) set of attributes.
The remaining examples show cases for Dictionaries, where some of the optionals are omitted. Key-entity sets are replaced with Entity sets for the corresponding generic Collections examples.
memberOf(c3, {("k4", v4), ("k5", v5)})
memberOf(c3, {("k4", v4)})
memberOf(c3, {("k4", v4)}, false)
memberOf(c3, {("k4", v4)}, true)
memberOf(c3, {("k4", v4), ("k5", v5)}, [])
memberOf(c3, {("k4", v4), ("k5", v5)}, true, [])
[54] | namespaceDeclarations | ::= | ( defaultNamespaceDeclaration | namespaceDeclaration ) (namespaceDeclaration)* |
[55] | namespaceDeclaration | ::= | "prefix" QUALIFIED_NAME namespace |
[56] | defaultNamespaceDeclaration | ::= | "default" IRI_REF |
[63] | <IRI_REF > | ::= | "<" ( [^<>"{}|^`\] - [#0000- ] | UCHAR )* ">" |
In PROV-N, the following prefixes are reserved:
A PROV-N document must not redeclare prefixes prov and xsd.
The following example declares three namespaces, one default, and two with explicit prefixes ex1 and ex2.
bundle
  default <http://example.org/0/>
  prefix ex1 <http://example.org/1/>
  prefix ex2 <http://example.org/2/>
  ...
endBundle
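As an illustration of how such declarations might be collected, the following minimal sketch scans a piece of PROV-N text for `default` and `prefix` declarations. It is not a PROV-N parser: the function name `parse_namespace_declarations` is hypothetical, prefixes are assumed to be simple names without a colon, and the scan does not skip declarations that happen to appear inside string literals or comments. It rejects redeclaration of the reserved prefixes prov and xsd.

```python
import re

RESERVED_PREFIXES = {"prov", "xsd"}

# Either 'default <iri>' or 'prefix name <iri>'; the IRI is kept without brackets.
DECLARATION = re.compile(r"\b(?:(default)|prefix\s+([^\s<>:]+))\s*<([^<>\s]*)>")


def parse_namespace_declarations(text):
    """Return (default_namespace_or_None, {prefix: namespace_iri})."""
    default_ns = None
    prefixes = {}
    for match in DECLARATION.finditer(text):
        is_default, prefix, iri = match.groups()
        if is_default:
            default_ns = iri
        elif prefix in RESERVED_PREFIXES:
            raise ValueError(f"prefix {prefix!r} is reserved and must not be redeclared")
        else:
            prefixes[prefix] = iri
    return default_ns, prefixes


example = """bundle
  default <http://example.org/0/>
  prefix ex1 <http://example.org/1/>
  prefix ex2 <http://example.org/2/>
endBundle"""
print(parse_namespace_declarations(example))
```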
A qualified name is a name subject to namespace interpretation. It consists of a namespace, denoted by an optional prefix, and a local name. The PROV data model stipulates that a qualified name can be mapped into an IRI by concatenating the IRI associated with the prefix and the local part.
A qualified name's prefix is optional. If a prefix occurs in a qualified name, it refers to a namespace declared in a namespace declaration. In the absence of prefix, the qualified name refers to the default namespace.
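The mapping from a qualified name to an IRI can be sketched as a simple concatenation. The function name `expand_qualified_name` and its parameters are hypothetical, and the sketch assumes that the prefix is separated from the local part by the first colon.

```python
def expand_qualified_name(qname, prefixes, default_ns=None):
    """Map a qualified name to an IRI by concatenating namespace and local part."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        # No prefix: the whole name is a local part in the default namespace.
        if default_ns is None:
            raise ValueError(f"no default namespace declared for {qname!r}")
        return default_ns + qname
    if prefix not in prefixes:
        raise ValueError(f"undeclared prefix {prefix!r}")
    return prefixes[prefix] + local


namespaces = {"ex": "http://example.org/1/"}
print(expand_qualified_name("ex:a/b", namespaces, "http://example.org/2/"))
print(expand_qualified_name("b", namespaces, "http://example.org/2/"))
```

Whether the resulting string is a valid IRI is not checked here; that remains a separate requirement on the declared namespaces and local parts.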
A PROV qualified name QUALIFIED_NAME has a more permissive syntax than XML's QName [XML-NAMES] and SPARQL PrefixedName [RDF-SPARQL-QUERY]. It is a requirement that the concatenation of the namespace with the local part results in a valid IRI [IRI]. Given that ',' (comma), ';' (semi-colon), '=' (equal), '(' (left bracket), ')' (right bracket), '[' (left square bracket), ']' (right square bracket) are used by the PROV notation as delimiters, they are not allowed in local parts. Instead, they can be %-escaped or incorporated in the IRI denoted by a prefix.
Qualified names QUALIFIED_NAME consist of a prefix and a local part. Prefixes follow the production PN_PREFIX defined by SPARQL [RDF-SPARQL-QUERY]. Local parts have to be conformant with PN_LOCAL, which extends the original SPARQL PN_LOCAL definition by allowing further characters and %-escaped characters (see PN_CHARS_OTHERS).
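The %-escaping of delimiters in local parts can be sketched as follows. The helper name `escape_local_part` is hypothetical, and the sketch assumes that escapes use two uppercase hexadecimal digits as in ordinary IRI percent-encoding. It only handles the seven PROV-N delimiters listed above and does not touch any '%' already present.

```python
# The characters PROV-N uses as delimiters and therefore disallows in local parts.
DELIMITERS = ",;=()[]"


def escape_local_part(local):
    """Replace each PROV-N delimiter in a local part by its %-escaped form."""
    return "".join(f"%{ord(ch):02X}" if ch in DELIMITERS else ch for ch in local)


print(escape_local_part("f(x)"))  # f%28x%29
```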
[34] | eIdentifier | ::= | identifier |
[35] | aIdentifier | ::= | identifier |
[36] | agIdentifier | ::= | identifier |
[37] | gIdentifier | ::= | identifier |
[38] | uIdentifier | ::= | identifier |
[40] | cIdentifier | ::= | identifier |
[39] | dIdentifier | ::= | identifier |
[41] | bIdentifier | ::= | identifier |
[42] | eIdentifierOrMarker | ::= | ( eIdentifier | "-" ) |
[43] | aIdentifierOrMarker | ::= | ( aIdentifier | "-" ) |
[44] | agIdentifierOrMarker | ::= | ( agIdentifier | "-" ) |
[45] | gIdentifierOrMarker | ::= | ( gIdentifier | "-" ) |
[46] | uIdentifierOrMarker | ::= | ( uIdentifier | "-" ) |
[47] | identifier | ::= | QUALIFIED_NAME |
[58] | <QUALIFIED_NAME > | ::= | ( PN_PREFIX ":" )? PN_LOCAL |
[76] | <PN_LOCAL > | ::= | ( PN_CHARS_U | [0-9] | PN_CHARS_OTHERS ) ( ( PN_CHARS | "." | PN_CHARS_OTHERS )* ( PN_CHARS | PN_CHARS_OTHERS ) )? |
[77] | <PN_CHARS_OTHERS > | ::= | PERCENT |
[78] | <PERCENT > | ::= | "%" HEX HEX |
[79] | <HEX > | ::= | [0-9] |
Examples of articles on the BBC Web site seen as entities.
bundle
  prefix bbc <http://www.bbc.co.uk/>
  prefix bbcNews <http://www.bbc.co.uk/news/>
  entity(bbc:) // bbc site itself
  entity(bbc:news/) // bbc news
  entity(bbc:news/world-asia-17507976) // a given news article
  entity(bbcNews:) // an alternative way of referring to the bbc news site
endBundle
Examples of entities with declared and default namespace.
bundle
  default <http://example.org/2/>
  prefix ex <http://example.org/1/>
  entity(ex:a) // corresponds to IRI http://example.org/1/a
  entity(ex:a/) // corresponds to IRI http://example.org/1/a/
  entity(ex:a/b) // corresponds to IRI http://example.org/1/a/b
  entity(b) // corresponds to IRI http://example.org/2/b
  entity(ex:1234) // corresponds to IRI http://example.org/1/1234
  entity(4567) // corresponds to IRI http://example.org/2/4567
endBundle
Note: The productions for qualifiedName and prefix are conflicting. In the context of a namespaceDeclaration, a parser should give precedence to the production for prefix.
[48] | attribute | ::= | QUALIFIED_NAME |
The reserved attributes in the PROV namespace are the following.
[50] | literal | ::= | typedLiteral |
[51] | typedLiteral | ::= | STRING_LITERAL "%%" datatype |
[52] | datatype | ::= | QUALIFIED_NAME |
[53] | convenienceNotation | ::= | STRING_LITERAL |
[64] | <STRING_LITERAL > | ::= | '"' ( ( [^"\
] ) | ECHAR | UCHAR )* '"' |
[65] | <INT_LITERAL > | ::= | ("-")? (DIGIT)+ |
[66] | <QUALIFIED_NAME_LITERAL > | ::= | "'" QUALIFIED_NAME "'" |
In production datatype, the QUALIFIED_NAME is used to denote a PROV data type [PROV-DM].
The non-terminals STRING_LITERAL, INT_LITERAL, and QUALIFIED_NAME_LITERAL are syntactic sugar for quoted strings with datatype xsd:string, xsd:int, and prov:QUALIFIED_NAME respectively.
In particular, a Literal may be an IRI-typed string (with datatype xsd:anyURI); such IRI has no specific interpretation in the context of PROV.
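The correspondence between the convenience notations and typed literals can be sketched as a small rewriting function. The name `desugar_literal` is hypothetical, and the sketch assumes the token has already been isolated by a tokenizer. Integer tokens are tested first.

```python
import re


def desugar_literal(token):
    """Rewrite a convenience-notation token as an explicit typed literal."""
    if re.fullmatch(r"-?[0-9]+", token):
        # INT_LITERAL: optional minus sign followed by one or more digits.
        return f'"{token}" %% xsd:int'
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        # QUALIFIED_NAME_LITERAL: a qualified name in single quotes.
        return f'"{token[1:-1]}" %% prov:QUALIFIED_NAME'
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        # STRING_LITERAL: already a quoted string, only the datatype is added.
        return f"{token} %% xsd:string"
    raise ValueError(f"not a convenience notation: {token!r}")


print(desugar_literal("10"))
print(desugar_literal("'prov:Person'"))
print(desugar_literal('"abc"'))
```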
Permitted datatypes in literals:
xsd:decimal, xsd:double, xsd:dateTime, xsd:integer, xsd:float, xsd:nonNegativeInteger, xsd:string, rdf:XMLLiteral, xsd:nonPositiveInteger, xsd:normalizedString, xsd:positiveInteger, xsd:token, xsd:negativeInteger, xsd:language, xsd:long, xsd:Name, xsd:int, xsd:NCName, xsd:short, xsd:NMTOKEN, xsd:byte, xsd:boolean, xsd:unsignedLong, xsd:hexBinary, xsd:unsignedInt, xsd:base64Binary, xsd:unsignedShort, xsd:anyURI, xsd:unsignedByte, xsd:QName, prov:QUALIFIED_NAME
Note: The productions for QUALIFIED_NAME and INT_LITERAL are conflicting. In the context of a literal, a parser should give precedence to the production for INT_LITERAL.
The reserved type values in the PROV namespace are the following.
The agent ag is a person (type: prov:Person), whereas the entity pl is a plan (type: prov:Plan).
agent(ag, [ prov:type='prov:Person' ])
entity(pl, [ prov:type='prov:Plan' ])
Time instants are defined according to xsd:dateTime [XMLSCHEMA-2].
[49] | time | ::= | ISODATETIME |
[60] | <DIGIT > | ::= | [0-9] |
[61] | <ISODATETIME > | ::= | DIGIT DIGIT DIGIT DIGIT "-" DIGIT DIGIT "-" DIGIT DIGIT "T" DIGIT DIGIT ":" DIGIT DIGIT ":" DIGIT DIGIT ( "." DIGIT ( DIGIT (DIGIT)? )? )? ( "Z" | TIMEZONEOFFSET )? |
[62] | <TIMEZONEOFFSET > | ::= | ( "+" | "-" ) DIGIT DIGIT ":" DIGIT DIGIT |
The third argument in the following usage expression is a time instant, namely 4pm on 2011-11-16.
used(ex:act2, ar3:0111, 2011-11-16T16:00:00)
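The ISODATETIME and TIMEZONEOFFSET productions translate directly into a regular expression. The sketch below is a syntactic check only, with hypothetical names. It does not verify that the month, day or hour values are in range, which xsd:dateTime additionally requires.

```python
import re

# ( "+" | "-" ) DIGIT DIGIT ":" DIGIT DIGIT
TIMEZONEOFFSET = r"[+-][0-9]{2}:[0-9]{2}"

# Date, "T", time, optional fraction of one to three digits, optional zone.
ISODATETIME = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"
    r"(?:\.[0-9]{1,3})?"
    r"(?:Z|" + TIMEZONEOFFSET + r")?"
)


def is_isodatetime(text):
    """Return True if text matches the ISODATETIME production as a whole."""
    return ISODATETIME.fullmatch(text) is not None


print(is_isodatetime("2011-11-16T16:00:00"))  # True
print(is_isodatetime("2011-11-16 16:00:00"))  # False
```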
A toplevel bundle is a house-keeping construct of PROV-N capable of packaging up PROV-N expressions and namespace declarations. A toplevel bundle forms a self-contained package of provenance descriptions for the purpose of exchanging them. A toplevel bundle may be used to package up PROV-N expressions in response to a request for the provenance of something ([PROV-AQ]).
Given its status of house-keeping construct for the purpose of exchanging provenance expressions, a toplevel bundle is not defined as a PROV-N expression (production expression).
A toplevel bundle, written bundle decls exprs bundles endBundle in PROV-N, contains:
A toplevel bundle's text matches the bundle production.
[1] | bundle | ::= | "bundle" (namespaceDeclarations)? (expression)* (namedBundle)* "endBundle" |
The following bundle contains expressions related to the provenance of entity e2.
bundle
  default <http://anotherexample.org/>
  prefix ex <http://example.org/>
  entity(e2, [ prov:type="File", ex:path="/shared/crime.txt", ex:creator="Alice", ex:content="There was a lot of crime in London last month." ])
  activity(a1, 2011-11-16T16:05:00, -, [ prov:type="edit" ])
  wasGeneratedBy(e2, a1, -, [ ex:fct="save" ])
  wasAssociatedWith(a1, ag2, -, [ prov:role="author" ])
  agent(ag2, [ prov:type='prov:Person', ex:name="Bob" ])
endBundle
This bundle could, for instance, be returned as the result of a query to a provenance store for the provenance of entity e2 [PROV-AQ].
The media type of PROV-N is text/prov-n. The content encoding of PROV-N content is UTF-8.
See http://www.w3.org/2002/06/registering-mediatype for how to register an Internet Media Type for a W3C specification.
WG membership to be listed here.
2.5 Comments
Comments in PROV-N take two forms:
'//', outside an IRI_REF or STRING_LITERAL; such comments continue to the end of line (marked by characters U+000D or U+000A) or end of file if there is no end of line after the comment marker.
'/* ... */', outside an IRI_REF or STRING_LITERAL.
Comments are treated as white space.