W3C

Semantics of the PROV Data Model

W3C Editor's Draft 25 February 2013

This version:
http://dvcs.w3.org/hg/prov/raw-file/default/semantics/prov-sem.html
Latest published version:
http://www.w3.org/TR/prov-sem/
Latest editor's draft:
http://dvcs.w3.org/hg/prov/raw-file/default/semantics/prov-sem.html
Editor:
James Cheney, University of Edinburgh

Abstract

Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. PROV-DM is the conceptual data model that forms a basis for the W3C provenance (PROV) family of specifications.

This document presents a model-theoretic semantics for the PROV data model (called the reference semantics), viewing PROV-DM statements as atomic formulas in the sense of first-order logic, and viewing the constraints and inferences specified in PROV-CONSTRAINTS as a first-order theory. It is shown that the first-order theory is sound with respect to the reference semantics. This information may be useful to researchers or users of PROV to understand the intended meaning and use of PROV for modeling information about the actual history, derivation or evolution of Web resources. It may also be useful for development of additional constraints or inferences for reasoning about PROV or integration of PROV with other Semantic Web vocabularies. It is not proposed as a canonical or required semantics of PROV and does not place any constraints on use of PROV.

The PROV Document Overview describes the overall state of PROV, and should be read before other PROV documents.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

PROV Family of Documents

This document is part of the PROV family of documents, a set of documents defining various aspects that are necessary to achieve the vision of inter-operable interchange of provenance information in heterogeneous environments such as the Web. These documents are listed below. Please consult the [PROV-OVERVIEW] for a guide to reading these documents.

This document was published by the Provenance Working Group as an Editor's Draft. If you wish to make comments regarding this document, please send them to public-prov-comments@w3.org (subscribe, archives). All comments are welcome.

Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1. Introduction

Provenance is a record that describes the people, institutions, entities, and activities involved in producing, influencing, or delivering a piece of data or a thing. This document complements the PROV-DM specification [PROV-DM] that defines a data model for provenance on the Web, and the PROV-CONSTRAINTS specification [PROV-CONSTRAINTS] that specifies definitions, inferences, and constraints that can be used to reason about PROV documents, or determine their validity. This document provides a reference formal semantics of PROV, providing a formal counterpart to the informal descriptions and motivations given elsewhere in PROV specifications.

1.1 Purpose of this document

The PROV-DM and PROV-CONSTRAINTS give motivating examples that provide an intuition about the meaning of the constructs. For some concepts, such as use, start, end, generation/invalidation, and derivation, the meaning is either obvious or situation-dependent. However, during the development of PROV, the importance of additional concepts became evident, but the intuitive meaning or correct use of these concepts were not clear. For example, the alternateOf and specializationOf relations are used in PROV to relate different entities that present aspects of "the same thing". Over time the working group came to a consensus about these concepts and how they are to be used, but this understanding is based on abstract notions that are not explicit in PROV documents; instead, some of their properties are captured formally through certain constraints and inferences, while others are not captured in PROV specifications at all.

The purpose of this document is to present the working group's consensus view of the semantics of PROV, using tools from mathematical logic, principally model theory (though our use of these tools is lightweight). This information may be useful to users for understanding the intent behind certain features of PROV, to researchers investigating richer forms of reasoning over provenance, or to future efforts building upon PROV. It is intended as an exploration of a reference semantics for PROV, not a definitive specification of the only semantics of PROV.

Nevertheless, the reference semantics does have appealing properties. Specifically, it provides a declarative counterpart to the operational definition of validity taken in PROV-CONSTRAINTS. In the specification, validity is defined via a normalization process followed by constraint checking on the normal form. This approach was adopted to keep the specification closer to implementations, although other implementations are possible and allowed. In addition to providing a reference semantics, this document shows that the operational presentation of PROV validity checking is sound with respect to declarative presentation adopted here. This could help justify alternative approaches to validity checking.

1.2 Structure of this document

Note
Completeness direction above doesn't hold for current semantics, because attributes assumed to be single-valued; for example, entity(id,[a=1,a=2]) is valid but doesn't have a reference model.

1.3 Audience

This document assumes familiarity with [PROV-DM] and [PROV-CONSTRAINTS] and employs (a simplified form of) [PROV-N] notation. In particular it assumes familiarity with the concepts from logic, and the relationship between PROV statements and instances and first-order formulas and theories, respectively, presented in Section 2.5 of PROV-CONSTRAINTS.

This document may be useful to users of PROV who have a formal background and are interested in the rationale for some of the constructs of PROV; for resaerchers investigating extensions of PROV or alternative approaches to reasoning about PROV; or for future efforts on provenance standardization.

2. Basics

2.1 Identifiers

A lowercase symbol x,y,... on its own denotes an identifier. Identifiers may or may not be URIs. Identifiers are viewed as variables in logic (or blank nodes in RDF): just because we have two different identifiers x and y doesn't tell us that they denote different things, since we could discover that they are actually the same later. We write Identifiers for the set of identifiers of interest in a given situation (typically, the set of identifiers present in the PROV instance of interest).

2.2 Times and Intervals

We assume a linearly ordered set (Times,) of time instants. For convenience we assume the order is total or linear order, corresponding to a linear timeline; however, PROV does not assume that time is linear and events could be partially ordered and not necessarily reconciled to a single global clock.

We also consider a set Intervals of closed intervals of the form {tt1tt2}.

2.3 Attributes and Values

We assume a set Attributes of attribute labels and a set Values of possible values of attributes. To allow for the fact that some attributes can have undefined or multiple values, we sometimes use the set P(Value), that is, the set of sets of values.

2.4 Atomic Formulas

The following atomic formulas correspond to the statements of PROV-DM. We assume that definitions 1-4 of PROV-CONSTRAINTS have been applied in order to expand all optional parameters; thus, we use uniform notation r(id,a1,,an) instead of the semicolon notation r(id;a1,,an).

Each parameter is either an identifier, a constant (e.g. a time or other literal value in an attribute list), or a null symbol "-". Null symbols can only appear in the specified arguments in wasAssociatedWith and wasDerivedFrom, as shown in the grammar below.

atomic_formulaelement_formularelation_formulaauxiliary_formulaattrsty::=||::=||::=|||||||||||||||::=||||::=::=||||element_formularelation_formulaauxiliary_formulaentity(id,attrs)activity(id,st,et,attrs)agent(id,attrs)wasGeneratedBy(id,e,a,t,attrs)used(id,e,a,t,attrs)wasInvalidatedBy(id,e,a,t,attrs)wasStartedBy(id,a2,e,a1,attrs)wasEndedBy(id,a2,e,a1,attrs)wasAssociatedWith(id,ag,act,pl,attrs)wasAssociatedWith(id,ag,act,,attrs)wasAttributedTo(id,e,ag,attrs)actedOnBehalfOf(if,ag2,ag1,act,attrs)wasInformedBy(id,a2,a1,attrs)wasDerivedFrom(id,e2,e1,act,g,u,attrs)wasDerivedFrom(id,e2,e1,,,,attrs)wasInfluencedBy(id,x,y,attrs)alternateOf(e1,e2)specializationOf(e1,e2)hadMember(c,e)x=yxyxynotNull(x)typeOf(x,ty)[attr1=v1,,attrn=vn]entityactivityagentprov:Collectionprov:EmptyCollection

2.5 First-Order Formulas

We also consider the usual connectives and quantifiers of first-order logic [Logic].

ϕ::=||||||||atomic_formulaTrueFalse¬ ϕϕ1ϕ2ϕ1ϕ2ϕ1ϕ2x.ϕx.ϕ

3. Structures and Interpretations

3.1 Things

Note
TODO: Containment of things / collections? (for hadMember).

Things are things in the world. Each thing has a lifetime during which it exists and attributes whose values can change over time.

To model this, a structure W includes:

  1. a set Things of things
  2. a function lifetime:ThingsIntervals from objects to time intervals
  3. a function value:Things×Attributes×TimesP(Values)

The range of value is the set P(Values), indicating that value is essentially a multi-valued that returns a set of values (possibly empty). When value(x,a,t)=, we say that attribute a is undefined for x at time t.

Note that this description does not say what the structure of a Thing is, only how it may be described in terms of its time interval and attribute values. An object could just be a record of fixed attribute values; it could be a bear; it could be the Royal Society; it could be a transcendental number like π. All that matters from our point of view is that we know how to map the Thing to its time interval and attribute mapping.

It is possible for two Things to be indistinguishable by their attribute values and lifetime, but have different identity.

3.2 Objects

An Object is described by a time interval and attributes with unchanging values. Objects encompass entities, interactions, and activities. To model this, a structure includes:

  1. a set Objects
  2. a function lifetime:ObjectsIntervals from objects to time intervals
  3. a function value:Objects×AttributesP(Values)

Intuitively, lifetime(e) is the time interval during which object e exists. The set value(e,a) is the set of values of attribute a during the object's lifetime.

As with Things, the range of value is sets of values, making value effectively a multivalued function. It is also possible to have two different objects that are indistinguishable by their attributes and time intervals. Objects are not things, and the sets of Objects and Things are disjoint; however, certain objects, namely entities, are linked to things.

3.2.1 Entities

An entity is a kind of object that describes a time-slice of a thing, during which some of the thing's attributes are fixed. We assume:
  1. a set EntitiesObjects of entities, disjoint from Activities and Events below.
  2. a function thingOf:EntitiesThings that associates each Entity with a Thing, such that for each tlifetime(obj), and for each attribute a we have value(obj,a)value(thingOf(obj),a,t).
  3. lifetime(e)lifetime(t).

Although both entities and things can have undefined or multiple attribute values, their meaning is slightly different: for a thing, value(x,a,t)= means that the attribute a has no value at time t, whereas for an entity, value(x,a)= only means that the entity does not record a fixed value for a forthe associated thing. This does not imply that value(thingOf(e),a,t)= when tlifetime(e).

Furthermore, all of the attribute values of the entity must be present in the associated thing throughout the lifetime of the entity. For example, suppose value(thingOf(e),a,t) is {1} at some time in lifetime(e) and value(thingOf(e),a,t)={2} at some other time t. Then value(e,a) must be because there is no other set of values that is simultaneously contained in both {1} and {2}.

3.2.1.1 Plans

We identify a specific subset of the entities called plans:

A set PlansEntities of plans.

3.2.2 Actvities

An activity is an object that encompasses a set of events. We introduce

  1. A set ActivitiesObjects of activities.
  2. Activities are disjoint from Entities: EntitiesActivities=.

3.2.3 Agents

An agent is an object that can act, by controlling, starting, ending, or participating in activities. Agents can act on behalf of other agents. An agent can be an entity, an activity, or neither; an agent cannot be both entity and activity because the sets of entities and activities are disjoint. We introduce:

A set AgentsObjects of agents.

3.2.4 Interactions

We consider a set InteractionsObjects which are split into Events connecting entities and activities, Associations between agents and activities, Communications between pairs of activities, Delegations between pairs of agents, and Derivations that describe chains of generation and usage steps. (The first two sets may overlap.) Interactions are disjoint from entities, activities and agents.

  1. A set Interactions=EventsAssociationsCommunicationsDelegationsDerivationsObjects
  2. A function type:Interactions{start,end,usage,generation,invalidation,derivation,version,quotation,primarySource,attribution,delegation}.
  3. The sets Events, Associations, Communications, Delegations and Derivations are all disjoint.
  4. Interactions are disjoint from entities, agents and activities: Interactions(EntitiesActivitiesAgents)=
3.2.4.1 Events

An Event is an interaction whose lifetime is a single time instant, and relates an activity to an entity (which could be an agent). Events have types including usage, generation, invalidation, starting and ending. Events are instantaneous. We introduce:

  1. A set EventsInteractions of events, such that type(evt){start,end,generation,usage,invalidation} if and only of evtEvents.
  2. A function time:EventsTimes giving the time of each event; i.e. lifetime(evt)={time(t)}.
  3. The derived ordering on events given by evt1evt2time(evt1)time(evt2)
3.2.4.2 Associations

An Association is an interaction relating an agent to an activity. To model associations, we introduce:

A set AssociationsInteractions, such that type(assoc)=association if and only if assocAssociations.

Associations are used below in the ActsFor and AssociatedWith relations.

3.2.4.3 Communications
Note
TODO
3.2.4.4 Delegations
Note
TODO
3.2.4.5 Derivations

A Derivation is an interaction chaining one or more generation and use steps.

A set DerivationsInteractions, such that type(deriv){derivation,version,primarySource,quotation} if and only if derivDerivations.

See below for the associated derivation path and DerivedFrom relation.

3.3 Relations

Simple relations

The entities, interactions, and activities in a structure are related in the following ways:

  1. A relation UsedEvents×Entities saying when an event used an entity. An event can use at most one entity, and if (evt,e)Used then time(evt)lifetime(e) and type(g)=use must hold.
  2. A relation GeneratedEvents×Entities saying when an event generated an entity. An event can generate at most one entity, and if (evt,e)Generated then min(lifetime(e))=time(evt) and type(g)=generation must hold.
  3. A relation InvalidatedEvents×Entities saying when an event invalidated an entity. An event can invalidate at most one entity, and if (evt,e)Invalidated then min(lifetime(e))=time(evt) and type(g)=invalidation must hold.
  4. A relation EventActivityEvents×Activities associating activities with events, such that (act,evt)EventActivity implies time(evt)lifetime(act).
  5. A relation AssociatedWithAssociation×Agents×Activities×Plans indicating when an agent is associated with an activity, and giving the identity of the association relationship, and an optional plan.
  6. A relation ActsForDelegations×Agents×Agents×Activities indicating when one agent acts on behalf of another with respect to a given activity.
Note
TODO: Communication, start, end relations
Note
TODO: Specialization relation
Note
TODO: Explicit axioms concerning the relations

3.3.1 Derivation paths and DerivedFrom

Recall that above we introduced a subset of interactions called Derivations. These identify paths of the form

entngnactnunentn1...ent1g1act1u1ent0

where the enti are entities, acti are activities, gi are generations, and ui are usages.

Formally, we consider the (regular) language:

DerivationPaths=Entities(EventsActivitiesEventsEntities)+

with the constraints that for each derivation path:

  • for each substring entgact we have (g,ent)Generated and (g,act)EventActivities, and
  • for each substring actuent we have (u,ent)Used and (u,act)EventActivities.

We also consider a function derivedFrom:DerivationsDerivationPaths linking each derivation to its path.

The reason why we need paths and not just individual derivation steps is that imprecise wasDerivedFrom formulas can represent multiple derivation steps.

3.4 Putting it all together

A structure W is a structure containing all of the above described data. If we need to talk about the objects or relations of more than one structure then we may write W1.Objects, W1.Things, etc.; otherwise, to decrease notational clutter, when we consider a fixed structure then the names of the sets, relations and functions above refer to the components of that model.

3.5 Interpretations

We need to link identifiers to the objects they denote. We do this using a function which we shall call an interpretation. Thus, we consider interpretations as follows: An interpretation is a function ρ:IdentifiersObjects describing which object is the target of each identifier. The mapping from identifiers to objects may not change over time.

4. Semantics

In what follows, let W be a fixed structure with the associated sets and relations discussed in the previous section, and let I be an interpretation of identifiers as objects in W. The annotations [WF] refer to well-formedness constraints that correspond to typing constraints.

4.1 Satisfaction

Consider a formula ϕ, a structure W and an interpretation I. We define notation W,ρϕ which means that ϕ is satisfied in W,ρ. For atomic formulas, the definition of the satisfaction relation is given in the next few subsections. We give the standard definition of the semantics of the other formulas:

  1. W,ρTrue always holds.
  2. W,ρFalse never holds.
  3. W,ρ¬ϕ holds if and only if W,ρϕ does not hold.
  4. W,ρϕψ holds if and only if W,ρϕ and W,iψ.
  5. W,ρϕψ holds if either W,ρϕ or W,ρψ holds.
  6. W,ρϕψ holds if W,ρϕ implies W,ρψ.
  7. W,ρx.ϕ holds if there exists some objObjects such that W,ρ[x:=obj]ϕ.
  8. W,ρx.ϕ holds if there for every objObjects we have W,ρ[x:=obj]ϕ.

In the semantics above, note that the domain of quantification is the set of Objects; that is, quantifiers range over entities, activities, agents, or interactions (which are in turn further subdivided into types of interactions). Things and relations cannot be referenced directly by identifiers.

A PROV instance I consists of a set of statements, each of which can be translated to an atomic formula following the definitional rules in PROV-CONSTRAINTS, possibly by introducing fresh existential variables. Thus, we can view an instance I as a set of atomic formulas {ϕ1,,ϕn}, or equivalently a single formula x1,,xk. ϕ1ϕn, where x1,,xk are the existential variables of I.

4.2 Attribute matching

We say that an object obj matches attributes [attr1=val1,...] in structure W provided: for each attribute attri, we have valiW.value(obj,attri). This is sometimes abbreviated as: match(W,obj,attrs).

4.3 Semantics of Element Formulas

4.3.1 Entity

An entity formula is of the form entity(id,attrs) where id denotes an entity.

Entity formulas entity(id,attrs) can be interpreted as follows:

W,ρentity(id,attrs) holds if and only if:

  1. [WF] id denotes an entity ent=ρ(id)Entities
  2. the attributes match: match(W,ent,attrs).

For example, the following formulas all hold if x denotes an entity e such that value(e,a)={4,5},value(e,b)={6} hold:

 entity(x,[])
 entity(x,[a=5])
 entity(x,[a=4,a=5])
 entity(x,[a=4,b=6])

Note that PROV-CONSTRAINTS normalization will merge these formulas to a single one:

  entity(x,[a=4,a=5,b=6])

4.3.2 Activity

An activity formula is of the form activity(id,st,et,attrs) where id is a identifier referring to the activity, st is a start time and et is an end time, and attrs are the attributes of activity id.

W,ρactivity(id,st,et,attrs) holds if and only if:

  1. [WF] The identifier id maps to an activity act=ρ(id)Activities
  2. If st is specified then it is equal to the start time of the activity, that is: min(lifetime(id))=st
  3. If et is specified then it is equal to the end time of the activity, that is: max(lifetime(id))=et
  4. The attributes match: match(W,act,attrs).

4.3.3 Agent

An agent formula is of the form agent(id,attrs) where id denotes the agent and attrs describes additional attributes.

W,ρagent(id,attrs) holds if and only if:

  1. [WF] id denotes an agent ag=ρ(id)Agents
  2. The attributes match: match(W,ag,attrs).

4.4 Semantics of Relations

4.4.1 Generation

The generation formula is of the form wasGeneratedBy(id,e,a,t,attrs) where id is an event identifier, e is an entity identifier, a is an activity identifier, attrs is a set of attribute-value pairs, and t is a time.

W,ρwasGeneratedBy(id,e,a,t,attrs) holds if and only if:

  1. [WF] The identifier id denotes an event evt=ρ(id)Events.
  2. [WF] The identifier e denotes an entity ent=ρ(e)Entities.
  3. [WF] The identifier a denotes an activity act=ρ(a)Activities.
  4. The event evt is involved in act, that is, (evt,act)EventActivities.
  5. The type of evt is generation, i.e. type(evt)=generation.
  6. The event evt occurred at time t, i.e. time(evt)=t.
  7. The event evt generated ent, i.e. (evt,ent)Generated.
  8. The attribute values match: match(W,evt,attrs).

4.4.2 Use

The use formula is of the form used(id,a,e,t,attrs) where id denotes an event, a is an activity identifier, e is an object identifier, attrs is a set of attribute-value pairs, and t is an optional time.

W,ρused(id,a,e,t,attrs) holds if and only if:

  • [WF] The identifier id denotes an event evt=ρ(id)Events.
  • [WF] The identifier a denotes an activity act=ρ(id)Activities.
  • [WF] The identifier e denotes an entity ent=ρ(e)Entities.
  • The event evt is part of act, i.e. (evt,act)EventActivities.
  • The type of evt is use, i.e., type(evt)=use.
  • The event evt occurred at time t, i.e. time(evt)=t.
  • The event evt used obj, i.e. (evt,ent)Used.
  • The attribute values match: match(W,evt,attrs).
  • 4.4.3 Invalidation

    The invalidation formula is of the form wasInvalidatedBy(id,e,a,t,attrs) where id is an event identifier, e is an entity identifier, a is an activity identifier, attrs is a set of attribute-value pairs, and t is an optional time.

    An invaliation formula W,ρwasInvalidatedBy(id,e,a,t,attrs) holds if and only if:

    1. [WF] The identifier id denotes an event evt=ρ(id)Events.
    2. [WF] The identifier e denotes an entity ent=ρ(e)Entities.
    3. [WF] The identifier a denotes an activity act=ρ(a)Activities.
    4. The event evt is involved in act, that is, (evt,act)EventActivities.
    5. The type of evt is invalidation, i.e. type(evt)=invalidation.
    6. The event evt occurred at time t, i.e. time(evt)=t.
    7. The event evt invalidated ent, i.e. (evt,ent)Invalidated.
    8. The attribute values match: match(W,evt,attrs).

    4.4.4 Association

    An association formula has the form wasAssociatedWith(id,a,ag,pl,attrs).

    W,ρwasAssociatedWith(id,a,ag,pl,attrs) holds if and only if:

    1. [WF] assoc denotes an association assoc=ρ(id)Associations.
    2. [WF] a denotes an activity act=ρ(a)Activities.
    3. [WF] ag denotes an agent agent=ρ(ag)Agents.
    4. [WF] pl is either the placeholder or denotes a plan plan=ρ(pl)Plans.
    5. The association associates the agent with the activity and plan, i.e. (assoc,agent,act,plan)AssociatedWith.
    6. The attributes match: match(W,assoc,attrs).

    4.4.5 Start Formulas

    A start formula wasStartedBy(id,a2,e,a1,attrs) is interpreted as follows:

    W,ρwasStartedBy(id,a2,e,a1,attrs) holds if and only if:

    1. [WF] id denotes an event evt=ρ(id)Events.
    2. [WF] a2 denotes an activity act2=ρ(a2)Activities.
    3. [WF] e denotes an entity ent=ρ(e)Entities.
    4. [WF] a1 denotes an activity act1=ρ(a1)Activities.
    5. The event evt has type start, i.e. type(evt)=start.
    6. The event happened at the start of act2, that is, (evt,act2)EventsActivities, and min(lifetime(act2))=time(evt).
    7. TODO: The entity e was generated by act1 and started act2.
    8. The event happened during act1, that is, (evt,act1)EventsActivities.
    9. The attributes match: match(W,evt,attrs).

    4.4.6 End

    An activity end formula wasEndedBy(id,a2,e,a1,attrs) is interpreted as follows:

    W,ρwasEndedBy(id,a2,e,a1,attrs) holds if and only if:

    1. [WF] id denotes an event evt=ρ(id)Events.
    2. [WF] a2 denotes an activity act2=ρ(a2)Activities.
    3. [WF] e denotes an entity ent=ρ(e)Entities.
    4. [WF] a1 denotes an activity act1=ρ(a1)Activities.
    5. The event evt has type end, i.e. type(evt)=end.
    6. The event happened at the end of act2, that is, (evt,act2)EventsActivities, and max(lifetime(act2))=time(evt).
    7. TODO: The entity e was generated by act1 and ended act2.
    8. The event happened during act1, that is, $(evt,act_1) \in EventActivities.
    9. The attributes match: match(W,evt,attrs).

    4.4.7 Attribution

    An attribution formula wasAttributedTo(id,e,ag,attrs) is interpreted as follows:

    W,ρwasAttributedTo(id,e,ag,attrs) holds if and only if:

    1. [WF] id denotes an association assoc=ρ(id)Associations.
    2. [WF] e denotes an entity ent=ρ(e)Entities.
    3. [WF] ag denotes an agent agent=ρ(ag)Agents.
    4. The entity was attributed to the agent, i.e. (assoc,ent,agent)AttributedTo.
    5. The attributes match: match(W,evt,attrs).

    4.4.8 Communication

    Note
    TODO: Communication

    4.4.9 Responsibility

    The actedOnBehalfOf(id,ag2,ag1,act,attrs) relation is interpreted using the ActsFor relation as follows:

    W,ρactedOnBehalfOf(id,ag2,ag1,act,attrs) holds if and only if:

    1. [WF] id denotes an association deleg=ρ(id)Delegations.
    2. [WF] a denotes an activity act=ρ(a)Activities.
    3. [WF] ag1,ag2 denote agents agent1=ρ(ag1),agent2=ρ(ag2)Agents.
    4. The agent agent2 acts for the agent agent1 with respect to the activity act, i.e. (deleg,agent2,agent1,act)ActsFor.
    5. The attributes match: match(W,assoc,attrs).

    4.4.10 Derivation

    4.4.10.1 Precise

    A precise derivation formula has the form wasDerivedFrom(id,e2,e1,a,g,u,attrs).

    W,ρwasDerivedFrom(id,e2,e1,act,g,u,attrs) holds if and only if:

    1. [WF] id denotes a derivation deriv=ρ(id)Derivations.
    2. [WF] e1,e2 denote entities ent1=ρ(e1),ent2=ρ(e2)Entities .
    3. [WF] a denotes an activity act=ρ(a)Activities.
    4. [WF] g denotes a generation event gen=ρ(g)Events and type(ρ(g))=generation.
    5. [WF] u denotes a use event ρ(u)Events and type(ρ(u))=use.
    6. The derivation denotes a one-step derivation derivedFrom(deriv)=ent2genactuseent1.
    7. The attribute values match: match(W,deriv,attrs).
    4.4.10.2 Imprecise

    An imprecise derivation formula has the form wasDerivedFrom(id,e2,e1,,,,attrs).

    W,ρwasDerivedFrom(id,e2,e1,,,,attrs) holds if and only if:

    1. [WF] id denotes a derivation deriv=ρ(id)Derivations
    2. [WF] e1,e2 denote entities ent1=ρ(e1),ent2=ρ(e2)Entities
    3. derivedFrom(deriv)=ent2went1 for some w
    4. The attribute values match: match(W,deriv,attrs).

    4.4.11 Influence

    Note
    TODO: Define influence semantics.

    4.4.12 Specialization

    The specializationOf(e1,e2) relation indicates when one entity formula presents more specific aspects of another.

    Note

    TODO: The content of this definition may be moved into the structure W via an irreflexive/transitive specialization relation, since by itself this definition is not transitive.

    W,ρspecializationOf(e1,e2) holds if and only if:

    1. [WF] Both e1 and e2 are entity identifiers, denoting distinct entities ent1=ρ(e1)Entities and ent2=ρ(e2)Entities, where ent1ent2.
    2. The two Entities refer to the same Thing, that is, thingOf(ent1)=thingOf(ent2).
    3. The lifetime of obj1 is contained in that of ent2, i.e. lifetime(ent1)lifetime(ent2).
    4. For each attribute attr we have value(obj1,attr)value(obj2,attr).

    The second criterion says that the two Entities present aspects of the same Thing. Note that the third criterion allows obj1 and obj2 to have the same lifetime (or that of obj2 can be larger). The last criterion allows obj1 to have more defined attributes than obj2, but they must include the attributes defined by obj2.

    4.4.13 Alternate

    The alternateOf relation indicates when two entity formulas present (possibly different) aspects of the same thing. The two entities may or may not overlap in time.

    W,ρalternateOf(e1,e2) holds if and only if:

    1. [WF] Both e1 and e2 are entity identifiers, denoting ent1=ρ(e1) and ent2=ρ(e2).
    2. The two objects refer to the same underlying Thing: thingOf(ent1)=thingOf(ent2)

    4.4.14 Membership

    The hadMember relation relates a collection to an element of the collection.

    W,ρhadMember(c,e) holds if and only if:

    1. [WF] Both e1 and e2 are entity identifiers, denoting coll=ρ(c)Collections and ent=ρ(e)Entities.
    2. TODO
    Note

    Additional constraints needed above to refer to (not yet defined) collection structure of entities/things.

    4.5 Auxiliary formulas

    In this section, we define the semantics of additional formulas concerning ordering, null values, and typing. These are used in the logical versions of constraints.

    4.5.1 Equals

    As usual, an equality formula means that two expressions denote the same value. Identifiers always denote Objects.

    W,ρx=y holds if and only if ρ(x)=ρ(y).

    4.5.2 Precedes and Strictly Precedes

    The precedes relation xy holds between two events, one taking place before (or simultaneously with) another. Since the reference semantics assumes that times are linearly ordered and event times are mapped to a single time line, this amounts to comparing the event times. The semantics of strictly precedes (xy is similar, only x must take place strictly before y).

    1. W,ρxy holds if and only if ρ(x),ρ(y)Events and time(ρ(x))time(ρ(y)).
    2. W,ρxy holds if and only if ρ(x),ρ(y)Events and time(ρ(x))<time(ρ(y)).

    4.5.3 notNull

    The notNull(x) formula is used to specify that a value may not be the null value . The symbol always denotes the null value (i.e. ρ()=).

    W,ρnotNull(e) holds if and only if e.

    4.5.4 typeOf

    The typing formula typeOf(x,t) constrains the type of the value of x.

    1. W,ρtypeOf(e,entity) holds if and only if ρ(e)Entities.
    2. W,ρtypeOf(a,activity) holds if and only if ρ(a)Activities.
    3. W,ρtypeOf(ag,agent) holds if and only if ρ(ag)Agents.
    4. W,ρtypeOf(c,prov:Collection) holds if and only if TODO.
    5. W,ρtypeOf(c,prov:EmptyCollection) holds if and only if TODO.
    Note
    TODO Collections

    5. Inferences and Constraints

    In this section we restate all of the inferences and constraints of PROV-CONSTRAINTS in terms of first-order logic. For each, we give a proof sketch showing why the inference or constraint is sound for reasoning about the reference semantics. We exclude the definitional rules in PROV-CONSTRAINTS because they are only needed for expanding the abbreviated forms of PROV-N statements to the logical formulas used here.

    5.1 Inferences

    id,a2,a1,attrs. wasInformedBy(id,a2,a1,attrs)e,gen,t1,use,t2. wasGeneratedBy(gen,e,a1,t1,[])used(use,a2,e,t2,[])
    gen,e,a1,t1,attrs1,id2,a2,t2,attrs2. wasGeneratedBy(gen,e,a1,t1,attrs1)used(id2,a2,e,t2,attrs2)id. wasInformedBy(id,a2,a1,[])
    e,attrs. entity(e,attrs)gen,a1,t1,inv,a2,t2. wasGeneratedBy(gen,e,a1,t1,[])wasInvalidatedBy(inv,e,a2,t2,[])
    a,t1,t2,attrs. activity(a,t1,t2,attrs)start,e1,a1,end,a2,e2. wasStartedBy(start,a,e1,a1,t1,[])wasEndedBy(end,a,e2,a2,t2,[])
    id,a,e1,a1,t,attrs. wasStartedBy(id,a,e1,a1,t,attrs)gen,t1. wasGeneratedBy(gen,e1,a1,t1,[])
    id,a,e1,a1,t,attrs. wasEndedBy(id,a,e1,a1,t,attrs)gen,t1. wasGeneratedBy(gen,e1,a1,t1,[])

    In this inference, none of a,gen2, or use1 can be placeholders -.

    id,e2,e1,a,gen2,use1,attrs. notNull(a)notNull(gen2)notNull(use1)wasDerivedFrom(id,e2,e1,a,gen2,use1,attrs)t1,t2. used(use1,a,e1,t1,[])wasGeneratedBy(gen2,e2,a,t2,[])

    In this inference, any of a,gen2, or use1 can be placeholders -.

    id,e1,e2,a,g,u. wasDerivedFrom(id,e2,e1,a,g,u,[prov:type=prov:Revision]))alternateOf(e2,e1)
    att,e,ag,attrs. wasAttributedTo(att,e,ag,attrs)a,t,gen,assoc,pl. wasGeneratedBy(gen,e,a,t,[])wasAssociatedWith(assoc,a,ag,pl,[])
    id,ag1,ag2,a,attrs. actedOnBehalfOf(id,ag1,ag2,a,attrs)id1,pl1,id2,pl2. wasAssociatedWith(id1,a,ag1,pl1,[])wasAssociatedWith(id2,a,ag2,pl2,[])
    1. id,e,a,t,attrs. wasGeneratedBy(id,e,a,t,attrs)wasInfluencedBy(id,e,a,attrs)
    2. id,a,e,t,attrs. used(id,a,e,t,attrs)wasInfluencedBy(id,a,e,attrs)
    3. id,a2,a1,attrs. wasInformedBy(id,a2,a1,attrs)wasInfluencedBy(id,a2,a1,attrs)
    4. id,a2,e,a1,t,attrs. wasStartedBy(id,a2,e,a1,t,attrs)wasInfluencedBy(id,a2,e,attrs)
    5. id,a2,e,a1,t,attrs. wasEndedBy(id,a2,e,a1,t,attrs)wasInfluencedBy(id,a2,e,attrs)
    6. id,e,a,t,attrs. wasInvalidatedBy(id,e,a,t,attrs)wasInfluencedBy(id,e,a,attrs)
    7. id,e2,e1,a,g,u,attrs. wasDerivedFrom(id,e2,e1,a,g,u,attrs)wasInfluencedBy(id,e2,e1,attrs)
    8. In this rule, a,g, or u may be placeholders -.

      id,e,ag,attrs. wasAttributedTo(id,e,ag,attrs)wasInfluencedBy(id,e,ag,attrs)
    9. In this rule, pl may be a placeholder -.

      id,a,ag,pl,attrs. wasAssociatedWith(id,a,ag,pl,attrs)wasInfluencedBy(id,a,ag,attrs)
    10. id,ag2,ag1,a,attrs. actedOnBehalfOf(id,ag2,ag1,a,attrs)wasInfluencedBy(id,ag2,ag1,attrs)
    e. entity(e)alternateOf(e,e)

    Suppose ent=rho(e). Clearly eEntities and thingOf(e)=thingOf(e), so W,ρalternateOf(e,e).

    e1,e2,e3. alternateOf(e1,e2)alternateOf(e2,e3)alternateOf(e1,e3)

    Suppose ent1=rho(e1) and ent2=ρ(e2) and ent3=ρ(e3). Then by assumption ent1, ent2, and ent3 are in Entities and thingOf(e1)=thingOf(e2) and thingOf(e2)=thingOf(e3), so thingOf(e1)=thingOf(e3), as required to conclude W,ρalternateOf(e2,e1).

    e1,e2. alternateOf(e1,e2)alternateOf(e2,e1)

    Suppose ent1=rho(e1) and ent2=ρ(e2). Then by assumption both ent1 and ent2 are in Entities and thingOf(e1)=thingOf(e2), as required to conclude W,ρalternateOf(e2,e1).

    e1,e2,e3. specializationOf(e1,e2)specializationOf(e2,e3)specializationOf(e1,e3)

    Suppose the conditions for specialization hold of ent1 and ent2 and for ent2 and ent3, where ent1=ρ(e1) and ent2=ρ(e2) and ent3=ρ(e3). Then lifetime(e1)lifetime(e2)lifetime(e3). Moreover, value(obj2,attr)value(obj3,attr), and similarly value(obj1,attr)value(obj2,attr) so value(obj1,attr)value(obj3,attr). (TODO: How do we know e3e1? Need strict ordering on entities in semantics.)

    e1,e2. specializationOf(e1,e2)alternateOf(e1,e2)

    If ent1=ρ(e1) and ent2=ρ(e2) are specializations, then thingOf(ent1)=thingOf(ent2).

    e1,attrs,e2. entity(e1,attrs)specializationOf(e2,e1)entity(e2,attrs)

    Suppose ent1=ρ(e1) and ent2=ρ(e2). Suppose (att,v) is an attribute-value pair in attrs. Since entity(e1,attrs) holds, we know that vvalue(ent1,att). Thus vvalue(ent2,att) since value(ent2,att)value(ent1,att). Since this is the case for all attribute-value pairs in attrs, and since e2 obviously denotes an entity, we can conclude W,ρentity(e,attrs).

    5.2 Constraints

    5.2.1 Uniqueness constraints

    1. The identifier field id is a KEY for the entity(id,attrs) statement.
    2. The identifier field id is a KEY for the activity(id,t1,t2,attrs) statement.
    3. The identifier field id is a KEY for the agent(id,attrs) statement.
    1. The identifier field id is a KEY for the wasGeneratedBy(id,e,a,t,attrs) statement.
    2. The identifier field id is a KEY for the used(id,a,e,t,attrs) statement.
    3. The identifier field id is a KEY for the wasInformedBy(id,a2,a1,attrs) statement.
    4. The identifier field id is a KEY for the wasStartedBy(id,a2,e,a1,t,attrs) statement.
    5. The identifier field id is a KEY for the wasEndedBy(id,a2,e,a1,t,attrs) statement.
    6. The identifier field id is a KEY for the wasInvalidatedBy(id,e,a,t,attrs) statement.
    7. The identifier field id is a KEY for the wasDerivedFrom(id,e2,e1,a,g2,u1,attrs) statement.
    8. The identifier field id is a KEY for the wasAttributedTo(id,e,ag,attrs) statement.
    9. The identifier field id is a KEY for the wasAssociatedWith(id,a,ag,pl,attrs) statement.
    10. The identifier field id is a KEY for the actedOnBehalfOf(id,ag2,ag1,a,attrs) statement.
    11. The identifier field id is a KEY for the wasInfluencedBy(id,o2,o1,attrs) statement.
    gen1,gen2,e,a,t1,t2,attrs1,attrs2. wasGeneratedBy(gen1,e,a,t1,attrs1)wasGeneratedBy(gen2,e,a,t2,attrs2)gen1=gen2
    inv1,inv2,e,a,t1,t2,attrs1,attrs2. wasInvalidatedBy(inv1,e,a,t1,attrs1)wasInvalidatedBy(inv2,e,a,t2,attrs2)inv1=inv2
    start1,start2,a,e1,e2,a0,t1,t2,attrs1,attrs2. wasStartedBy(start1,a,e1,a0,t1,attrs1)wasStartedBy(start2,a,e2,a0,t2,attrs2)start1=start2
    end1,end2,a,e1,e2,a0,t1,t2,attrs1,attrs2. wasEndedBy(end1,a,e1,a0,t1,attrs1)wasEndedBy(end2,a,e2,a0,t2,attrs2)end1=end2
    start,a1,a2,t,t1,t2,e,attrs,attrs1. activity(a2,t1,t2,attrs)wasStartedBy(start,a2,e,a1,t,attrs1)t1=t
    end,a1,a2,t,t1,t2,e,attrs,attrs1. activity(a2,t1,t2,attrs)wasEndedBy(end,a2,e,a1,t,attrs1)t2=t

    5.2.2 Ordering constraints

    start,end,a,e1,e2,a1,a2,t1,t2,attrs1,attrs2. wasStartedBy(start,a,e1,a1,t1,attrs1)wasEndedBy(end,a,e2,a2,t2,attrs2)startend
    start1,start2,a,e1,e2,a1,a2,t1,t2,attrs1,attrs2. wasStartedBy(start1,a,e1,a1,t1,attrs1)wasStartedBy(start2,a,e2,a2,t2,attrs2)start1start2
    end1,end2,a,e1,e2,a1,a2,t1,t2,attrs1,attrs2. wasEndedBy(end1,a,e1,a1,t1,attrs1)wasEndedBy(end2,a,e2,a2,t2,attrs2)end1end2
    1. start,use,a,e1,e2,a1,t1,t2,attrs1,attrs2. wasStartedBy(start,a,e1,a1,t1,attrs1)used(use,a,e2,t2,attrs2)startuse
    2. use,end,a,e1,e2,a2,t1,t2,attrs1,attrs2. used(use,a,e1,t1,attrs1)wasEndedBy(end,a,e2,a2,t2,attrs2)useend
    1. start,gen,e1,e2,a,a1,t1,t2,attrs1,attrs2. wasStartedBy(start,a,e1,a1,t1,attrs1)wasGeneratedBy(gen,e2,a,t2,attrs2)startgen
    2. gen,end,e,e1,a,a1,t,t1,attrs,attrs1. wasGeneratedBy(gen,e,a,t,attrs)wasEndedBy(end,a,e1,a1,t1,attrs1)genend
    id,start,end,a1,a1,a2,a2,e1,e2,t1,t2,attrs,attrs1,attrs2. wasInformedBy(id,a2,a1,attrs)wasStartedBy(start,a1,e1,a1,t1,attrs1)wasEndedBy(end,a2,e2,a2,t2,attrs2)startend
    gen,inv,e,a1,a2,t1,t2,attrs1,attrs2. wasGeneratedBy(gen,e,a1,t1,attrs1)wasInvalidatedBy(inv,e,a2,t2,attrs2)geninv
    gen,use,e,a1,a2,t1,t2,attrs1,attrs2. wasGeneratedBy(gen,e,a1,t1,attrs1)used(use,a2,e,t2,attrs2)genuse
    use,inv,a1,a2,e,t1,t2,attrs1,attrs2. used(use,a1,e,t1,attrs1)wasInvalidatedBy(inv,e,a2,t2,attrs2)useinv
    gen1,gen2,e,a1,a2,t1,t2,attrs1,attrs2. wasGeneratedBy(gen1,e,a1,t1,attrs1)wasGeneratedBy(gen2,e,a2,t2,attrs2)gen1gen2
    inv1,inv2,e,a1,a2,t1,t2,attrs1,attrs2. wasInvalidatedBy(inv1,e,a1,t1,attrs1)wasInvalidatedBy(inv2,e,a2,t2,attrs2)inv1inv2

    In this constraint, a,gen2, or use1 must not be placeholders -.

    d,e1,e2,a,gen2,use1,attrs. notNull(a)notNull(gen2)notNull(use1)wasDerivedFrom(d,e2,e1,a,gen2,use1,attrs)use1gen2

    In this constraint, any of a,g, or u may be placeholders -.

    d,gen1,gen2,e1,e2,a,a1,a2,g,u,t1,t2,attrs,attrs1,attrs2. wasDerivedFrom(d,e2,e1,a,g,u,attrs)wasGeneratedBy(gen1,e1,a1,t1,attrs1)wasGeneratedBy(gen2,e2,a2,t2,attrs2)gen1gen2
    1. gen,start,e,a,a1,a2,t1,t2,attrs1,attrs2. wasGeneratedBy(gen,e,a1,t1,attrs1)wasStartedBy(start,a,e,a2,t2,attrs2)genstart
    2. start,inv,e,a,a1,a2,t1,t2,attrs1,attrs2. wasStartedBy(start,a,e,a1,t1,attrs1)wasInvalidatedBy(inv,e,a2,t2,attrs2)startinv
    1. gen,end,e,a,a1,a2,t1,t2,attrs1,attrs2. wasGeneratedBy(gen,e,a1,t1,attrs1)wasEndedBy(end,a,e,a2,t2,attrs2)genend
    2. end,inv,e,a,a1,a2,t1,t2,attrs1,attrs2. wasEndedBy(end,a,e,a1,t1,attrs1)wasInvalidatedBy(inv,e,a2,t2,attrs2)endinv
    gen1,gen2,e1,e2,a1,a2,t1,t2,attrs1,attrs2. specializationOf(e2,e1)wasGeneratedBy(gen1,e1,a1,t1,attrs1)wasGeneratedBy(gen2,e2,a2,t2,attrs2)gen1gen2
    inv1,inv2,e1,e2,a1,a2,t1,t2,attrs1,attrs2. specializationOf(e1,e2)wasInvalidatedBy(inv1,e1,a1,t1,attrs1)wasInvalidatedBy(inv2,e2,a2,t2,attrs2)inv1inv2

    In the following inferences, pl may be a placeholder -.

    1. assoc,start1,inv2,ag,e1,e2,a1,a2,t1,t2,attrs1,attrs2. wasAssociatedWith(assoc,a,ag,pl,attrs)wasStartedBy(start1,a,e1,a1,t1,attrs1)wasInvalidatedBy(inv2,ag,a2,t2,attrs2)start1inv2
    2. assoc,gen1,end2,ag,e1,e2,a1,a2,t1,t2,attrs1,attrs2. wasAssociatedWith(assoc,a,ag,pl,attrs)wasGeneratedBy(gen1,ag,a1,t1,attrs1)wasEndedBy(end2,a,e2,a2,t2,attrs2)gen1end2
    3. assoc,start1,end2,ag,e1,e2,a1,a2,t1,t2,attrs1,attrs2. wasAssociatedWith(assoc,a,ag,pl,attrs)wasStartedBy(start1,a,e1,a1,t1,attrs1)wasEndedBy(end2,ag,e2,a2,t2,attrs2)start1end2
    4. assoc,start1,end2,ag,e1,e2,a1,a2,t1,t2,attrs1,attrs2. wasAssociatedWith(assoc,a,ag,pl,attrs)wasStartedBy(start1,ag,e1,a1,t1,attrs1)wasEndedBy(end2,a,e2,a2,t2,attrs2)start1end2
    1. att,gen1,gen2,e,a1,a2,t1,t2,ag,attrs,attrs1,attrs2. wasAttributedTo(att,e,ag,attrs)wasGeneratedBy(gen1,ag,a1,t1,attrs1)wasGeneratedBy(gen2,e,a2,t2,attrs2)gen1gen2
    2. att,start1,gen2,e,e1,a1,a2,ag,t1,t2,attrs,attrs1,attrs2. wasAttributedTo(att,e,ag,attrs)wasStartedBy(start1,ag,e1,a1,t1,attrs1)wasGeneratedBy(gen2,e,a2,t2,attrs2)start1gen2
    1. del,gen1,inv2,ag1,ag2,a,a1,a2,t1,t2,attrs,attrs1,attrs2. actedOnBehalfOf(del,ag2,ag1,a,attrs)wasGeneratedBy(gen1,ag1,a1,t1,attrs1)wasInvalidatedBy(inv2,ag2,a2,t2,attrs2)gen1inv2
    2. del,start1,end2,ag1,ag2,a,a1,a2,e1,e2,t1,t2,attrs,attrs1,attrs2. actedOnBehalfOf(del,ag2,ag1,a,attrs)wasStartedBy(start1,ag1,e1,a1,t1,attrs1)wasEndedBy(end2,ag2,e2,a2,t2,attrs2)start1end2

    5.2.3 Typing constraints

    1. e,attrs. entity(e,attrs)typeOf(e,entity)
    2. ag,attrs. agent(ag,attrs)typeOf(ag,agent)
    3. a,t1,t2,attrs. activity(a,t1,t2,attrs)typeOf(a,activity)
    4. u,a,e,t,attrs. used(u,a,e,t,attrs)typeOf(a,activity)typeOf(e,entity)
    5. g,a,e,t,attrs. wasGeneratedBy(g,e,a,t,attrs)typeOf(a,activity)typeOf(e,entity)
    6. inf,a2,a1,t,attrs. wasInformedBy(inf,a2,a1,t,attrs)typeOf(a1,activity)typeOf(a2,activity)
    7. start,a2,e,a1,t,attrs. wasStartedBy(start,a2,e,a1,t,attrs)typeOf(a1,activity)typeOf(a2,activity)typeOf(e,entity)
    8. end,a2,e,a1,t,attrs. wasEndedBy(end,a2,e,a1,t,attrs)typeOf(a1,activity)typeOf(a2,activity)typeOf(e,entity)
    9. inv,a,e,t,attrs. wasInvalidatedBy(inv,e,a,t,attrs)typeOf(a,activity)typeOf(e,entity)
    10. id,e2,e1,a,g2,u1,attrs. notNull(a)notNull(g2)notNull(u1)wasDerivedFrom(id,e2,e1,a,g2,u1,attrs)typeOf(e2,entity)typeOf(e1,activity)typeOf(a,activity)
    11. id,e2,e1,attrs. wasDerivedFrom(id,e2,e1,,,,attrs)typeOf(e2,entity)typeOf(e1,activity)
    12. id,e,ag,attrs. wasAttributedTo(id,e,ag,attrs)typeOf(e,entity)typeOf(ag,agent)
    13. id,a,ag,pl,attrs. notNull(pl)wasAssociatedWith(id,a,ag,pl,attrs)typeOf(a,activity)typeOf(ag,agent)typeOf(pl,entity)
    14. id,a,ag,attrs. wasAssociatedWith(id,a,ag,,attrs)typeOf(a,activity)typeOf(ag,agent)
    15. id,ag2,ag1,a,attrs. actedOnBehalfOf(id,ag2,ag1,a,attrs)typeOf(ag2,agent)typeOf(ag1,agent)typeOf(a,activity)
    16. e2,e1. alternateOf(e2,e1)typeOf(e2,entity)typeOf(e1,entity)
    17. e2,e1. specializationOf(e2,e1)typeOf(e2,entity)typeOf(e1,entity)
    18. c,e. hadMember(c,e)typeOf(c,prov:Collection)typeOf(e,entity)
    19. c. entity(c,[prov:type=prov:emptyCollection]))typeOf(c,entity)typeOf(c,prov:Collection)typeOf(c,prov:EmptyCollection)

    Each typing constraint follows immediately from well-formedness criteria marked [WF] in the corresponding semantics for formulas.

    5.2.4 Impossibility constraints

    1. id,e1,e2,g,attrs. notNull(g)wasDerivedFrom(id,e2,e1,,g,,attrs)False
    2. id,e1,e2,u,attrs. notNull(u)wasDerivedFrom(id,e2,e1,,,u,attrs)False
    3. id,e1,e2,g,u,attrs. notNull(g)notNull(u)wasDerivedFrom(id,e2,e1,,g,u,attrs)False

    Each part follows from the fact that the semantics of wasDerivedFrom only allows formulas to hold when either all three of a,g,u are (denoting ) or none of them are.

    e. specializationOf(e,e)False

    This follows from the fact that in the semantics of specializationOf, the two entities denoted by the first and second arguments are required to be distinct.

    For each r and s{used,wasGeneratedBy,wasInvalidatedBy,wasStartedBy,wasEndedBy,wasInformedBy,wasAttributedTo,wasAssociatedWith,actedOnBehalfOf} such that r and s are different relation names, the following constraint holds:

    id,a1,,am,b1,,bn. r(id,a1,,am)s(id,b1,,bn)False

    This follows from the assumption that the different classes of interactions are disjoint sets, characterized by their types.

    For each p{entity,activity,agent} and each r{used,wasGeneratedBy,wasInvalidatedBy,wasStartedBy,wasEndedBy,wasInformedBy,wasAttributedTo,wasAssociatedWith,actedOnBehalfOf}, the following constraint holds:

    id,a1,,am,b1,,bn. p(id,a1,,am)r(id,b1,,bn)False

    This follows from the assumption that interactions are distinct from other objects (entities, activities or agents).

    id. typeOf(id,entity)typeOf(id,activity)False

    This follows from the assumption that entities and activities are disjoint.

    c,e. hasMember(c,e)typeOf(c,prov:EmptyCollection)False

    6. Soundness

    Above we have presented arguments for the soundness of the constraints and inferences with respect to the reference semantics. This implies the following:

    1. If I is an instance and WI and I is obtained from I by applying one of the PROV inferences, then WI.
    2. If I is an instance and WI then I has a normal form I and WI.
    3. If I is a normal form and WI then I satisfies all of the ordering, typing and impossibility constraints.
    4. If WI then I is valid.

    For part 1, the arguments are as in the previous section.

    For part 2, proceed by induction on a terminating sequence of inference or uniqueness constraint steps: if I is in normal form them we are done. If I is not in normal form then if an inference is applicable, then use part 1; if a uniqueness constraint is applicable, then from WI the uniqueness constraint cannot fail on I and WI.

    For part 3, the arguments are as in the previous section for each constraint.

    Finally, for part 4, suppose WI. Then WI where I is the normal form of I by part 2. By part 3, I satisfies all of the remaining constraints, so I is valid.

    A. Acknowledgements

    This document has been produced by the PROV Working Group, and its contents reflect extensive discussion within the Working Group as a whole.

    Thanks also to Robin Berjon for the ReSPec.js specification writing tool and to MathJax for their LaTeX-to-HTML conversion tools, both of which aided in the preparation of this document.

    Members of the PROV Working Group at the time of publication of this document were: Ilkay Altintas (Invited expert), Reza B'Far (Oracle Corporation), Khalid Belhajjame (University of Manchester), James Cheney (University of Edinburgh, School of Informatics), Sam Coppens (IBBT), David Corsar (University of Aberdeen, Computing Science), Stephen Cresswell (The National Archives), Tom De Nies (IBBT), Helena Deus (DERI Galway at the National University of Ireland, Galway, Ireland), Simon Dobson (Invited expert), Martin Doerr (Foundation for Research and Technology - Hellas(FORTH)), Kai Eckert (Invited expert), Jean-Pierre EVAIN (European Broadcasting Union, EBU-UER), James Frew (Invited expert), Irini Fundulaki (Foundation for Research and Technology - Hellas(FORTH)), Daniel Garijo (Universidad Politécnica de Madrid), Yolanda Gil (Invited expert), Ryan Golden (Oracle Corporation), Paul Groth (Vrije Universiteit), Olaf Hartig (Invited expert), David Hau (National Cancer Institute, NCI), Sandro Hawke (W3C/MIT), Jörn Hees (German Research Center for Artificial Intelligence (DFKI) Gmbh), Ivan Herman, (W3C/ERCIM), Ralph Hodgson (TopQuadrant), Hook Hua (Invited expert), Trung Dong Huynh (University of Southampton), Graham Klyne (University of Oxford), Michael Lang (Revelytix, Inc.), Timothy Lebo (Rensselaer Polytechnic Institute), James McCusker (Rensselaer Polytechnic Institute), Deborah McGuinness (Rensselaer Polytechnic Institute), Simon Miles (Invited expert), Paolo Missier (School of Computing Science, Newcastle university), Luc Moreau (University of Southampton), James Myers (Rensselaer Polytechnic Institute), Vinh Nguyen (Wright State University), Edoardo Pignotti (University of Aberdeen, Computing Science), Paulo da Silva Pinheiro (Rensselaer Polytechnic Institute), Carl Reed (Open Geospatial Consortium), Adam Retter (Invited Expert), Christine Runnegar (Invited expert), Satya Sahoo (Invited expert), David Schaengold (Revelytix, Inc.), Daniel Schutzer (FSTC, Financial Services Technology Consortium), Yogesh Simmhan (Invited expert), Stian Soiland-Reyes (University of Manchester), Eric Stephan (Pacific Northwest National Laboratory), Linda Stewart (The National Archives), Ed Summers (Library of Congress), Maria Theodoridou (Foundation for Research and Technology - Hellas(FORTH)), Ted Thibodeau (OpenLink Software Inc.), Curt Tilmes (National Aeronautics and Space Administration), Craig Trim (IBM Corporation), Stephan Zednik (Rensselaer Polytechnic Institute), Jun Zhao (University of Oxford), Yuting Zhao (University of Aberdeen, Computing Science).

    B. References

    B.1 Informative references

    [Logic]
    W. E. Johnson. Logic: Part III.1924. URL: http://www.ditext.com/johnson/intro-3.html
    [PROV-AQ]
    Graham Klyne; Paul Groth; eds. Provenance Access and Query. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-aq-20130312/
    [PROV-CONSTRAINTS]
    James Cheney; Paolo Missier; Luc Moreau; eds. Constraints of the PROV Data Model. 12 March 2013, W3C Proposed Recommendation. URL: http://www.w3.org/TR/2013/PR-prov-constraints-20130312/
    [PROV-DC]
    Daniel Garijo; Kai Eckert; eds. Dublin Core to PROV Mapping. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-dc-20130312/
    [PROV-DICTIONARY]
    Tom De Nies; Sam Coppens; eds. PROV Dictionary. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-dictionary-20130312/
    [PROV-DM]
    Luc Moreau; Paolo Missier; eds. PROV-DM: The PROV Data Model. 12 March 2013, W3C Proposed Recommendation. URL: http://www.w3.org/TR/2013/PR-prov-dm-20130312/
    Luc Moreau; Timothy Lebo; eds. Linking Across Provenance Bundles. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-links-20130312/
    [PROV-N]
    Luc Moreau; Paolo Missier; eds. PROV-N: The Provenance Notation. 12 March 2013, W3C Proposed Recommendation. URL: http://www.w3.org/TR/2013/PR-prov-n-20130312/
    [PROV-O]
    Timothy Lebo; Satya Sahoo; Deborah McGuinness; eds. PROV-O: The PROV Ontology. 12 March 2013, W3C Proposed Recommendation. URL: http://www.w3.org/TR/2013/PR-prov-o-20130312/
    [PROV-OVERVIEW]
    Paul Groth; Luc Moreau; eds. PROV-OVERVIEW: An Overview of the PROV Family of Documents. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-overview-20130312/
    [PROV-PRIMER]
    Yolanda Gil; Simon Miles; eds. PROV Model Primer. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-primer-20130312/
    [PROV-SEM]
    James Cheney; ed. Semantics of the PROV Data Model. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-sem-20130312.
    [PROV-XML]
    Hook Hua; Curt Tilmes; Stephan Zednik; eds. PROV-XML: The PROV XML Schema. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-xml-20130312/