The W3C PROV Provenance Model

 

Luc Moreau

Co-chair of W3C Provenance Working Group

 

Warning: everything in this presentation is a DRAFT.

 

Thanks to Paul Groth, Paolo Missier, James Cheney, and the entire W3C Provenance Working Group

http://dvcs.w3.org/hg/prov/raw-file/default/presentations/wais-2012-18-14/prov-dm/overview/index.html (latest)

Talk Outline

Working Group Charter

charter http://lists.w3.org/Archives/Public/public-prov-wg/

Participants

  • DERI Galway
  • European Broadcasting Union
  • FORTH
  • Financial Services Technology Consortium
  • DFKI
  • IBBT
  • Library of Congress
  • NASA
  • National Cancer Institute
  • Open Geospatial Consortium
  • OpenLink Software
  • Oracle
  • Pacific Northwest National Laboratory
  • Rensselaer Polytechnic Institute
  • Revelytix, Inc
  • Newcastle University
  • The National Archives
  • TopQuadrant
  • Universidad Politecnica de Madrid
  • University of Edinburgh
  • University of Manchester
  • University of Oxford
  • University of Southampton
  • VU University Amsterdam
  • Wright State University

A Definition of Provenance

Interchange

The idea that a single way of representing and collecting provenance could be adopted internally by all systems does not seem to be realistic today.

Instead, a pragmatic approach is to consider a core data model for provenance that allows domain and application specific representations of provenance to be translated into such a data model and exchanged between systems.

Heterogeneous systems can then export their provenance into such a core data model, and applications that need to make sense of provenance in heterogeneous systems can then import it, process it, and reason over it.

Thus, the vision is that different provenance-aware systems natively adopt their own model for representing their provenance, but a core provenance data model can be readily adopted as a provenance interchange model across such systems.
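
As a rough illustration of the interchange idea, the sketch below maps records from two invented systems (a mail client log and a version-control log) onto one shared list of core statements. The record layouts, field names, helper functions and the `ex:` prefix are all assumptions made for this example; none of them come from PROV-DM itself.

```python
# Illustrative only: two hypothetical native formats mapped onto core statements.

def from_mail_log(record):
    """Translate a hypothetical mail-client record into core statements."""
    msg = "ex:" + record["message_id"]
    send = "ex:send-" + record["message_id"]
    return [
        ("entity", msg),
        ("activity", send),
        ("wasGeneratedBy", msg, send),
    ]


def from_vcs_log(record):
    """Translate a hypothetical version-control record into core statements."""
    new = "ex:" + record["rev"]
    old = "ex:" + record["parent"]
    return [
        ("entity", new),
        ("entity", old),
        ("wasDerivedFrom", new, old),
    ]


def to_prov_n(statement):
    """Render one core statement in a PROV-N-like functional syntax."""
    name, *args = statement
    return f"{name}({', '.join(args)})"


# Each system exports into the shared model; a consumer can then import both.
core = from_mail_log({"message_id": "m42"}) + from_vcs_log({"rev": "r2", "parent": "r1"})
for statement in core:
    print(to_prov_n(statement))
```

Each system keeps its native representation; only the two translation functions need to know about it, and the consumer works on the shared statement list.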

Layered Model

layered model

Example

PROV-DM overview

Example (reports)

entity(tr:WD-prov-dm-20111018, [ prov:type='pr:RecsWD' ])
entity(tr:WD-prov-dm-20111215, [ prov:type='pr:RecsWD' ])

Example (activities)

activity(ex:pub1,[prov:type="publish"])
activity(ex:pub2,[prov:type="publish"])

Example (agent)

agent(w3:Consortium, [ prov:type="Organization" ])

Example (plan)

entity(pr:rec-advance, [ prov:type='prov:Plan' ])

Example (requests)

entity(ar1:0004, [ prov:type='trans:transreq' ])
entity(ar2:0141, [ prov:type='trans:pubreq' ])
entity(ar3:0111, [ prov:type='trans:pubreq' ])

Example (usage)

used(ex:pub1,ar1:0004)
used(ex:pub1,ar2:0141)
used(ex:pub2,ar3:0111)

Example (generation)

wasGeneratedBy(tr:WD-prov-dm-20111018, ex:pub1)
wasGeneratedBy(tr:WD-prov-dm-20111215, ex:pub2)

Example (derivation)

wasDerivedFrom(tr:WD-prov-dm-20111215,tr:WD-prov-dm-20111018)

Example (association)

wasAssociatedWith(ex:pub2, w3:Consortium, pr:rec-advance)
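
To show how the relations in this example fit together, the sketch below transcribes them as plain Python tuples and follows one step of each relation from a given entity. The tuple layout and the `objects_of` and `explain` helpers are assumptions made for illustration, and the plan argument of the association is left out for brevity.

```python
# Relations from the running example, as (relation, subject, object) tuples.
# The plan argument of wasAssociatedWith is omitted to keep the sketch small.
STATEMENTS = [
    ("used", "ex:pub1", "ar1:0004"),
    ("used", "ex:pub1", "ar2:0141"),
    ("used", "ex:pub2", "ar3:0111"),
    ("wasGeneratedBy", "tr:WD-prov-dm-20111018", "ex:pub1"),
    ("wasGeneratedBy", "tr:WD-prov-dm-20111215", "ex:pub2"),
    ("wasDerivedFrom", "tr:WD-prov-dm-20111215", "tr:WD-prov-dm-20111018"),
    ("wasAssociatedWith", "ex:pub2", "w3:Consortium"),
]


def objects_of(relation, subject):
    """Return every object linked to `subject` by `relation`."""
    return [o for (r, s, o) in STATEMENTS if r == relation and s == subject]


def explain(entity):
    """Summarise how an entity came to be, one step along each relation."""
    activities = objects_of("wasGeneratedBy", entity)
    return {
        "derived_from": objects_of("wasDerivedFrom", entity),
        "generated_by": activities,
        "inputs": [e for a in activities for e in objects_of("used", a)],
        "agents": [g for a in activities for g in objects_of("wasAssociatedWith", a)],
    }


summary = explain("tr:WD-prov-dm-20111215")
print(summary)
```

For the second working draft, the summary names the publication activity that generated it, the request that activity used, the agent associated with it, and the earlier draft it was derived from.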

PROV Data Model Components

PROV-DM components

PROV Data Model Structure

PROV-DM overview

Relations at a Glance

PROV-DM Types and Relations

Component 1: Entities and Activities

PROV-DM overview

Entity

Activity

Generation

Usage

Component 2: Agents and Responsibility

Agents-Responsibility

Agent

Attribution

Association

Responsibility

Component 3: Derivations

Derivations

Derivation

Component 4: Alternates

Alternates

Alternate

Specialization

Component 5: Collections

Collections

The PROV Family: The PROV Notation

layered model

PROV-N: The PROV Notation

Example in PROV-N


entity(tr:WD-prov-dm-20111018, [ prov:type="pr:RecsWD" %% xsd:QName ])
entity(tr:WD-prov-dm-20111215, [ prov:type="pr:RecsWD" %% xsd:QName ])
entity(pr:rec-advance,         [ prov:type="prov:Plan" %% xsd:QName ])


entity(ar1:0004, [ prov:type="http://www.w3.org/2005/08/01-transitions.html#transreq" %% xsd:anyURI ])
entity(ar2:0141, [ prov:type="http://www.w3.org/2005/08/01-transitions.html#pubreq" %% xsd:anyURI ])
entity(ar3:0111, [ prov:type="http://www.w3.org/2005/08/01-transitions.html#pubreq" %% xsd:anyURI ])


wasDerivedFrom(tr:WD-prov-dm-20111215,tr:WD-prov-dm-20111018)


activity(ex:pub1,,,[prov:type="publish"])
activity(ex:pub2,,,[prov:type="publish"])


wasGeneratedBy(tr:WD-prov-dm-20111018, ex:pub1)
wasGeneratedBy(tr:WD-prov-dm-20111215, ex:pub2)

used(ex:pub1,ar1:0004)
used(ex:pub1,ar2:0141)
used(ex:pub2,ar3:0111)

agent(w3:Consortium, [ prov:type="Organization" ])

wasAssociatedWith(ex:pub1, w3:Consortium  @ pr:rec-advance)
wasAssociatedWith(ex:pub2, w3:Consortium  @ pr:rec-advance)
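
The attribute-free relations in this listing are regular enough to be read with a single regular expression. The sketch below is deliberately limited: it recognises only statements of the form `name(id, id)` and skips everything else (attributes, the `,,,` placeholders, the `@` plan marker). It is not a conforming PROV-N parser, and the names `BINARY` and `parse_binary` are invented for this example.

```python
import re
from collections import defaultdict

# Matches only two-argument statements without attributes, e.g. used(a, b).
BINARY = re.compile(r"(\w+)\(\s*([^,\s()]+)\s*,\s*([^,\s()]+)\s*\)")


def parse_binary(line):
    """Return (relation, subject, object), or None if the line has another shape."""
    match = BINARY.fullmatch(line.strip())
    return match.groups() if match else None


text = """
wasDerivedFrom(tr:WD-prov-dm-20111215,tr:WD-prov-dm-20111018)
activity(ex:pub1,,,[prov:type="publish"])
wasGeneratedBy(tr:WD-prov-dm-20111018, ex:pub1)
wasGeneratedBy(tr:WD-prov-dm-20111215, ex:pub2)
used(ex:pub1,ar1:0004)
used(ex:pub1,ar2:0141)
used(ex:pub2,ar3:0111)
"""

# Group the recognised statements by relation name.
index = defaultdict(list)
for line in text.splitlines():
    parsed = parse_binary(line)
    if parsed:
        relation, subject, obj = parsed
        index[relation].append((subject, obj))

for relation, pairs in index.items():
    print(relation, pairs)
```

Lines the pattern does not recognise are silently skipped, which is acceptable for a quick look at the graph but not for validation.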

The PROV Family: PROV-O

layered model

PROV-O: An OWL2 Ontology for PROV

PROV-O overview

PROV-AQ: Provenance Access and Query

layered model

The PROV Family: PROV Constraints

layered model

Time

Events

Partial States

entity(tr:WD-prov-dm-20111215, [ prov:type="pr:RecsWD" %% xsd:QName ])
entity(tr:WD-prov-dm-20111215, [ prov:type="document", ex:version="2" ])

Constraints

constraints

Constraints (2)

constraints

Account

It is common for multiple provenance records to co-exist.

For instance, when emailing a file, there could be a provenance record kept by the mail client, and another by the mail server.

Given that multiple provenance records can co-exist, it is important to have details about their origin, who they are attributed to, how they were generated, etc. In other words, an important requirement is to be able to express the provenance of provenance.
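
One way to picture "provenance of provenance" for the email case is sketched below: each record is bundled under its own identifier, and that identifier is then described like any other entity. The `Account` class, the `ex:` identifiers and the statement strings are all hypothetical, and the use of an attribution relation for the bundle is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """A hypothetical named bundle of provenance statements."""
    identifier: str
    attributed_to: str
    statements: list = field(default_factory=list)


# Two co-existing records about the same emailed file (invented content).
client = Account("ex:acc-client", "ex:mail-client",
                 ["wasGeneratedBy(ex:file-copy, ex:send)"])
server = Account("ex:acc-server", "ex:mail-server",
                 ["used(ex:relay, ex:file-copy)"])


def provenance_of(account):
    """Describe the account itself: it is an entity attributed to an agent."""
    return [
        f"entity({account.identifier})",
        f"agent({account.attributed_to})",
        f"wasAttributedTo({account.identifier}, {account.attributed_to})",
    ]


for account in (client, server):
    for statement in provenance_of(account):
        print(statement)
```

Because each bundle has a name, a consumer can tell which system asserted which statements before deciding how much to trust them.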

Account

PROV-DM does not provide an actual mechanism for creating accounts, i.e. for bundling up provenance descriptions and naming them. Accounts MUST satisfy some properties:

Account Constraints

Account Constraints (2)

Conclusion: Specifications

prov-primer http://www.w3.org/TR/prov-primer/
prov-o http://www.w3.org/TR/prov-o/
prov-dm http://www.w3.org/TR/prov-dm/
prov-dm-constraints http://www.w3.org/TR/prov-dm-constraints/
prov-n http://www.w3.org/TR/prov-n/
prov-aq http://www.w3.org/TR/prov-aq/
prov-sem work in progress
prov-xml work in progress
best practice work in progress

prov-json Southampton contribution
prov-datalog Paolo Missier

Conclusion: Implementations

ProvToolbox https://github.com/lucmoreau/ProvToolbox
provpy https://github.com/trungdong/w3-prov/tree/master/provpy
CollabMap Trung Dong Huynh
AgentSwitch Trung Dong Huynh
PoN (Patina of Notes) Mike Jewell, Enrico Costanza
DEEP (estat ebook) Danius Michaelides, Huanjia Yang, Alex
