W3C

PROV Implementation Report

W3C Working Draft 12 March 2013

This version:
http://www.w3.org/TR/2013/WD-prov-implementations-20130312/
Latest published version:
http://www.w3.org/TR/prov-implementations/
Latest editor's draft:
http://dvcs.w3.org/hg/prov/raw-file/default/reports/prov-implementations.html
Previous version:
Editors:
Trung Dong Huynh, University of Southampton
Paul Groth, VU University Amsterdam
Stephan Zednik, Rensselaer Polytechnic Institute

Abstract

This document reports on implementations and usage of the four normative specifications ([PROV-DM], [PROV-N], [PROV-O], [PROV-CONSTRAINTS]) of the PROV Family of Documents [PROV-OVERVIEW]. In particular, its aim is to demonstrate that the features defined in PROV are implementable and interoperable. Features are defined as: the constructs specified in [PROV-DM] and their realisation in OWL (see [PROV-O]) and in the [PROV-N] syntax; and the constraints defined within [PROV-CONSTRAINTS]. Interoperability is defined through both the interchange of provenance information and the coverage of test cases.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

During the Candidate Recommendation period of PROV, implementation experience was reported. This document summarises those experiences. The version at http://www.w3.org/TR/2013/WD-prov-implementations-20130312/ is the version used for the transition to Proposed Recommendation. Please send comments to public-prov-comments@w3.org [archive].

This document was published by the Provenance Working Group as a First Public Working Draft. If you wish to make comments regarding this document, please send them to public-prov-comments@w3.org (subscribe, archives). All comments are welcome.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction

The goal of PROV is to enable interoperable interchange of provenance on the Web. We take two approaches to documenting the implementation and interoperability of PROV.

  1. For the data model [PROV-DM] and the two serializations defined by the Working Group, we document the existence of multiple implementations that support each of the constructs within the data model, and that there exist at least two implementations that are reported to exchange these constructs.
  2. We document that the PROV-Constraints specification is implementable by documenting the existence of at least two implementations that report support for all the defined constraints. To evaluate the coverage of implementations, the PROV-Constraints Test Cases are used as a point of reference. There are 280 test cases in total, which map to the constraints defined by the document.

PROV is useful not only for applications and programs but also for exposing provenance within datasets and as a foundation for other vocabularies. We document that usage as well.

Implementation evidence was gathered using four surveys.

This report summarizes the results of these surveys.

1.1 Meeting the Exit Criteria

At the start of the Candidate Recommendation phase, the Working Group defined a series of exit criteria. These exit criteria can be summarized as follows: for each feature defined by PROV, at least two implementations support the feature, and at least one interoperability pair can exchange that feature. Section 3.1 shows that a minimum of 4 implementations both produce and consume all constructs defined in PROV-DM. PROV-O is implemented by over 40 implementations and PROV-N is implemented by 7 implementations.

In terms of implementation pairs, Section 4 enumerates which pairs of implementations report exchanging provenance. Here, we meet the exit criteria in that each feature is exchanged by at least two implementations.

Finally, three validators have implemented all of the constraints defined in PROV-Constraints and passed the requisite test cases, thereby meeting the exit criteria. The Working Group recognizes that implementing the PROV-Constraints document requires substantial effort. Three radically different approaches were chosen to implement this specification (SPARQL, Java and Prolog), which speaks to its implementability.

For a systematic enumeration of how the exit criteria were met, please see http://www.w3.org/2011/prov/wiki/MeetingProvCRExitCriteria
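
The summary of the exit criteria given in this section can be read as a simple check over survey data. The following Python sketch is illustrative only: the function name `meets_exit_criteria`, the data layout and the sample entries are assumptions made for this example, and are not the Working Group's tooling or its survey results.

```python
from typing import Dict, List, Set, Tuple


def meets_exit_criteria(
    support: Dict[str, Set[str]],
    exchanges: List[Tuple[str, str, Set[str]]],
) -> Dict[str, bool]:
    """Return, per feature, whether both summarized exit criteria hold.

    support   maps a feature name to the implementations reporting support.
    exchanges lists (producer, consumer, features exchanged) triples.
    """
    result = {}
    for feature, implementations in support.items():
        # Criterion 1: at least two implementations support the feature.
        enough_support = len(implementations) >= 2
        # Criterion 2: at least one pair of distinct implementations
        # is reported to exchange the feature.
        has_pair = any(
            producer != consumer and feature in features
            for producer, consumer, features in exchanges
        )
        result[feature] = enough_support and has_pair
    return result


# Hypothetical sample data, not taken from the survey results.
sample_support = {
    "Entity": {"impl-a", "impl-b", "impl-c"},
    "Bundle": {"impl-a"},
}
sample_exchanges = [("impl-a", "impl-b", {"Entity"})]

report = meets_exit_criteria(sample_support, sample_exchanges)
print(report)
```

In this hypothetical data, "Entity" satisfies both criteria, while "Bundle" fails because only one implementation supports it and no pair exchanges it.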

2. List of Implementations

The following lists the reported implementations, the type of implementation, supported PROV encodings and the URL of the implementation.

Table 1: List of implementations reported to the PROV Working Group.
# Name Type PROV Encodings
1 StatJR eBook system Application PROV-O, PROV-JSON
2 PoN Application PROV-N
3 WebLab-PROV Application PROV-O, PROV-N
4 Taverna Application PROV-O
5 CollabMap Application PROV-JSON
6 WingsProvenanceExport Application PROV-O
7 ProvToolbox Framework / API, Service PROV-O, PROV-N, PROV-XML, PROV-JSON
8 Provenance for Earth Science Application, Service PROV-O
9 Provenance Environment (ProvEn) Services Application, Service PROV-O
10 Annotation Inference Framework Application, Framework / API PROV-O, PROV-N, PROV-XML, PROV-JSON
11 PROVoKing Framework / API PROV-O
12 Triplify Service PROV-O
13 Prov-gen Application PROV-N
14 OBIAMA (Ontology-Based Integrated Action Modelling Arena) Application PROV-O
15 Amalgame Application, Framework / API, Service PROV-O
16 D2R Server Service PROV-O
17 Provenance server Service PROV-N, PROV-JSON
18 agentSwitch Application PROV-N, PROV-JSON
19 Oracle Enterprise Transactions Controls Governor 8.6.4 Application PROV-O, PROV-XML
20 Pubby Service PROV-O
21 Semantic Proteomics Dashboard (SemPoD) Application PROV-O
22 DeFacto Application PROV-O
23 Quality Assessment Framework Framework / API PROV-O
24 Global Change Information System - Information Model and Semantic Application Prototypes Application PROV-O
25 OpenUp Prov Application PROV-O
26 APROVeD: Automatic Provenance Derivation Application PROV-N, PROV-JSON
27 Raw2LD Application PROV-O
28 PROV-N to Neo4J DB mapping Application PROV-N
29 Earth System Science Server Application PROV-XML, PROV-JSON
30 prov-api Framework / API PROV-O
31 Policy Reasoning Framework Framework / API PROV-O
32 Informed Rural Passenger Information Infrastructure Application PROV-O
33 PubFlow Provenance Archive Application, Framework / API PROV-O, PROV-XML
34 PROV Python library Framework / API PROV-N, PROV-JSON
35 csv2rdf4lod-automation Application PROV-O
36 recoprov Application PROV-O, PROV-N
37 DataFAQs Application PROV-O
38 provx2o Application PROV-O, PROV-XML
39 Hedgehog Application PROV-XML
40 QuerioCity research prototype Application, Framework / API, Service PROV-O
41 Human Computation ontology Vocabulary Extension PROV-O
42 tavernaprov Vocabulary Extension PROV-O
43 The Open Provenance Model for Workflows (OPMW) Vocabulary Extension PROV-O
44 roevo Vocabulary Extension PROV-O
45 wfprov Vocabulary Extension PROV-O
46 P-plan Vocabulary Extension PROV-O
47 Jun Zhao Vocabulary Extension PROV-O
48 Systems molecular biology provenance ontology (SysPro) Vocabulary Extension PROV-O, None
49 Yanfeng Shu Vocabulary Extension PROV-O
50 ISO_19115_Lineage Vocabulary Extension PROV-O, none
51 PAV Provenance, Authoring and Versioning Vocabulary Extension PROV-O
52 cProv Vocabulary Extension PROV-N, PROV-XML
53 PML 3.0 Vocabulary Extension PROV-O
54 Music Ontology to Media Value Chain Ontology and PROV-O Ontology Mapping Vocabulary Usage PROV-O
55 PROV-DM: the PROV data model Vocabulary Usage PROV-N
56 DBpedia Vocabulary Usage PROV-O
57 AERS-LD Vocabulary Usage PROV-O
58 TWC Healthdata Vocabulary Usage PROV-O
59 University of Southampton Open Data Vocabulary Usage PROV-XML
60 SIGNA Vocabulary Usage PROV-O
61 OECD Linked Data Vocabulary Usage PROV-O
62 prov-check Validator PROV-O
63 checker.pl Validator PROV-XML
64 ProvValidator Validator PROV-O, PROV-N, PROV-XML, PROV-JSON

3. Feature Coverage

3.1 PROV Usage

This section enumerates the PROV-DM terms [PROV-DM] that are consumed (Consume Icon), produced (Produce Icon), or both consumed and produced (Consume and Produce Icon) by a particular implementation.

Table 2: Coverage of PROV-DM terms in implementations of type Application, Framework / API, or Service.
PROV Component Term #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35 #36 #37 #38 #39 #40 Term
C1: Entities/Activities Entity 2 19 18 Entity
Activity 3 17 18 Activity
Generation 3 13 19 Generation
Usage 3 13 20 Usage
Communication 2 6 4 Communication
Start 2 10 6 Start
End 2 10 7 End
Invalidation 1 6 Invalidation
C2: Derivations Derivation 2 11 13 Derivation
Revision 2 7 3 Revision
Quotation 1 6 Quotation
Primary Source 2 7 1 Primary Source
C3: Agents Agent 2 17 18 Agent
Attribution 1 9 9 Attribution
Association 2 9 18 Association
Delegation 1 7 6 Delegation
Plan 3 6 11 Plan
Person 2 9 6 Person
Organization 1 9 2 Organization
SoftwareAgent 2 8 8 SoftwareAgent
Influence 1 5 3 Influence
C4: Bundles Bundle 1 8 6 Bundle
C5: Alternate Alternate 1 6 3 Alternate
Specialization 2 7 3 Specialization
C6: Collections Collection 6 2 Collection
EmptyCollection 4 1 EmptyCollection
Membership 7 1 Membership
Other Elements Identifier 1 7 6 Identifier
Attribute 2 7 8 Attribute
Label 1 7 5 Label
Location 1 5 3 Location
Role 2 5 2 Role
Type 1 8 8 Type
Value 1 7 7 Value

3.2 PROV Usage by Extension

Table 3: PROV Terms extended by Vocabularies.
PROV Component Term #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 Total
C1: Entities/Activities Entity 12
Activity 12
Generation 11
Usage 11
Communication 3
Start 4
End 5
Invalidation 2
C2: Derivations Derivation 5
Revision 3
Quotation 2
Primary Source 2
C3: Agents Agent 10
Attribution 7
Association 9
Delegation 3
Plan 9
Person 3
Organization 2
SoftwareAgent 5
Influence 6
C4: Bundles Bundle 3
C5: Alternate Alternate 2
Specialization 2
C6: Collections Collection 1
EmptyCollection 0
Membership 1
Other Elements Identifier 0
Attribute 2
Label 1
Location 5
Role 5
Type 1
Value 3

3.3 PROV Usage in Datasets

Table 4: PROV Terms used by Datasets and Vocabularies.
PROV Component Term #54 #55 #56 #57 #58 #59 #60 #61 Total
C1: Entities/Activities Entity 6
Activity 6
Generation 6
Usage 6
Communication 4
Start 3
End 3
Invalidation 1
C2: Derivations Derivation 6
Revision 2
Quotation 1
Primary Source 1
C3: Agents Agent 6
Attribution 5
Association 4
Delegation 3
Plan 3
Person 1
Organization 1
SoftwareAgent 2
Influence 1
C4: Bundles Bundle 1
C5: Alternate Alternate 2
Specialization 2
C6: Collections Collection 1
EmptyCollection 1
Membership 1
Other Elements Identifier 1
Attribute 1
Label 1
Location 2
Role 1
Type 1
Value 2

3.4 PROV Constraints Implementation

Table 5: PROV Constraints [PROV-CONSTRAINTS] implemented by validators or other software.
Constraint #62 #63 #64
Constraint 22 (key-object)      
Constraint 23 (key-properties)      
Constraint 24 (unique-generation)      
Constraint 25 (unique-invalidation)      
Constraint 26 (unique-wasStartedBy)      
Constraint 27 (unique-wasEndedBy)      
Constraint 28 (unique-startTime)      
Constraint 29 (unique-endTime)      
Constraint 30 (start-precedes-end)      
Constraint 31 (start-start-ordering)      
Constraint 32 (end-end-ordering)      
Constraint 33 (usage-within-activity)      
Constraint 34 (generation-within-activity)      
Constraint 35 (wasInformedBy-ordering)      
Constraint 36 (generation-precedes-invalidation)      
Constraint 37 (generation-precedes-usage)      
Constraint 38 (usage-precedes-invalidation)      
Constraint 39 (generation-generation-ordering)      
Constraint 40 (invalidation-invalidation-ordering)      
Constraint 41 (derivation-usage-generation-ordering)      
Constraint 42 (derivation-generation-generation-ordering)      
Constraint 43 (wasStartedBy-ordering)      
Constraint 44 (wasEndedBy-ordering)      
Constraint 45 (specialization-generation-ordering)      
Constraint 46 (specialization-invalidation-ordering)      
Constraint 47 (wasAssociatedWith-ordering)      
Constraint 48 (wasAttributedTo-ordering)      
Constraint 49 (actedOnBehalfOf-ordering)      
Constraint 50 (typing)      
Constraint 51 (impossible-unspecified-derivation-generation-use)      
Constraint 52 (impossible-specialization-reflexive)      
Constraint 53 (impossible-property-overlap)      
Constraint 54 (impossible-object-property-overlap)      
Constraint 55 (entity-activity-disjoint)      
Constraint 56 (membership-empty-collection)      
Note

The table above was produced from the results of running PROV-CONSTRAINTS test cases submitted by implementers.
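
To give a flavour of what a validator checks, the following Python sketch tests two of the simpler constraints listed in Table 5: Constraint 52 (impossible-specialization-reflexive) and Constraint 55 (entity-activity-disjoint). It is a minimal illustration, not the method of any validator listed in this report: the tuple-based statement format, the function name `check_constraints` and the sample identifiers are assumptions made for this example. A conforming validator also has to apply the inferences and the remaining constraints defined in [PROV-CONSTRAINTS], which this sketch does not attempt.

```python
from typing import Iterable, List, Tuple

# Assumed toy representation: ("entity", id), ("activity", id),
# or ("specializationOf", specific_id, general_id).
Statement = Tuple[str, ...]


def check_constraints(statements: Iterable[Statement]) -> List[str]:
    """Return a list of violation messages (empty if none were found)."""
    statements = list(statements)
    violations = []

    # Constraint 55: no identifier may be typed as both entity and activity.
    entities = {s[1] for s in statements if s[0] == "entity"}
    activities = {s[1] for s in statements if s[0] == "activity"}
    for identifier in sorted(entities & activities):
        violations.append(
            f"Constraint 55 (entity-activity-disjoint): {identifier}"
        )

    # Constraint 52: an entity must not be a specialization of itself.
    for s in statements:
        if s[0] == "specializationOf" and s[1] == s[2]:
            violations.append(
                f"Constraint 52 (impossible-specialization-reflexive): {s[1]}"
            )
    return violations


# Hypothetical sample documents.
valid_doc = [
    ("entity", "ex:report"),
    ("entity", "ex:report-v1"),
    ("activity", "ex:writing"),
    ("specializationOf", "ex:report-v1", "ex:report"),
]
invalid_doc = [
    ("entity", "ex:x"),
    ("activity", "ex:x"),
    ("specializationOf", "ex:x", "ex:x"),
]

print(check_constraints(valid_doc))
print(check_constraints(invalid_doc))
```

The first sample document yields no violations; the second is reported as violating both constraints.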

4. Implementations Exchanging Provenance

Table 6: Implementations exchanging PROV with the representation(s) exchanged in parentheses. An implementation that supports every feature defined by the representation is denoted with All.
Consumer \ Producer | ProvToolbox | PROVoKing | Provenance Server | APROVeD | PROV Python | ProvValidator | PROV-DM
Provenance Server | All (PROV-JSON) | - | - | - | - | All (PROV-JSON) | -
ProvValidator | - | All (PROV-O) | All (PROV-N) | Partial (PROV-N) | All (PROV-N) | - | All (PROV-N)
prov-check | All (PROV-O) | - | - | - | - | - | -

Summary:

Table 7: Features exchanged by PROV implementation pairs.
PROV-O pairs (producer/consumer): #11/#64, #7/#62
PROV-N pairs (producer/consumer): #17/#64, #26/#64, #55/#64
PROV Component Term PROV-O Total PROV-N Total
C1: Entities/Activities Entity 2 3
Activity 2 3
Generation 2 3
Usage 2 3
Communication 2 2
Start 2 2
End 2 2
Invalidation 2 2
C2: Derivations Derivation 2 3
Revision 2 2
Quotation 2 2
Primary Source 2 2
C3: Agents Agent 2 3
Attribution 2 2
Association 2 3
Delegation 2 2
Plan 2 2
Person 2 2
Organization 2 3
SoftwareAgent 2 2
Influence 2 2
C4: Bundles Bundle 2 3
C5: Alternate Alternate 2 2
Specialization 2 2
C6: Collections Collection 2 2
EmptyCollection 2 2
Membership 2 2
Other Elements Identifier 2 2
Attribute 2 2
Label 2 2
Location 2 2
Role 2 3
Type 2 3
Value 2 2
Note

Although the Provenance Server (#17) and the ProvValidator (#64) are from the same institution (the University of Southampton), it is worth noting that they were built from two independent code bases (one in Python and the other in Java).

A. Acknowledgements

We would like to thank the following who reported their PROV implementations to us: Ali Mufajjul, Amir Sezavar Keshavarz, Ashley Smith, Chris Baillie, Clément Caron, Daniel Garijo, Danius Michaelides, David Corsar, Edoardo Pignotti, Eric Stephan, Hook Hua, Irene Celino, Jacco van Ossenbruggen, James Cheney, James McCusker, Jens Lehmann, Jun Zhao, Kerry Taylor, Khalid Belhajjame, Landong Zuo, Luc Moreau, Luis M. Vilches-Blázquez, Michael Jewell, Mohamed Morsey, Olaf Hartig, Palma Raul, Paolo Missier, Paul Groth, Peer Brauer, Peter Slaughter, Reza B'Far, Rinke Hoekstra, Sara Magliacane, Sarven Capadisli, Satya Sahoo, Simon Miles, Spyros Kotoulas, Stephan Zednik, Stian Soiland-Reyes, Timothy Lebo, Tom De Nies, Trung Dong Huynh, Victor Rodriguez, Yanfeng Shu.

B. References

B.1 Informative references

[PROV-CONSTRAINTS]
James Cheney; Paolo Missier; Luc Moreau; eds. Constraints of the PROV Data Model. 12 March 2013, W3C Proposed Recommendation. URL: http://www.w3.org/TR/2013/PR-prov-constraints-20130312/
[PROV-DM]
Luc Moreau; Paolo Missier; eds. PROV-DM: The PROV Data Model. 12 March 2013, W3C Proposed Recommendation. URL: http://www.w3.org/TR/2013/PR-prov-dm-20130312/
[PROV-N]
Luc Moreau; Paolo Missier; eds. PROV-N: The Provenance Notation. 12 March 2013, W3C Proposed Recommendation. URL: http://www.w3.org/TR/2013/PR-prov-n-20130312/
[PROV-O]
Timothy Lebo; Satya Sahoo; Deborah McGuinness; eds. PROV-O: The PROV Ontology. 12 March 2013, W3C Proposed Recommendation. URL: http://www.w3.org/TR/2013/PR-prov-o-20130312/
[PROV-OVERVIEW]
Paul Groth; Luc Moreau; eds. PROV-OVERVIEW: An Overview of the PROV Family of Documents. 12 March 2013, Working Draft. URL: http://www.w3.org/TR/2013/WD-prov-overview-20130312/