RDF Spaces and Datasets

This specification introduces the notion of RDF spaces (places to store RDF triples) and defines a set of mechanisms for expressing and manipulating information about them. Examples of RDF spaces include an HTML page with embedded RDFa or microdata, a file containing RDF/XML or Turtle data, and a SQL database viewable as RDF using R2RML. RDF spaces are a generalization of SPARQL's named graphs, providing a standard model with formal semantics for systems that manage multiple collections of RDF data.

Editor's Draft Status

Closing in on FPWD IMHO, but not there yet. The "@@@" flags mark the places where I'm pretty sure something is needed before FPWD.

This text might be refactored into the other RDF documents. The Use Cases and Example would probably end up in a WG Note.

Introduction

The Resource Description Framework (RDF) provides a simple declarative way to store and transmit information. It also provides a trivial but effective way to combine information from multiple sources: graph merging. This allows information from different people, different organizations, different units within an organization, different servers, different algorithms, etc., to be combined and used together, without any special processing or understanding of the relationships among the providers.
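As a rough illustration of graph merging, the following Python sketch treats a graph as a set of (subject, predicate, object) string triples and merges several graphs into one. The string encoding of terms, the `_:` prefix test for blank nodes, and the names `rename_blank` and `merge_graphs` are assumptions made for this example, not part of any RDF specification or library. Blank nodes are renamed per source so that a label reused in two sources does not accidentally denote the same node.

```python
from typing import Iterable, Set, Tuple

# Assumed encoding: every term is a string; blank nodes start with "_:".
Triple = Tuple[str, str, str]


def rename_blank(term: str, graph_index: int) -> str:
    """Give a blank node a label that is unique to its source graph."""
    if term.startswith("_:"):
        return f"_:g{graph_index}_{term[2:]}"
    return term


def merge_graphs(graphs: Iterable[Set[Triple]]) -> Set[Triple]:
    """Merge graphs: rename blank nodes apart, then take the set union."""
    merged: Set[Triple] = set()
    for index, graph in enumerate(graphs):
        for s, p, o in graph:
            merged.add((rename_blank(s, index), p, rename_blank(o, index)))
    return merged


NAME = "http://xmlns.com/foaf/0.1/name"
MBOX = "http://xmlns.com/foaf/0.1/mbox"

# Two hypothetical sources that both happen to use the blank node label "_:x".
source_a = {
    ("_:x", NAME, '"Alice"'),
    ("_:x", MBOX, "mailto:alice@example.org"),
}
source_b = {
    ("_:x", NAME, '"Bob"'),
    ("http://example.org/hq", NAME, '"HQ"'),
}

merged = merge_graphs([source_a, source_b])
```

After the merge, the two `_:x` nodes remain distinct, so Alice's mailbox is not attributed to Bob. No other knowledge about the providers is needed, which is the point made above.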

For some applications, the basic RDF merge operation is overly simplistic, as extra processing and an understanding of the relationships among the providers may be useful. This document specifies a way to conveniently handle information coming from multiple sources, by modeling each one as a separate space and using RDF to express information about these spaces. In addition to this important concept, we provide a pair of languages (extensions to existing RDF syntaxes) that can be used to store or transmit, in one document, the contents of multiple spaces as well as information about them.

This approach allows for a variety of use cases (immediately below) to be addressed in a straightforward manner, as shown in .

Use Cases

Each of these use cases is initially described in terms of the following scenario. Details of how each use case might be addressed using the technologies specified in this document are in .

The Example Foundation is a large organization with more than ten thousand employees and volunteers, spread out over five continents. It has divisions in 25 different countries, and those divisions have considerable autonomy; they are only loosely controlled by the parent organization (called "headquarters" or "HQ") in Geneva.

HQ wants to help the divisions work together better. It decides a first step is to provide a simple but complete directory of all the Example personnel. Until now, each division has maintained its own directory, using its own technology. HQ wants to gather them all together, building a federated phonebook. They want to be able to find someone's phone number, mailing address, and job title, knowing only their name or email address. Later, they hope to extend the system to allow finding people based on their areas of interest and expertise.

HQ understands that people will want access to the phonebook in many different computing environments and with different languages, social norms, and application styles. Users are going to want at least one Web-based user interface (UI), but they will also want mobile UIs for different platforms, desktop UIs for different platforms, and even the ability to look up information via text messaging. HQ does not have the resources to build all of these, so they intend to provide direct access to the data so that the divisions can build the UIs themselves as needed.

Each of the sections below, after the first, contains a new requirement, something additional that users in this scenario want the system to do. Each of these will motivate the features of the technologies specified in the rest of this document.

Baseline Solution (Just Triples)

As a starting point, HQ needs to gather data from each division and re-publish it, in one place, for use by the different UIs.

This is a general use case for RDF, with no specific need for using spaces or datasets. It simply involves divisions publishing RDF data on the web (with some common vocabulary and with access control), then HQ merging it and putting it on their website (with access control).
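A minimal sketch of this baseline in Python follows, assuming each division's published data has already been fetched and parsed into a set of string triples. The division names, the `http://example.org/` IRIs, the `jobTitle` property, and the function names are hypothetical; the `mbox` and `phone` properties are taken from the FOAF vocabulary as a stand-in for the "common vocabulary". People are identified by IRIs here, so a plain set union is enough. Fetching and access control are out of scope.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/vocab#"  # hypothetical shared vocabulary


def merge_divisions(division_data: Dict[str, Set[Triple]]) -> Set[Triple]:
    """HQ's merge step: the union of every division's triples."""
    merged: Set[Triple] = set()
    for triples in division_data.values():
        merged |= triples
    return merged


def lookup_by_email(graph: Set[Triple], email: str) -> Dict[str, List[Tuple[str, str]]]:
    """Return all (property, value) pairs for each person with this mailbox."""
    mbox = f"mailto:{email}"
    people = {s for s, p, o in graph if p == FOAF + "mbox" and o == mbox}
    return {
        person: sorted((p, o) for s, p, o in graph if s == person)
        for person in people
    }


# Hypothetical data as published by two divisions.
division_data = {
    "geneva": {
        ("http://example.org/people/alice", FOAF + "mbox", "mailto:alice@example.org"),
        ("http://example.org/people/alice", FOAF + "phone", "tel:+41-22-555-0100"),
    },
    "tokyo": {
        ("http://example.org/people/alice", EX + "jobTitle", '"Program Director"'),
        ("http://example.org/people/kenji", FOAF + "mbox", "mailto:kenji@example.org"),
    },
}

phonebook = merge_divisions(division_data)
result = lookup_by_email(phonebook, "alice@example.org")
```

In this sketch the lookup for Alice combines the phone number from one division with the job title from another. Once the union is taken, nothing in `phonebook` records which division supplied a given triple.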

For an example of how this baseline could be implemented, see