This document describes a model for clients and servers to be able to efficiently retrieve large Linked Data Platform Resources representations by splitting up the responses into separate URL-addressable page resources.

Introduction

This specification provides a widely re-usable pattern to deal with large resources. Depending on the server’s capabilities, a GET request on a resource can be redirected to a subset of the resource (one page) that provides access to subsequent pages (see ).

For context and background, it could be useful to read Linked Data Platform Use Case and Requirements [[LDP-UCR]] and .

Terminology

Terminology is based on W3C's Architecture of the World Wide Web [[WEBARCH]], Hyper-text Transfer Protocol [[HTTP11]] and Linked Data Platform [[LDP]].

Paged resource
An LDP-RS whose representation may be too large to fit in a single HTTP response, for which an LDP server offers a sequence of single-page LDP-RSs. When the representations of the sequence's resources are combined by a client, the client has a (potentially incomplete or incoherent) copy of the paged resource's state. If a paged resource P is an LDP-RS and is broken into a sequence of pages (single-page resources) P1, P2, ..., Pn, the representation of each Pi contains a subset of the triples in P. LDP allows paging of resources other than LDP-RSs, but does not specify how clients combine their representations. See for additional details. For readers familiar with paged feeds [[RFC5005]], a paged resource is similar to a logical feed. Any resource could be considered to be a paged resource consisting of exactly one page, although there is no known advantage in doing so.

Single-page resource
One of a sequence of related LDP-RSs P1, P2, ..., Pn, each of which contains a subset of the state of another resource P. P is called the paged resource. For readers familiar with paged feeds [[RFC5005]], a single-page resource is similar to a feed document and the same coherency/completeness considerations apply. LDP provides no guarantees that the sequence is stable.

Note: the choice of terms was designed to help authors and readers clearly differentiate between the resource being paged, and the individual page resources, in cases where both are mentioned in close proximity.

first page link
A link to the first single-page resource of a paged resource P. Syntactically, an HTTP Link <P1>; rel='first' header [[!RFC5988]].

next page link
A link to the next single-page resource of a paged resource P. Syntactically, an HTTP Link <Pi>; rel='next' header [[!RFC5988]] where the target URI is Pi=2...n.

last page link
A link to the last single-page resource of a paged resource P. Syntactically, an HTTP Link <Pn>; rel='last' header [[!RFC5988]].

previous page link
A link to the previous single-page resource of a paged resource P. Syntactically, an HTTP Link <Pi>; rel='prev' header [[!RFC5988]] where the target URI is Pi=1...n-1.

Conventions Used in This Document

Sample resource representations are provided in text/turtle format [[turtle]].

Commonly used namespace prefixes:

	@prefix dcterms: <http://purl.org/dc/terms/>.
	@prefix foaf:     <http://xmlns.com/foaf/0.1/>.
	@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
	@prefix ldp:     <http://www.w3.org/ns/ldp#>.
	@prefix xsd:     <http://www.w3.org/2001/XMLSchema#>.

The status of the sections of Linked Data Platform Paging 1.0 (this document) is as follows:

TODO: Update this section list

A conforming LDP Paging client is a conforming LDP client [[!LDP]] that follows the rules defined by LDP.

A conforming LDP Paging server is a conforming LDP server [[!LDP]] that follows the rules defined by LDP.

Linked Data Platform Paging Clients

All of the following are conformance rules for LDP Paging Clients.

General

LDP Paging clients MUST be paging-aware in cases where LDP Paging servers initiate paging.

TODO: Confirm this client MUST for server-initiated paging

Linked Data Platform Resources

Introduction

Linked Data Platform Resources (LDPRs) are HTTP resources that conform to the simple patterns and conventions in this section. HTTP requests to access, modify, create or delete LDPRs are accepted and processed by LDP servers. Most LDPRs are domain-specific resources that contain data for an entity in some domain, which could be commercial, governmental, scientific, religious, or other.

Some of the rules defined in this document provide clarification and refinement of the base Linked Data rules [[LINKED-DATA]]; others address additional needs.

The following sections define the conformance rules for LDP servers when serving LDPRs. This document also explains how a server paginates an LDP-RS's representation if it gets too big. Companion non-normative documents describe additional guidelines for use when interacting with LDPRs.

Paging

Introduction

It sometimes happens that a resource is too large to reasonably transmit its representation in a single HTTP response. To address this problem, servers should support a technique called Paging. When a client retrieves such a resource, the server redirects the client to a "first page" resource, and includes in its response a link to the next part of the resource's state, all at URLs of the server's choosing. The triples in the representation of each page of an LDPR are typically a subset of the triples from the paged resource.

LDP servers may respond to requests for a resource by redirecting to the first page of the resource and, with that, returning a Link: <next-page-URL>; rel='next' header containing the URL for the next page. Clients inspect each response for the Link header to see whether additional pages are available; paging does not affect the choice of HTTP status code. Note that paging is lossy, as described in [[RFC5005]], and so (as stated there) clients should not make any assumptions about a set of pages being a complete or coherent snapshot of a resource's state.

Looking at an example resource representing Example Co.'s customer relationship data, identified by the URI http://example.org/customer-relations, we’ll split the response across two pages. The client retrieves http://example.org/customer-relations, and the server responds with status code 333 (Returning Related), a Location: http://example.org/customer-relations?page1 HTTP response header, and the following representation:

# The following is the representation of
#    http://example.org/customer-relations?page1
#    Requests on the URI will result in responses that include the following HTTP header
#       Link: <http://example.org/customer-relations?p=2>; rel='next'
#    This Link header is how clients discover the URI of the next page in sequence,
#    and that the resource supports paging.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
@prefix o: <http://example.org/ontology/>.
@base <http://example.org/customer-relations>.

<>
   a o:CustomerRelations;
   dcterms:title "The customer information for Example Co.";
   o:client <#JohnZSmith>, <#BettyASmith>, <#JoanRSmith>. 

<#JohnZSmith>
   a foaf:Person;
   o:status o:ActiveCustomer;
   foaf:name "John Z. Smith".
<#BettyASmith>
   a foaf:Person;
   o:status o:PreviousCustomer;
   foaf:name "Betty A. Smith".
<#JoanRSmith>
   a foaf:Person;
   o:status o:PotentialCustomer;
   foaf:name "Joan R. Smith".

Because the server includes a Link: <http://example.org/customer-relations?p=2>; rel='next' response header, and the status code is 3xx (333 (Returning Related) in this case), the client knows that more data exists and where to find it. The server determines the size of the pages using application-specific methods not defined within this specification. The next page link's target URI is also defined by the server and not this specification.

The following example is the result of retrieving the next page; the server responds with status code 200 (OK) and the following representation:

# The following is the representation of
#  http://example.org/customer-relations?p=2
#
#  There is no "next" Link in the server's response, so this is the final page.
#
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
@prefix o: <http://example.org/ontology/>.
@base <http://example.org/customer-relations>.

<>
   o:client <#GlenWSmith>, <#AlfredESmith>. 
 
<#GlenWSmith>
   a foaf:Person;
   o:status o:ActiveCustomer, o:GoldCustomer;
   foaf:name "Glen W. Smith".

<#AlfredESmith>
   a foaf:Person;
   o:status o:ActiveCustomer, o:BronzeCustomer;
   foaf:name "Alfred E. Smith".
 

In this example, there are only two customers provided in the final page. To indicate this is the last page, the server omits the Link: rel='next' header in its response.

As mentioned above, retrieving all the pages offered by a server gives no guarantee to a client that it knows the entire state of the server. For example, after the server constructs the first page representation, another actor could delete the client <#BettyASmith> from the server.
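The traversal described above can be sketched in Python as follows. This is a minimal illustration, not a reference client: the names parse_link_header, iter_pages, EXAMPLE_RESPONSES and example_fetch are hypothetical, the fetch function is injected so that no particular HTTP library is assumed, and the Link parser is deliberately simplified (it does not handle commas inside quoted parameter values). It accepts the single-quoted rel='next' form used in this document as well as the double-quoted form.

```python
import re
from typing import Callable, Dict, Iterator, Tuple

# One link-value: <target>; param; param ... (quoted commas are not handled).
_LINK_RE = re.compile(r"<([^>]*)>((?:\s*;\s*[^;,]*)*)")
# The rel parameter: double-quoted, single-quoted, or bare.
_REL_RE = re.compile(r"""\brel\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s;,]+))""")

Response = Tuple[Dict[str, str], str]  # (response headers, response body)


def parse_link_header(value: str) -> Dict[str, str]:
    """Map each link relation type in a Link header value to its target URI."""
    links: Dict[str, str] = {}
    for target, params in _LINK_RE.findall(value):
        match = _REL_RE.search(params)
        if match is None:
            continue
        rels = next(g for g in match.groups() if g is not None)
        for rel in rels.split():  # rel may list several relation types
            links.setdefault(rel.lower(), target)
    return links


def iter_pages(fetch: Callable[[str], Response], first_page_url: str,
               max_pages: int = 1000) -> Iterator[Tuple[str, str]]:
    """Yield (url, body) for each page, following next page links.

    Stops when a response carries no next page link, when a URL repeats,
    or after max_pages pages (a safety limit assumed for this sketch).
    """
    seen = set()
    url = first_page_url
    while url is not None and url not in seen and len(seen) < max_pages:
        seen.add(url)
        headers, body = fetch(url)
        yield url, body
        url = parse_link_header(headers.get("Link", "")).get("next")


# Stand-in for the two-page example above; bodies are abbreviated.
EXAMPLE_RESPONSES: Dict[str, Response] = {
    "http://example.org/customer-relations?page1": (
        {"Link": "<http://example.org/customer-relations?p=2>; rel='next'"},
        "# triples for JohnZSmith, BettyASmith, JoanRSmith",
    ),
    "http://example.org/customer-relations?p=2": (
        {},  # no next page link, so this is the final page
        "# triples for GlenWSmith, AlfredESmith",
    ),
}


def example_fetch(url: str) -> Response:
    """Hypothetical fetch function backed by the in-memory example."""
    return EXAMPLE_RESPONSES[url]
```

Calling list(iter_pages(example_fetch, "http://example.org/customer-relations?page1")) visits both pages in sequence. Even so, the combined result is not guaranteed to be a complete or coherent snapshot, for the reasons given above.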

HTTP GET

In addition to the requirements set forth in on HTTP GET, LDP servers that support paging must also follow the requirements in this section for all paged resources and their associated single-page resources.

LDP servers SHOULD allow clients to retrieve large LDP-RSs in pages. See for additional requirements associated with paged resources.

LDP servers MAY treat any resource (LDP-RS or not) as a paged resource. See for additional details.

LDP servers MAY add single-page resources to a paged resource's sequence over time, but SHOULD only add pages to the end of a sequence.

Non-normative note: As a result, clients retrieving any single-page resource several times can observe its place in the sequence change as the state of the paged resource changes. For example, the server of a nominally last page might provide a next page link when that page is retrieved again. Similar situations arise when the page sequence crosses server boundaries; server A might host the initial portion of a sequence that links to the last page server A is aware of, hosted on server B, and server B might extend the sequence of pages.

LDP servers MAY provide a first page link when responding to requests with any single-page resource as the Request-URI.

LDP servers MAY provide a last page link in responses to GET requests with any single-page resource as the Request-URI.

LDP servers MUST provide a next page link in responses to GET requests with any single-page resource other than the final page as the Request-URI. This is the mechanism by which clients can discover the URL of the next page.

LDP servers MUST NOT provide a next page link in responses to GET requests with the final single-page resource as the Request-URI. This is the mechanism by which clients can discover the end of the page sequence as currently known by the server.

LDP servers MAY provide a previous page link in responses to GET requests with any single-page resource other than the first page as the Request-URI. This is one mechanism by which clients can discover the URL of the previous page.

LDP servers MUST NOT provide a previous page link in responses to GET requests with the first single-page resource as the Request-URI. This is one mechanism by which clients can discover the beginning of the page sequence as currently known by the server.

LDP servers MUST provide an HTTP Link header whose target URI is http://www.w3.org/ns/ldp#Page and whose link relation type is type (that is, Link: <http://www.w3.org/ns/ldp#Page>; rel='type') [[!RFC5988]] in responses to GET requests with any single-page resource as the Request-URI. This is one mechanism by which clients know that the resource is one of a sequence of pages.
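As an illustration of how the rules in this section combine, the following Python sketch computes the Link header values for one single-page resource. The function name page_link_headers and the idea of holding the whole page sequence in a list are assumptions made for the example; a real server may not know the full sequence in advance. The sketch emits the optional first, last and previous page links whenever the rules permit them, and it writes the rel parameter with double quotes, the quoted form defined by [[RFC5988]].

```python
from typing import List, Sequence

LDP_PAGE_TYPE = "http://www.w3.org/ns/ldp#Page"


def page_link_headers(page_urls: Sequence[str], index: int) -> List[str]:
    """Return Link header values for the page at page_urls[index]."""
    if not 0 <= index < len(page_urls):
        raise IndexError("index is outside the page sequence")
    last = len(page_urls) - 1

    # MUST: identify the resource as one of a sequence of pages.
    links = [f'<{LDP_PAGE_TYPE}>; rel="type"']
    # MAY: first and last page links on any single-page resource.
    links.append(f'<{page_urls[0]}>; rel="first"')
    links.append(f'<{page_urls[last]}>; rel="last"')
    # MUST on every page except the final one; MUST NOT on the final page.
    if index < last:
        links.append(f'<{page_urls[index + 1]}>; rel="next"')
    # MAY on every page except the first one; MUST NOT on the first page.
    if index > 0:
        links.append(f'<{page_urls[index - 1]}>; rel="prev"')
    return links
```

For the two-page example earlier in this document, the first page would receive type, first, last and next links, and the second page would receive type, first, last and prev links.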

Feature At Risk

The LDP Working Group proposes incorporation of the features described in the next compliance clause.

  • A TAG discussion has started, whose goal is to reduce the need for two request-response round trips down to one when retrieving what turns out to be the first page of a paged resource, by defining a new HTTP response code in the 2xx or 3xx class that would allow a server to respond to GET request-uri requests with the representation of the first page (whose URI is first-page-uri, not request-uri) of a multi-page response.

  • For the purposes of drafting this section, we assume that the new code's value is 333 (Returning Related), an LDP extrapolation from TAG discussions, and its definition is given by Henry Thompson's strawman, with the substitution of 333 for 2xx. Note: nothing prevents servers or clients from using 303 See Other responses to requests for paged resources. The only significant difference between 303 and 333 responses is the extra round trip required for the client to retrieve the representation of the first page when using 303.

  • Once LDP-Paging is a Candidate Recommendation, the LDP WG will make an assessment based on the status at IETF, working with the W3C Director, to either use the newly defined response code as documented in this section or to revert to a classic 303 response pattern.

LDP servers SHOULD respond with HTTP status code 333 (Returning Related) to successful GET requests with any paged resource as the Request-URI, although any appropriate code MAY be used.
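The difference in round trips between the two response patterns can be sketched from the client side as follows. This is an illustration under stated assumptions only: 333 is the placeholder code assumed by this section rather than a registered status code, the function name get_first_page and the injected fetch callable are hypothetical, and the Location header is assumed to carry an absolute URI.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def get_first_page(fetch: Callable[[str], Response],
                   url: str) -> Tuple[str, str, int]:
    """Return (first_page_url, body, round_trips) for a GET on url."""
    status, headers, body = fetch(url)
    if status == 333:
        # Assumed semantics: the body already is the first page's
        # representation and Location gives that page's URI.
        return headers.get("Location", url), body, 1
    if status == 303:
        # Classic pattern: a second request retrieves the first page.
        location = headers["Location"]
        second_status, _, second_body = fetch(location)
        if second_status != 200:
            raise RuntimeError(f"unexpected status {second_status}")
        return location, second_body, 2
    if status == 200:
        # The server answered directly, with or without paging.
        return url, body, 1
    raise RuntimeError(f"unexpected status {status}")
```

In both redirect cases the client ends up holding the same first-page representation; only the number of requests differs.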

Linked Data Platform Containers

Introduction

Many HTTP applications and sites have organizing concepts that partition the overall space of resources into smaller containers. Blog posts are grouped into blogs, wiki pages are grouped into wikis, and products are grouped into catalogs. Each resource created in the application or site is created within an instance of one of these container-like entities, and users can list the existing artifacts within one. LDP Paging Containers answer some basic questions, which are:

  1. How is the order of the container entries expressed?

Ordering

There are many cases where an ordering of the members of the container is important. LDPC does not provide any particular support for server ordering of members in containers, because any client can order the members in any way it chooses based on the value of any available property of the members. In the example below, the value of the o:value predicate is present for each member, so the client can easily order the members according to the value of that property. In this way, LDPC avoids the use of RDF constructs like Seq and List for expressing order.

Order becomes more important for LDP servers when containers are paginated. If the server does not respect ordering when constructing pages, the client would be forced to retrieve all pages before sorting the members, which would defeat the purpose of pagination. In cases where ordering is important, an LDPC server exposes all the members on a page in the proper sort order relative to all members on the next and previous pages. When the sort is ascending, all the members on a current page have a sort order no lower than all members on the previous page and a sort order no higher than all the members on the next page; that is, it proceeds from low to high, but keep in mind that several consecutive pages might have members whose sort criteria are equal. When the sort is descending, the opposite order is used. Since more than one value may be used to sort members, the LDPC specification allows servers to assert the ordered list of sort criteria used to sort the members, using the ldp:containerSortCriteria relation. Each member of the ordered list exposes one ldp:containerSortCriterion, consisting of an ldp:containerSortOrder, an ldp:containerSortPredicate, and optionally an ldp:containerSortCollation.

Here is an example container described previously, with representation for ordering of the assets:

# The following is the ordered representation of
#   http://example.org/netWorth/nw1/assetContainer/

# @base <http://example.org/netWorth/nw1/assetContainer/>
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
@prefix o: <http://example.org/ontology/>.

<>
   a ldp:DirectContainer;
   dcterms:title "The assets of JohnZSmith";
   ldp:membershipResource <http://example.org/netWorth/nw1>;
   ldp:hasMemberRelation o:asset;
   ldp:insertedContentRelation ldp:MemberSubject.

<?firstPage>
   a ldp:Page;
   ldp:pageOf <>;
   ldp:containerSortCriteria (<#SortValueAscending>).

<#SortValueAscending>
   a ldp:ContainerSortCriterion;
   ldp:containerSortOrder ldp:Ascending;
   ldp:containerSortPredicate o:value.

<http://example.org/netWorth/nw1>
   a o:NetWorth;
   o:asset <a1>, <a3>, <a2>.

<a1>
   a o:Stock;
   o:value 100.00 .
<a2>
   a o:Cash;
   o:value 50.00 .
<a3>
   a o:RealEstateHolding;
   o:value 300000 .

As the addition of the ldp:containerSortCriteria predicate shows, the o:value predicate is used to order the page members in ascending order. It is up to the domain model and server to determine the appropriate predicate to indicate the resource’s order within a page, and up to the client receiving this representation to use that order in whatever way is appropriate, for example to sort the data prior to presentation on a user interface. Because containers are themselves LDPRs, the ordering approach does not change when a container has other containers as its members: properties from the domain model can be used as sort criteria in the same way.
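The cross-page ordering condition described in this section can be checked mechanically. The following Python sketch assumes that the client has already extracted the sort value (for example, the o:value of each member) for the members on each page; the function name pages_respect_order and the list-of-lists input shape are hypothetical.

```python
from typing import Sequence


def pages_respect_order(pages: Sequence[Sequence[float]],
                        descending: bool = False) -> bool:
    """Check the ordering of sort values between consecutive pages.

    pages[i] holds the sort values of the members on page i, in page
    sequence. Order within a page is not checked; equal values may
    appear on consecutive pages.
    """
    # Empty pages constrain nothing, so compare only non-empty neighbours.
    non_empty = [page for page in pages if page]
    for previous, current in zip(non_empty, non_empty[1:]):
        if descending:
            if min(previous) < max(current):
                return False
        elif max(previous) > min(current):
            return False
    return True
```

With the asset values from the example above, splitting the members as [[50.00, 100.00], [300000]] satisfies an ascending sort, whereas [[100.00, 300000], [50.00]] does not.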

General

The Linked Data Platform does not define how clients discover LDPCs.

Each Linked Data Platform Paging Container MUST also be a conforming Linked Data Platform Container.

(From [[!LDP]]) LDP servers SHOULD respect all of a client's LDP-defined hints, for example which subsets of LDP-defined state the client is interested in processing, to influence the set of triples returned in representations of an LDPC, particularly for large LDPCs [[!LDP]]. Non-normative note: the LDPC might be paged; paged resources provide no guarantee that all triples of a given subset, for example containment triples, are grouped together on one page or on a sequence of consecutive pages (see ).

HTTP GET

LDP servers MAY represent the members of a paged LDPC in a sequential order. If the server does so, it MUST specify the order using a triple whose subject is the page URI, whose predicate is ldp:containerSortCriteria, and whose object is an rdf:List of ldp:containerSortCriterion resources. The resulting order MUST be as defined by SPARQL SELECT’s ORDER BY clause [[!sparql11-query]]. Sorting criteria MUST be the same for all pages of a representation; if the criteria were allowed to vary, the ordering among members of a container across pages would be undefined. The first list entry provides the primary sorting criterion, any second entry provides a secondary criterion used to order members considered equal according to the primary criterion, and so on. See for an example.

LDPC page representations ordered using ldp:containerSortCriteria MUST contain, in every ldp:containerSortCriterion list entry, a triple whose subject is the sort criterion identifier, whose predicate is ldp:containerSortPredicate and whose object is the predicate whose value is used to order members between pages (the page-ordering values). The only literal data types whose behavior LDP constrains are those defined by SPARQL SELECT’s ORDER BY clause [[!sparql11-query]]. Other data types can be used, but LDP assigns no meaning to them and interoperability will be limited.

LDPC page representations ordered using ldp:containerSortCriteria MUST contain, in every ldp:containerSortCriterion list entry, a triple whose subject is the sort criterion identifier, whose predicate is ldp:containerSortOrder and whose object describes the order used. LDP defines two values, ldp:Ascending and ldp:Descending, for use as the object of this triple. Other values can be used, but LDP assigns no meaning to them and interoperability will be limited.

LDPC page representations ordered using ldp:containerSortCriteria MAY contain, in any ldp:containerSortCriterion list entry, a triple whose subject is the sort criterion identifier, whose predicate is ldp:containerSortCollation and whose object identifies the collation used. LDP defines no values for use as the object of this triple. While it is better for interoperability to use open standardized values, any value can be used. When the ldp:containerSortCollation triple is absent and the page-ordering values are strings or simple literals [[!sparql11-query]], the resulting order is as defined by SPARQL SELECT’s ORDER BY clause [[!sparql11-query]] using two-argument fn:compare, that is, it behaves as if http://www.w3.org/2005/xpath-functions/collation/codepoint was the specified collation. When the ldp:containerSortCollation triple is present and the page-ordering values are strings or simple literals [[!sparql11-query]], the resulting order is as defined by SPARQL SELECT’s ORDER BY clause [[!sparql11-query]] using three-argument fn:compare, that is, the specified collation. The ldp:containerSortCollation triple MUST be omitted for comparisons involving page-ordering values for which [[!sparql11-query]] does not use collations.
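The way primary and secondary criteria combine can be sketched in Python as follows. This is a simplified illustration and not an implementation of SPARQL ORDER BY: members are modeled as plain dictionaries keyed by predicate name, every member is assumed to carry a value for every sort predicate, values for one predicate are assumed to be mutually comparable, and no ldp:containerSortCollation is applied. Python compares strings by Unicode code point, which corresponds to the codepoint collation used when the collation triple is absent. The function name sort_members is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

ASCENDING = "http://www.w3.org/ns/ldp#Ascending"
DESCENDING = "http://www.w3.org/ns/ldp#Descending"

Member = Dict[str, object]
Criterion = Tuple[str, str]  # (sort predicate, sort order)


def sort_members(members: Sequence[Member],
                 criteria: Sequence[Criterion]) -> List[Member]:
    """Order members by a list of criteria, the primary criterion first."""
    ordered = list(members)
    # list.sort is stable, so applying the criteria from least to most
    # significant leaves ties under the primary criterion ordered by the
    # secondary criterion, and so on.
    for predicate, order in reversed(criteria):
        if order not in (ASCENDING, DESCENDING):
            raise ValueError(f"sort order with no LDP-defined meaning: {order}")
        ordered.sort(key=lambda member: member[predicate],
                     reverse=(order == DESCENDING))
    return ordered
```

A server could apply such an ordering before cutting the member list into pages, which yields the cross-page ordering that the ldp:containerSortCriteria triple asserts.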

Notable information from normative references

While readers, and especially implementers, of LDP are assumed to understand the information in its normative references, the working group has found that certain points are particularly important to understand. For those thoroughly familiar with the referenced specifications, these points might seem obvious, yet experience has shown that few non-experts find all of them obvious. This section enumerates these topics; it is simply re-stating (non-normatively) information locatable via the normative references.

Feed paging and archiving

Reference: [[RFC5005]]

An LDP client SHOULD NOT present paged resources as coherent or complete, or make assumptions to that effect [[RFC5005]].

HTTP Header Definitions

TBD

Security Considerations

As with any protocol that is implemented using HTTP, implementations should take advantage of the many security-related facilities associated with it and are not required to carry out LDP operations that conflict with a security policy that is in place. For example, when faced with an unauthenticated request to replace system-critical RDF statements in a graph through the PUT method, applications may consider responding with the 401 status code (Unauthorized), indicating that authentication is required. In cases where the authentication provided fails to meet the requirements of a particular access control policy, the 403 status code (Forbidden) can be sent back to the client to indicate this failure to meet the access control policy.

Acknowledgements

The following people have been instrumental in providing thoughts, feedback, reviews, content, criticism and input in the creation of this specification:

Alexandre Bertails, Andrei Sambra, Andy Seaborne, Antonis Loizou, Arnaud Le Hors, Ashok Malhota, Bart van Leeuwen, Cody Burleson, David Wood, Eric Prud'hommeaux, Erik Wilde, Henry Story, John Arwe, Kevin Page, Kingsley Idehen, Mark Baker, Martin P. Nally, Miel Vander Sande, Miguel Esteban Gutiérrez, Nandana Mihindukulasooriya, Olivier Berger, Pierre-Antoine Champin, Raúl García Castro, Reza B'Far, Richard Cyganiak, Roger Menday, Ruben Verborgh, Sandro Hawke, Serena Villata, Sergio Fernandez, Steve Battle, Steve Speicher, Ted Thibodeau, Tim Berners-Lee, Yves Lafon

Change History

The editors insert here a brief summary of changes, ordered with the most recent changes first, under a heading that names the public draft against which the changes were made.

February 18, 2014 Editor's Draft