--- a/microdata-rdf/index.html Wed Oct 26 13:12:35 2011 -0700
+++ b/microdata-rdf/index.html Wed Oct 26 17:01:45 2011 -0700
@@ -124,9 +124,9 @@
// if there is a previously published draft, uncomment this and set its YYYY-MM-DD date
// and its maturity status
- //previousPublishDate: "2011-08-17",
- //previousMaturity: "ED",
- //previousDiffURI: "http://json-ld.org/spec/ED/20110817/index.html",
+ previousPublishDate: "2011-08-11",
+ previousMaturity: "ED",
+ previousDiffURI: "https://dvcs.w3.org/hg/htmldata/raw-file/24af1cde0da1/microdata-rdf/index.html",
//diffTool: "http://www.aptest.com/standards/htmldiff/htmldiff.pl",
// if there a publicly available Editor's Draft, this is the link
@@ -259,6 +259,87 @@
is an extension to HTML used to embed machine-readable data to HTML documents. This specification describes
transformation directly to RDF [[RDF-CONCEPTS]].
</p>
+
+<section>
+ <h2>Background</h2>
+ <p>Microdata is a way of expressing metadata in HTML documents using attributes. A previous version
+ of Microdata [[!MICRODATA]] included rules for generating RDF, but current Editor's Drafts have removed
+ the explicit transformation procedure. Microdata is now used as an API to access data from within
+ an HTML DOM and as a JSON serialization.</p>
+ <p>The original RDF transformation process created URIs for properties that are expressed as non-absolute
+ URIs. The algorithm was designed to create URIs which were distinct based on the relationship between
+ <aref>itemtype</aref> and <aref>itemprop</aref> contexts. This is required, as the Microdata data model
+ requires that properties maintain distinct semantic meanings in different contexts. However, this
+ form of URI generation differs from that typically used within RDF vocabularies, where
+ properties usually have a common meaning within a given vocabulary.</p>
+ <p>Microdata also specifies that items and values are ordered, which is not typically the case for RDF
+ vocabularies. In fact, unless the <code>rdfs:range</code> of a property is <code>rdf:List</code>, or is
+ unspecified, it may not be appropriate to generate an <tref>RDF Collection</tref>.</p>
+ <p>The Microdata JSON serialization does not retain type or language information that might be derived
+ from the HTML DOM. The RDF Transformation does retain both type and language information when it
+ is available.</p>
+ <p>This specification updates the original RDF transformation process and adds
+ vocabulary-specific rules that affect the generation of property URIs and value serializations.
+ This is facilitated by a registry that associates URIs with specific rules based on matching
+ <aref>itemtype</aref> values against registered URI prefixes to determine a vocabulary and
+ vocabulary-specific processing rules.</p>
+</section>
+
+<section>
+ <h2>Use Cases</h2>
+ <p>During the period of the Task Force, a number of use cases were put forth for the use of Microdata
+ in generating RDF:</p>
+ <ul>
+ <li>Semantic search engines such as <a href="http://sindice.com/">Sindice</a> use RDF as their backend data model.
+ They want to gather information expressed using microdata alongside information expressed in RDF-based formats
+ and make it available to others to use, as a service. In these cases, the ultimate consumer, who will need to
+ understand the vocabularies used within the microdata, is the program or person who pulls out data from Sindice.
+ Sindice needs to retain the distinctions in the original Microdata (e.g. ordering of items) and might not have
+ built-in knowledge about the vocabulary of interest to the ultimate consumer. In this case, the ultimate consumer
+ is likely to have to map/validate/handle errors in the data they get from Sindice.</li>
+ <li>A consumer such as <a href="http://openelectiondata.org">openelectiondata.org</a> wants to support
+ Microdata-based markup of their vocabulary as well as RDFa-based markup, both going into an RDF-based data store.
+ They want to use an off-the-shelf tool to extract the microdata. They want to configure the tool to give them the
+ RDF that is appropriate for their known vocabulary.</li>
+ <li>A browser plugin that captures data for the user uses an RDF model as its backend store.
+ Any time it encounters Microdata on a page, it wants to pull that Microdata into the store on the fly.</li>
+ <li><a href="http://purl.org/goodrelations/">GoodRelations</a> requires properties to be generated
+ in a flat namespace, and multiple values not to be placed within a container. Ideally, a processor would make use
+ of <code>rdfs:range</code> declarations at parse time so properly typed literals could be constructed. It also
+ requires that plain literals retain language information in scope on the HTML element, as it is common that
+ multiple values will be used to specify the same information in different languages.</li>
+ <li><a href="http://schema.org/">Schema.org</a> has an
+ <a href="http://schema.org/docs/extension.html">extension mechanism</a> to allow authors to express information
+ that is more detailed than the pre-defined types, properties and enumerations. Property URIs are all in the same
+ flat namespace as types, but authors can add more detail by appending a '/' and a more specific term to the
+ type or property. For example, schema.org defines a <em>musicGroupMember</em> property having a URI of
+ <code>http://schema.org/musicGroupMember</code>, and an author might express more detail through an ad-hoc
+ sub-property <em>musicGroupMember/leadVocalist</em>, having the URI
+ <code>http://schema.org/musicGroupMember/leadVocalist</code>.</li>
+ <li></li>
+ </ul>
+</section>
+
+<section>
+ <h2>Goals</h2>
+ <p>The purpose of this specification is to provide input to a future working group that can make decisions
+ about the need for a registry and the details of processing. Among the options investigated by
+ the Task Force are the following:</p>
+ <ul>
+ <li>Property URI generation using the original Microdata specification with a base URI and fragment
+ made up of the in-scope <aref>itemtype</aref> and <aref>itemprop</aref> values.</li>
+ <li>Vocabulary-based URI generation, where the vocabulary is determined from the in-scope itemtype,
+ either through an algorithmic modification of the type URI or by matching the URI against a registry.
+ The vocabulary URI is then used to generate property URIs in a namespace parallel to the type URI.</li>
+ <li>Type-based URI generation, where the URI of the in-scope <aref>itemtype</aref> forms the base of the property
+ URI, with the property added to the type URI as a fragment.</li>
+ <li>When there are multiple top-level items in a document, place items in an RDF Collection.
+ Alternatively, simply list the items as multiple values, or do not generate an
+ <code>http://www.w3.org/1999/xhtml/microdata#item</code> mapping at all.</li>
+ <li>When there are multiple values for an <aref>itemprop</aref>, place items in an RDF Collection.
+ Alternatively, do not use collections, use an alternative such as <code>rdf:Seq</code>, or place all values,
+ whether or not multiple, into some form of collection.</li>
+ </ul>
</section>
<section>
@@ -315,23 +396,43 @@
An attribute appropriate for use with the <code>meta</code> element for creating invisible properties.
</dd>
<dt><adef>data</adef></dt><dd>
- An attribute appropriate for use with the <code>object</code> element for creating URI References.
+ An attribute appropriate for use with the <code>object</code> element for creating <tref>URI
+ reference</tref>s.
</dd>
<dt><adef>datetime</adef></dt><dd>
An attribute appropriate for use with the <code>date</code> element for creating typed literals.
- <div class="issue">The <code>date</code> element will likely be replaced with something more general purpose.</div>
+ <div class="issue">
+ The <code>date</code> element will likely be replaced with something more general purpose.
+ </div>
</dd>
<dt><adef>href</adef></dt><dd>
- An attribute appropriate for use with <code>a</code>, <code>area</code> or <code>link</code> elements for creating URI References.
+ An attribute appropriate for use with <code>a</code>, <code>area</code> or <code>link</code> elements for
+ creating <tref>URI reference</tref>s.
</dd>
<dt><adef>src</adef></dt><dd>
- An attribute appropriate for use with <code>audio</code>, <code>embed</code>, <code>iframe</code>, <code>img</code>,
- <code>source</code>, <code>track</code>, or <code>video</code> elements for creating invisible properties.
+ An attribute appropriate for use with <code>audio</code>, <code>embed</code>, <code>iframe</code>,
+ <code>img</code>, <code>source</code>, <code>track</code>, or <code>video</code> elements for creating invisible
+ properties.
</dd>
</dl>
</section>
<section>
+ <h1>Vocabulary Registry</h1>
+ <p>In a perfect world, all processors would be able to generate the same output for a given input
+ without regard to the requirements of a particular vocabulary. However, Microdata does not
+ provide sufficient syntactic help in making these decisions.</p>
+ <section>
+ <h2>Property URI Generation</h2>
+ <p></p>
+ </section>
+ <section>
+ <h2>Item/Value Ordering</h2>
+ <p></p>
+ </section>
+</section>
+
+<section>
<h1>Algorithm</h1>
<p>
Transformation of Microdata to RDF makes use of general processing rules described in [[!MICRODATA]]