Adding more rationale and description of use-cases and alternatives.
author Gregg Kellogg <gregg@kellogg-assoc.com>
Wed, 26 Oct 2011 17:01:45 -0700
changeset 9 b296d3ba308a
parent 8 a11c18d7c5f0
child 10 74bd1c88b77d
microdata-rdf/index.html
--- a/microdata-rdf/index.html	Wed Oct 26 13:12:35 2011 -0700
+++ b/microdata-rdf/index.html	Wed Oct 26 17:01:45 2011 -0700
@@ -124,9 +124,9 @@
 
           // if there is a previously published draft, uncomment this and set its YYYY-MM-DD date
           // and its maturity status
-          //previousPublishDate:  "2011-08-17",
-          //previousMaturity:     "ED",
-          //previousDiffURI:      "http://json-ld.org/spec/ED/20110817/index.html",
+          previousPublishDate:  "2011-08-11",
+          previousMaturity:     "ED",
+          previousDiffURI:      "https://dvcs.w3.org/hg/htmldata/raw-file/24af1cde0da1/microdata-rdf/index.html",
           //diffTool:             "http://www.aptest.com/standards/htmldiff/htmldiff.pl",
 
           // if there a publicly available Editor's Draft, this is the link
@@ -259,6 +259,87 @@
     is an extension to HTML used to embed machine-readable data to HTML documents. This specification describes
     transformation directly to RDF [[RDF-CONCEPTS]].
   </p>
+
+<section>
+  <h2>Background</h2>
+  <p>Microdata is a way of expressing metadata in HTML documents using attributes. A previous version
+    of Microdata [[!MICRODATA]] included rules for generating RDF, but current Editor's Drafts have removed
+    the explicit transformation procedure. Microdata is now used as an API to access data from within
+    an HTML DOM and as a JSON serialization.</p>
+  <p>The original RDF transformation process created URIs for properties that are expressed as non-absolute
+    URIs. The algorithm was designed to create URIs which were distinct based on the relationship between
+    <aref>itemtype</aref> and <aref>itemprop</aref> contexts. This is required, as the Microdata data model
+    requires that properties maintain distinct semantic meanings in different contexts. However, this
+    form of URI generation differs from that used within most RDF vocabularies, where
+    properties typically have a common meaning within a given vocabulary.</p>
+  <p>Microdata also specifies that items and values are ordered, which is not typically the case for RDF
+    vocabularies. In fact, unless a property has an <code>rdfs:range</code> of <code>rdf:List</code>, or its
+    range is unspecified, it may not be appropriate to generate an <tref>RDF Collection</tref>.</p>
+  <p>The Microdata JSON serialization does not retain type or language information that might be derived
+    from the HTML DOM. The RDF Transformation does retain both type and language information when it
+    is available.</p>
+  <p>This specification updates the original RDF transformation process and adds
+    vocabulary-specific rules that affect the generation of property URIs and value serializations.
+    This is facilitated by a registry that associates URI prefixes with specific rules: each
+    <aref>itemtype</aref> value is matched against the registered URI prefixes to determine a vocabulary and
+    its vocabulary-specific processing rules.</p>
+</section>
+
+<section>
+  <h2>Use Cases</h2>
+  <p>During the period of the task force, a number of use cases were put forth for the use of Microdata
+    in generating RDF:</p>
+  <ul>
+    <li>Semantic search engines such as <a href="http://sindice.com/">Sindice</a> use RDF as their backend data model.
+      They want to gather information expressed using microdata alongside information expressed in RDF-based formats
+      and make it available to others to use, as a service. In these cases, the ultimate consumer, who will need to
+      understand the vocabularies used within the microdata, is the program or person who pulls out data from Sindice.
+      Sindice needs to retain the distinctions in the original Microdata (e.g. ordering of items) and might not have
+      built-in knowledge about the vocabulary of interest to the ultimate consumer. In this case, the ultimate consumer
+      is likely to have to map/validate/handle errors in the data they get from Sindice.</li>
+    <li>A consumer such as <a href="http://openelectiondata.org">openelectiondata.org</a> wants to support
+      Microdata-based markup of their vocabulary as well as RDFa-based markup, both going into an RDF-based data store.
+      They want to use an off-the-shelf tool to extract the microdata. They want to configure the tool to give them the
+      RDF that is appropriate for their known vocabulary.</li>
+    <li>A browser plugin that captures data for the user uses an RDF model as its backend store.
+      Any time it encounters Microdata on a page, it wants to pull that Microdata into the store on the fly.</li>
+    <li><a href="http://purl.org/goodrelations/">GoodRelations</a> requires that properties be generated
+      in a flat namespace and that multiple values not be placed within a container. Ideally, a processor would make use
+      of <code>rdfs:range</code> declarations at parse time so properly typed literals could be constructed. It also
+      requires that plain literals retain language information in scope on the HTML element, as it is common that
+      multiple values will be used to specify the same information in different languages.</li>
+    <li><a href="http://schema.org/">Schema.org</a> has an 
+      <a href="http://schema.org/docs/extension.html">extension mechanism</a> to allow authors to express information
+      that is more detailed than the pre-defined types, properties and enumerations. Property URIs are all in the same
+      flat namespace as types, but authors can refine a type or property by appending a '/' followed by a
+      more specific name. For example, schema.org defines a <em>musicGroupMember</em> property having a URI of
+      <code>http://schema.org/musicGroupMember</code>, and an author might express more detail through an ad-hoc
+      sub-property <em>musicGroupMember/leadVocalist</em>, having the URI
+      <code>http://schema.org/musicGroupMember/leadVocalist</code>.</li>
+    <li></li>
+  </ul>
+</section>
+
+<section>
+  <h2>Goals</h2>
+  <p>The purpose of this specification is to provide input to a future working group that can make decisions
+    about the need for a registry and the details of processing. Among the options investigated by
+    the Task Force are the following:</p>
+  <ul>
+    <li>Property URI generation using the original Microdata specification with a base URI and fragment
+      made up of the in-scope <aref>itemtype</aref> and <aref>itemprop</aref> values.</li>
+    <li>Vocabulary-based URI generation, where the vocabulary is determined from the in-scope itemtype,
+      either through an algorithmic modification of the type URI or by matching the URI against a registry.
+      The vocabulary URI is then used to generate property URIs in a namespace parallel to the type URI.</li>
+    <li>Type-based URI generation, where the URI of the in-scope <aref>itemtype</aref> forms the base of the property
+      URI, with the property name added to the type URI as a fragment.</li>
+    <li>When there are multiple top-level items in a document, place items in an RDF Collection.
+      Alternatively, simply list the items as multiple values, or do not generate an
+      <code>http://www.w3.org/1999/xhtml/microdata#item</code> mapping at all.</li>
+    <li>When there are multiple values for an <aref>itemprop</aref>, place items in an RDF Collection.
+      Alternatively, do not use collections, use an alternative such as <code>rdf:Seq</code>, or place all values,
+      whether or not multiple, into some form of collection.</li>
+  </ul>
 </section>
 
 <section>
@@ -315,23 +396,43 @@
       An attribute appropriate for use with the <code>meta</code> element for creating invisible properties.
     </dd>
     <dt><adef>data</adef></dt><dd>
-      An attribute appropriate for use with the <code>object</code> element for creating URI References.
+      An attribute appropriate for use with the <code>object</code> element for creating <tref>URI
+      reference</tref>s.
     </dd>
     <dt><adef>datetime</adef></dt><dd>
       An attribute appropriate for use with the <code>date</code> element for creating typed literals.
-      <div class="issue">The <code>date</code> element will likely be replaced with something more general purpose.</div>
+      <div class="issue">
+        The <code>date</code> element will likely be replaced with something more general purpose.
+      </div>
     </dd>
     <dt><adef>href</adef></dt><dd>
-      An attribute appropriate for use with <code>a</code>, <code>area</code> or <code>link</code> elements for creating URI References.
+      An attribute appropriate for use with <code>a</code>, <code>area</code> or <code>link</code> elements for
+      creating <tref>URI reference</tref>s.
     </dd>
     <dt><adef>src</adef></dt><dd>
-      An attribute appropriate for use with <code>audio</code>, <code>embed</code>, <code>iframe</code>, <code>img</code>,
-      <code>source</code>, <code>track</code>, or <code>video</code> elements for creating invisible properties.
+      An attribute appropriate for use with <code>audio</code>, <code>embed</code>, <code>iframe</code>,
+      <code>img</code>, <code>source</code>, <code>track</code>, or <code>video</code> elements for creating invisible
+      properties.
     </dd>
   </dl>
 </section>
 
 <section>
+  <h1>Vocabulary Registry</h1>
+  <p>In a perfect world, all processors would be able to generate the same output for a given input
+    without regard to the requirements of a particular vocabulary. However, Microdata doesn't really
+    provide sufficient syntactic help in making these decisions.</p>
+  <section>
+    <h2>Property URI Generation</h2>
+    <p></p>
+  </section>
+  <section>
+    <h2>Item/Value Ordering</h2>
+    <p></p>
+  </section>
+</section>
+
+<section>
   <h1>Algorithm</h1>
   <p>
     Transformation of Microdata to RDF makes use of general processing rules described in [[!MICRODATA]]