Abstract
This document is a glossary of terms defined and used to describe Linked Data, and its associated vocabularies and Best Practices. This document, published by the W3C Government Linked Data Working Group as a Working Group Note, is intended to help information management professionals, Web developers, scientists and the general public better understand publishing structured data using Linked Data Principles.
Status of This Document
This section describes the status of this document at the time of its publication. Other
documents may supersede this document. A list of current W3C publications and the latest revision
of this technical report can be found in the W3C technical reports
index at http://www.w3.org/TR/.
This document was published by the Government Linked Data Working Group as a Working Draft.
This document is intended to become a W3C Recommendation.
If you wish to make comments regarding this document, please send them to
public-gld-comments@w3.org
(subscribe,
archives).
All comments are welcome.
Publication as a Working Draft does not imply endorsement by the W3C Membership.
This is a draft document and may be updated, replaced or obsoleted by other documents at
any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the
5 February 2004 W3C Patent Policy.
W3C maintains a public list of any patent disclosures
made in connection with the deliverables of the group; that page also includes instructions for
disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains
Essential Claim(s) must disclose the
information in accordance with section
6 of the W3C Patent Policy.
Scope
This glossary lists terms related to publishing and consuming either Linked Data in the enterprise or Linked Open Data on the public Web.
1. 5 Star Linked Data
5 Star Linked Data refers to an incremental framework for deploying data. Tim Berners-Lee, the inventor of the Web and initiator of the Linked Data project, suggested a 5 star deployment scheme for Linked Data. The 5 Star Linked Data system is cumulative. Each additional star presumes the data meets the criteria of the previous step(s).
☆ Data is available on the Web, in whatever format.
☆☆ Available as machine-readable structured data (e.g., not a scanned image).
☆☆☆ Available in a non-proprietary format (e.g., CSV, not Microsoft Excel).
☆☆☆☆ Published using open standards from the W3C (RDF and SPARQL).
☆☆☆☆☆ All of the above and links to other Linked Open Data.
An easy to read graphic explaining the 5 Star Linked Data model may be seen on the 5 Star Linked Open Data mug. One reads both green labels for Linked Open Data, or neither green label for Linked Data. The 5 Star Open Data diagram is a graphical representation of each of the 5 Star techniques as described by Tim Berners-Lee.
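Because the scheme is cumulative, a rating can be computed by counting criteria in order and stopping at the first one that is not met. The following is a minimal sketch of that idea; the function name and its boolean parameters are illustrative assumptions, not part of any W3C specification.

```python
def star_rating(on_web, machine_readable, non_proprietary, open_standards, linked):
    """Return the cumulative star count (0-5) for a dataset.

    Each argument is a boolean saying whether the corresponding
    criterion (1 to 5 stars, in order) is met.
    """
    criteria = [on_web, machine_readable, non_proprietary, open_standards, linked]
    stars = 0
    for met in criteria:
        if not met:
            break  # later stars presume every earlier criterion is met
        stars += 1
    return stars


# A CSV file published on the Web: on the Web, machine readable,
# non-proprietary, but not RDF and not linked.
print(star_rating(True, True, True, False, False))  # prints 3
```

Note that a dataset that skips an earlier criterion does not earn later stars, which is what "cumulative" means here.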
2. Apache License
Apache License, version 2.0 is used for many Linked Data tools and projects. It is a popular Open Source license published by the Apache Software Foundation.
3. API
An Application Programming Interface (API) is an abstraction implemented in software that defines how others should make use of a software package such as a library or other reusable program. APIs are used to provide developers access to data and functionality from a given system.
4. Authoritative Open Data
Authoritative open data refers to open data that conforms to Best Practices for Publishing Linked Data and which may be published by government and other responsible agencies. Government agencies are often in a unique position to collect data that no other entity can. Open government data is nearly always collected at taxpayers' expense and is viewed by the public and government as valuable if made available with proper context and an open license. Linked Data is seen by many as a useful approach to publishing and consuming authoritative open data. Authoritative open data is often published by governments in Linked Data form, which gives it a greater chance of being discovered and re-used by others. See also [Government Open Data].
5. Creative Commons Licenses
Creative Commons copyright licenses and tools aim to forge a balance inside the traditional "all rights reserved" setting that copyright law creates. They provide a simple, standardized way for creators to grant copyright permissions to their creative work. The result is a vast and growing digital commons, a pool of content that can be copied, distributed, edited, remixed, and built upon, all within the boundaries of copyright law. A Creative Commons license is a legal statement by the owner of copyright in intellectual property specifically allowing people to use or redistribute the copyrighted work in accordance with conditions specified therein. See also About Creative Commons Licenses.
6. CC-BY-SA License
CC-BY-SA is a form of Creative Commons license for resources released online. Work available under a CC-BY-SA license means you can include it in any other work under the condition that you give proper attribution. If you create derivative works (such as modified or extended versions), then you must also license them as CC-BY-SA.
7. Closed World
Closed world is a concept from Artificial Intelligence and refers to a model of uncertainty that an agent assumes about the external world. In a closed world, the agent presumes that what is not known to be true must be false. This is a common assumption underlying relational databases and most forms of logic programming. See also Open World.
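The difference between the two assumptions can be sketched with a tiny fact base. This is an illustrative toy, not a reasoner: the facts and function names are hypothetical, and `None` is used here to stand for "not known".

```python
# A tiny knowledge base of statements known to be true.
facts = {("Alice", "knows", "Bob")}


def closed_world(statement):
    """Closed world: anything not known to be true is treated as false."""
    return statement in facts


def open_world(statement):
    """Open world: anything not known to be true is simply unknown (None)."""
    return True if statement in facts else None


query = ("Alice", "knows", "Carol")
print(closed_world(query))  # False
print(open_world(query))    # None
```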
8. Connection
Connection is a concept from computer networking. It refers to a transport layer virtual circuit established between two programs for the purpose of communication.
10. Content Negotiation
Content negotiation, often called "conneg", is a mechanism in the HTTP protocol for selecting the appropriate representation when servicing a request. A client uses a message header to indicate which response formats it will accept, which allows HTTP servers to provide different versions of a resource representation in response to any given URI request. The representation in any response can be negotiated (including error responses). See also [HTTP Protocol 1.1]. See also Connection.
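On the client side, content negotiation amounts to setting the HTTP `Accept` header. The sketch below builds such a request with Python's standard library without sending it; the URI is a hypothetical placeholder, and whether a given server honors the header is an assumption that depends on the server.

```python
from urllib.request import Request


def build_rdf_request(uri, accept="text/turtle"):
    """Build (but do not send) an HTTP request asking for an RDF format."""
    return Request(uri, headers={"Accept": accept})


# Hypothetical URI used only for illustration.
request = build_rdf_request("http://example.org/id/red")
print(request.get_header("Accept"))  # text/turtle
```

Passing the request to `urllib.request.urlopen` would perform the actual network call; a server that supports content negotiation would then choose a representation matching the header.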
11. Controlled Vocabulary
A controlled vocabulary is a carefully selected set of terms that can be used to index, tag or describe units of information. By providing a restricted and managed set of terms, controlled vocabularies can be used to reduce ambiguity in information systems. Such vocabularies may be unstructured (e.g. code lists) or may be organized into increasingly complex knowledge organization schemes (taxonomies, thesauri, ontologies). In traditional settings the terms in a controlled vocabulary are words or phrases; in a Linked Data setting they are normally assigned unique identifiers (URIs) which in turn link to descriptive phrases.
12. Converter
Converter refers to a tool or script that converts data from one form to another, e.g., CSV into RDF. Publishing good quality, useful Linked Data requires expressing resources and how they are related. Linked Data modelers work with subject domain experts to make explicit the relationships between resources before converting a data set to RDF.
13. CSV
A CSV (comma separated value) file is a plain text file in a tabular data format, usually generated from a spreadsheet or database dump. Each line or record contains fields separated by comma characters. CSV files may or may not contain column header names that may provide some information about the data. From a Linked Data perspective, CSV files are a non-proprietary format and are considered 3-star data on the 5-star scale.
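As a rough illustration of moving CSV toward RDF, the sketch below turns each CSV cell into a (subject, predicate, object) tuple. It is a naive, assumption-laden example: the `http://example.org/` base, the row-number subjects and the column-name predicates are all hypothetical, and real conversions rely on modeling decisions made with domain experts.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def csv_to_triples(text):
    """Turn CSV text with a header row into (subject, predicate, object) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    triples = []
    for row_number, row in enumerate(reader, start=1):
        subject = f"{BASE}row/{row_number}"  # one subject per record
        for column, value in row.items():
            # one predicate per column header, cell value as the object
            triples.append((subject, f"{BASE}column/{column}", value))
    return triples


sample = "name,population\nSpringfield,30000\n"
for triple in csv_to_triples(sample):
    print(triple)
```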
14. cURL
cURL is a command line Open Source/Free Software client that can transfer data, including machine readable RDF, from or to a server using one of its many supported protocols.
13. CURIEs
CURIEs stands for "compact URI expressions" and is an RDFa approach for shortening URIs.
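Expanding a CURIE means replacing its prefix with the full IRI that the prefix stands for. The sketch below shows the idea with a small prefix table; the function name and the table are illustrative, and it does not implement the full CURIE syntax (for example, bracketed "safe CURIEs" are not handled).

```python
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcat": "http://www.w3.org/ns/dcat#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:reference' into a full URI using a prefix table."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("foaf:knows"))  # http://xmlns.com/foaf/0.1/knows
```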
14. Fragment Identifier
The part of an HTTP URI that follows a hash symbol (‘#’). Fragment identifiers are not passed to Web servers by Web clients such as Web browsers.
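A client can separate the fragment identifier from the part of the URI that is actually sent to the server. The sketch below uses Python's standard `urllib.parse.urldefrag`; the URI and the wrapper function name are hypothetical.

```python
from urllib.parse import urldefrag


def split_fragment(uri):
    """Return (uri_without_fragment, fragment) for a URI."""
    result = urldefrag(uri)
    return result.url, result.fragment


document_uri, fragment = split_fragment("http://example.org/people#alice")
print(document_uri)  # http://example.org/people
print(fragment)      # alice
```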
15. Data Cloud
Data cloud, also called the Linked Data Cloud, is a visual representation of datasets published as Linked Data. Using metadata generated by directories, including CKAN, the project records datasets by domain. The Linked Data Cloud has doubled in size every 10 months since 2007 and as of late 2012 consists of more than 300 data sets from various domains, including geography, media, government and life sciences, according to [State of the LOD Cloud], a website and visualizations maintained by C. Bizer, A. Jentzsch and R. Cyganiak. The original data owners/stewards publish one third of the data contained in the Linked Open Data Cloud, while third parties publish 67%. Many academic institutions republish data from their respective governments as Linked Data, often enhancing the representation in the process.
16. Data Hub, The
The Data Hub is a specific site offering a community-run catalogue of data sets on the Internet, powered by the open-source data portal platform CKAN. The Data Hub is an openly editable open data catalogue in the style of Wikipedia.
17. Data Market
A data market, also called a Data Marketplace, is an online (broker) service to enable discovery and access to a large collection of datasets offered by a range of data providers. Examples include Infochimps, Azure Marketplace and Factual. Data markets may include open as well as paid-for data, and may offer value added services such as APIs, visualizations and programmatic data access.
18. Data Modeling
Data modeling is a process of organizing data and information describing it into a faithful representation of a specific domain of knowledge; it is used to define and analyze data requirements for an information system. In the context of Linked Data, it involves professional data modelers working closely with business stakeholders to define and document implicit and explicit relationships. Linked Data modeling applies formal modeling techniques based on Linked Data Principles.
19. Dataset, RDF
A collection of RDF data, comprising one or more RDF graphs, that is published, maintained, or aggregated by a single provider and available for access or download in one or more formats. In SPARQL, an RDF Dataset represents a collection of RDF graphs over which a query may be performed.
20. Data Warehouse
A data warehouse is one approach to data integration in which data from various operational data systems is extracted, cleaned, transformed and copied to a centralized repository. The centralized repository can then be used for data mining or answering analytical queries. By contrast, Linked Data assumes a distributed approach to data management, using HTTP URIs to describe and access information resources. A Linked Data approach is seen as a valid alternative to the centralized data warehouse approach, especially when integrating open government datasets available on the public Web.
21. DBpedia
DBpedia is a community effort to extract structured information from Wikipedia and make it available on the Web. DBpedia is often depicted as a hub for the Data Cloud. It provides an RDF representation of the metadata held in Wikipedia, made available for SPARQL queries and for linking to other datasets on the World Wide Web.
DBpedia also provides a human readable version of the structured content. For example, the human readable version of Linked Data for the color "Red" is found on DBpedia at http://dbpedia.org/page/Red . See also [cURL].
22. Dereferenceable URIs
When an HTTP client can look up a URI using the HTTP protocol and retrieve a description of the resource, it is called a dereferenceable URI. Per Linked Data Principles, we identify things using HTTP URIs and provide information about them when an HTTP URI is resolved or dereferenced. This applies both to URIs that are used to identify classic HTML documents and to URIs that are used in the Linked Data context [COOL-SWURIS] to identify real-world objects and abstract concepts.
23. Description Logic
Description Logic (DL) is a family of knowledge representation languages with varying and adjustable expressivity. DL is used in artificial intelligence for formal reasoning on the concepts of an application domain. The Web Ontology Language (OWL) provides a standards-based way to exchange ontologies and includes a Description Logic semantics as well as an RDF based semantics. Biomedical informatics applications often use DL for codification of healthcare and life sciences knowledge.
24. DCAT
Data Catalog Vocabulary (DCAT) is an RDF vocabulary. It is designed to facilitate interoperability between data catalogs published on the Web. See also Data Catalog Vocabulary (DCAT).
26. Directed Graph
A directed graph is a graph in which the links between nodes are directional, i.e., they only go from one node to another. RDF represents things (nouns) and the relationships between them (verbs) in a directed graph. In RDF, links are labelled by being assigned unique URIs.
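One simple way to picture this in code is an adjacency list in which each edge carries its label. The sketch below is an illustration only: the `ex:` and `foaf:` names are shorthand strings rather than resolved URIs, and the data is made up.

```python
from collections import defaultdict

# Each triple is a directed, labelled edge: subject --predicate--> object.
triples = [
    ("ex:Alice", "foaf:knows", "ex:Bob"),
    ("ex:Bob", "foaf:knows", "ex:Carol"),
]


def build_graph(edges):
    """Map each subject node to its outgoing (label, target) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in edges:
        graph[subject].append((predicate, obj))
    return graph


graph = build_graph(triples)
print(graph["ex:Alice"])    # [('foaf:knows', 'ex:Bob')]
print("ex:Carol" in graph)  # False: Carol has no outgoing edges
```

The last line shows the directionality: the link from Bob to Carol does not imply a link from Carol back to Bob.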
27. Document Type Definition
Document Type Definition (DTD) refers to a type of schema for defining a markup language, such as in XML or HTML (or their predecessor SGML).
28. Domain Name System (DNS)
Domain Name System (DNS) refers to the Internet's mechanism for mapping between a human-readable host name (e.g. www.example.com) and an Internet Protocol (IP) Address (e.g. 203.20.51.10).
32. Entity
In the sense of an entity-attribute-value model, an entity is synonymous with the Subject of an RDF Triple. The term refers to anything that can be named using an HTTP URL and that serves as the Subject of a description.
33. ETL
ETL is an abbreviation for extract, transform, load. Linked Data developers routinely extract data from a relational database, transform it to RDF Triples (a Linked Data serialization), and then load it into an RDF database.
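The three steps can be sketched end to end with an in-memory SQLite database standing in for the relational source and a Python list standing in for the RDF database. Everything named here (the `agency` table, the `http://example.org/` URIs, the `store` list) is a hypothetical stand-in, and the transform step does no escaping of special characters.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace


def extract(connection):
    """Extract: read rows from the relational source."""
    return connection.execute("SELECT id, name FROM agency").fetchall()


def transform(rows):
    """Transform: express each row as an N-Triples style statement."""
    return [f'<{BASE}agency/{row_id}> <{BASE}vocab/name> "{name}" .'
            for row_id, name in rows]


def load(statements, store):
    """Load: append statements to the target store (a list stands in here)."""
    store.extend(statements)


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE agency (id INTEGER PRIMARY KEY, name TEXT)")
connection.execute("INSERT INTO agency (name) VALUES ('Example Agency')")

store = []
load(transform(extract(connection)), store)
print(store[0])
```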
34. FOAF
See Friend of a Friend.
35. Friend of a Friend
A Semantic Web vocabulary describing people and their relationships for use in resource descriptions. Commonly called "FOAF".
36. Free/Libre/Open Source Software
Free, also known as Libre or Open Source, is a generic and internationalized term for software released under an Open Source license.
37. Government Open Data
Many government authorities have mandated publication of data to the public Web. The broad intention is to facilitate the maintenance of open societies and support governmental accountability and transparency initiatives. However, publication of unstructured data on the World Wide Web is in itself insufficient. To realize the goals of improved efficiency, transparency and accountability, published data must be readily found and processed programmatically by machines, and visualized by humans. Re-use of structured content available on the public Web is enhanced by following Linked Data Principles, which address many of the data modeling and format requirements of Open Government Data.
38. HyperText Markup Language
HyperText Markup Language (HTML) is the predominant markup language for hypertext pages on the Web. HTML defines the structure of Web pages and is a family of W3C standards.
39. HyperText Transfer Protocol
HyperText Transfer Protocol (HTTP) is the standard transmission protocol [RFC2616] used on the World Wide Web to transfer hypertext requests and information between Web servers and Web clients (such as browsers). It is an IETF standard.
41. International Standards Organization
A network of the national standards institutes of over 160 countries that cooperate to define international standards. It defines many standards, including formats for dates and currency. Also called ISO.
42. Internet Engineering Task Force
An open international community concerned with the evolution of Internet architecture and the operation of the Internet. It has defined standards such as HTTP and DNS. Also called IETF.
43. Inference
Inference is the process of deriving logical conclusions from a set of starting assumptions. Using Linked Data, existing relationships are modeled as a set of (named) relationships between resources. Linked Data helps humans and machines to find new relationships through automatic procedures that generate new relationships based on the data and based on some additional information in the form of a vocabulary.
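A small example of such an automatic procedure is class-membership inference: if a resource has a type, and the vocabulary says that type is a subclass of another, the resource also has the broader type. The sketch below is a toy forward-chaining loop under that single rule; the `ex:` names are hypothetical and the prefixed strings are shorthand, not resolved URIs.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer_types(triples):
    """Return new rdf:type triples implied by rdfs:subClassOf statements."""
    known = set(triples)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for subject, predicate, cls in list(known):
            if predicate != TYPE:
                continue
            for sub, pred, sup in list(known):
                derived = (subject, TYPE, sup)
                if pred == SUBCLASS and sub == cls and derived not in known:
                    known.add(derived)
                    changed = True
    return known - set(triples)


data = [
    ("ex:Agency", SUBCLASS, "ex:Organization"),  # vocabulary statement
    ("ex:ParksDept", TYPE, "ex:Agency"),         # data statement
]
print(infer_types(data))  # {('ex:ParksDept', 'rdf:type', 'ex:Organization')}
```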
44. Internationalized Resource Identifier
A global identifier standardized by joint action of the World Wide Web Consortium and Internet Engineering Task Force. An IRI may or may not be resolvable on the Web. IRIs are a generalization of URIs that allows characters from the Universal Character Set (Unicode), and are slowly replacing URIs. See also Uniform Resource Identifier (URI), Uniform Resource Locator (URL). Also called IRI.
45. JSON
JavaScript Object Notation (JSON) is a syntax for storing and exchanging text based information. JSON has proven to be a highly useful and popular object serialization and messaging format for the Web. See [RFC4627] for details.
46. JSON-LD
JavaScript Object Notation for Linking Data (JSON-LD) [JSON-LD] is a language-independent data format for representing Linked Data, based on JSON. It is an attempt to harmonize the representation of Linked Data in JSON by outlining a common JSON representation for expressing directed graphs, mixing both Linked Data and non-Linked Data in a single document. JSON-LD is a lightweight Linked Data format that provides data context. It is capable of serializing any RDF graph or dataset, and most, but not all, JSON-LD documents can be directly transformed to RDF. JSON-LD syntax is easy for humans to read and write, as well as easy for machines to parse and generate. Because it is based on the JSON format, it provides a way for JSON data to interoperate at Web scale. JSON-LD is an appropriate Linked Data interchange language for JavaScript environments, Web services and NoSQL databases.
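To make the idea of "data context" concrete, the sketch below builds a small JSON-LD document as a Python dictionary and round-trips it through the standard `json` module. The `@context`, `@id` and `@type` keywords are JSON-LD keywords; the people URIs are hypothetical, and no JSON-LD processing (expansion, compaction, conversion to RDF) is performed here.

```python
import json

# The @context maps short JSON keys to full identifiers.
document = {
    "@context": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "knows": {"@id": "http://xmlns.com/foaf/0.1/knows", "@type": "@id"},
    },
    "@id": "http://example.org/people#alice",   # hypothetical URI
    "name": "Alice",
    "knows": "http://example.org/people#bob",   # hypothetical URI
}

text = json.dumps(document, indent=2)  # ordinary JSON serialization
parsed = json.loads(text)              # any JSON parser can read it back
print(parsed["@context"]["name"])      # http://xmlns.com/foaf/0.1/name
```

Because the document is plain JSON, existing JSON tooling continues to work; a JSON-LD processor would additionally use the context to interpret `name` and `knows` as links into the FOAF vocabulary.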
43. Jena
Jena is an Open Source Software implementation of a Semantic Web development framework. It supports the storage, retrieval and analysis of RDF information.
47. Linked Data
A pattern for hyperlinking machine-readable data sets to each other using Semantic Web techniques, especially via the use of RDF and URIs. Linked Data refers to a set of best practices for creating, publishing and announcing structured data on the Web. It enables distributed SPARQL queries of the data sets and a browsing or discovery approach to finding information (as compared to a search strategy). Linked Data is intended for access by both humans and machines. Linked Data is not the same as RDF; rather, Linked Data uses the RDF family of standards for data interchange (e.g., RDF/XML, RDFa, Turtle) and query (SPARQL). Linked Data can be published by a person or organization behind the firewall or on the public Web. If Linked Data is published on the public Web, it is generally called Linked Open Data. See also [Linked Data Principles].
48. Linked Data API
A REST API that allows data publishers to provide URLs to lists of things and clients to retrieve machine-readable data from those URLs.
49. Linked Data client
A Web client that supports HTTP content negotiation for the retrieval of Linked Data from URLs and/or SPARQL endpoints. It consumes Linked Data using standard Web techniques: it may resolve URIs to Linked Data serializations and understands how to make use of those representations once it receives them. A Linked Data client understands standard REST APIs, for example the Linked Data API. Examples of Linked Data clients include Tim Berners-Lee's early Tabulator browser, gFacet, and the Callimachus Shell (CaSH).
51. Linked Data Principles
Linked Data Principles provide a common API for data on the Web which is more convenient than many separately and differently designed APIs published by individual data suppliers. Tim Berners-Lee, the inventor of the Web and initiator of the Linked Data project, proposed the following principles upon which Linked Data is based:
- Use URIs to name things;
- Use HTTP URIs so that things can be referred to and looked up ("dereferenced") by people and user agents;
- When someone looks up a URI, provide useful information, using the open Web standards such as RDF, SPARQL;
- Include links to other related things using their URIs when publishing on the Web.
52. Linked Open Data
Linked Data published on the public Web and licensed under one of several open content licenses permitting reuse. Often abbreviated as "LOD".
Publishing Linked Open Data enables distributed SPARQL queries of the data sets and a "browsing" or "discovery" approach to finding information, as compared to a search strategy. See also: [LD-FOR-DEVELOPERS], [HOWTO-LODP]
53. Linked Open Data Cloud
A colloquial phrase for the total collection of interconnected datasets that have been published as Linked Data on the public Web. See also: Data Cloud, Linked Open Data Cloud diagram
54. Linked Open Data Cloud diagram
A pictorial depiction of the Linked Data Cloud, representing datasets published by the Linking Open Data project from 2007-2011. There are various depictions, including color-by-theme versions describing data domains such as government, geographic, publications, life sciences and media content. The diagram stopped being updated when individual datasets could no longer be meaningfully represented in a single diagram due to the total number of datasets.
51. Linking Government Data
Linking government data refers to the use of tools and techniques of the Semantic Web to connect, expose and use data from government systems.
56. Linkset
A collection of RDF links between two datasets.
57. Machine Readable Data
Data formats that may be readily parsed by computer programs without access to proprietary libraries. For example, CSV, TSV and RDF formats are machine readable, but PDF and Microsoft Excel are not. While some open data developers use screen-scraping techniques to parse content, using 4-star or 5-star Linked Data is preferable in terms of provenance and ease of reuse; anything less than 4-star data gives Web developers more work modeling and transforming data. Creating and publishing data following Linked Data principles helps search engines, and thus humans, to find, access and re-use data. Once information is found, computer programs can re-use data without the need for custom scripts to manipulate the content.
Publishing machine readable data using Linked Data principles provides both a human readable and a machine readable version. For example, Wikipedia includes a Web page about the color Red. DBpedia, the database of structured content extracted from Wikipedia, allows a Linked Data client to look up "Red" [http://wikipedia.org/wiki/Red] by changing "wiki" to "data" and appending the appropriate file extension.
$ curl -L http://dbpedia.org/data/Red.ttl
Thus, the same data can be viewed in both human readable and machine readable formats.
58. Message
The basic unit of HTTP communication. It consists of a structured sequence of octets matching the syntax defined as an HTTP Message and transmitted via the connection.
61. Modeling Process
Modeling process in the context of RDF refers to subject matter experts working with developers to capture the context of data and define the relationships within the data. Doing so yields high quality Linked Data, since capturing organizational knowledge about the meaning of the data within the RDF data model means the data is more likely to be reused correctly. Well defined context ensures better understanding and proper reuse, and is critical when establishing linkages to other data sets.
62. N3
N3 is an abbreviation for Notation3. It has a readable RDF syntax used for expressing assertion and logic. N3 [N3] is a superset of RDF, extending the RDF model by adding formulae (literals which are graphs themselves), variables, logical implication, and functional predicates. See also [Turtle].
63. Namespace IRI
A namespace IRI is a base IRI shared by all terms in a given vocabulary or ontology. A namespace is a container for a set of names that belong to a single authority; namespaces allow different agents to use the same word in different ways.
64. Natural Keys
Human-readable categories and sub-identifiers within a URI that reflect what the identifier describes. They are recommended when creating URIs so that people reading RDF in its source format (mostly developers) will be able to more quickly understand it.
65. Neutral URI
A URI that avoids the exposure of implementation details within the URI itself.
66. N-Triples
A subset of Turtle that defines a line-based format to encode an RDF graph. Used primarily as an exchange format for RDF data. See also [N-Triples].
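The line-based nature of N-Triples, one statement per line ending in a full stop, can be sketched as follows. This is a simplified illustration rather than a conforming serializer: the function names are hypothetical, only backslashes and double quotes are escaped in literals, and blank nodes, datatypes and language tags are not handled.

```python
def term(value, literal=False):
    """Format a URI as <...> or a plain literal as "..."."""
    if literal:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"


def to_ntriples(triples):
    """Serialize (subject, predicate, object, object_is_literal) tuples."""
    return "\n".join(
        f"{term(s)} {term(p)} {term(o, is_literal)} ."
        for s, p, o, is_literal in triples
    )


print(to_ntriples([
    ("http://example.org/people#alice", "http://xmlns.com/foaf/0.1/name", "Alice", True),
    ("http://example.org/people#alice", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/people#bob", False),
]))
```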
67. Object
In the context of RDF, the object is the final part of an RDF statement. It is the property value that is mapped to a subject by the predicate. See also [Subject] [Predicate]
68. Ontology
A formal model that allows knowledge to be represented for a specific domain. An ontology describes the types of things that exist (classes), the relationships between them (properties) and the logical ways those classes and properties can be used together (axioms). The OWL (Web Ontology Language) family of languages provides a standardized means for expressing and exchanging ontologies. It builds upon, and is compatible with, RDF.
62. Ontology Matching
Ontology matching is the process of finding correspondences between semantically related entities of ontologies, which can be used for various tasks, such as ontology merging, reconciliation, query answering, data translation, or navigation on the Semantic Web.
69. Open Government Data
Refers to content that is published on the public Web by government authorities at national, regional or local levels, in a variety of non-proprietary formats, including XML, CSV, SHP and PDF.
64. Open World
Open world is a concept from Artificial Intelligence (AI) and refers to a model of uncertainty that an agent assumes about the external world. In an open world, the agent presumes that what is not known to be true may yet be true if additional information is later obtained. It is the assumption underlying RDF and OWL Full, and is often contrasted with "Closed World".
70. ORG Ontology
ORG is an RDF vocabulary to enable publication of information about organizations and organizational structures, even at governmental level. See also [http://www.w3.org/TR/vocab-org/]
71. Persistent Identifier Scheme
A persistent identifier scheme is a mechanism for resolution of virtual resources. Persistent Uniform Resource Locators (PURLs) implement one form of persistent identifier for virtual resources. PURLs are valid URLs and their components must map to the URL specification. The scheme part tells a computer program, such as a Web browser, which protocol to use when resolving the address. The scheme used for PURLs is generally HTTP. Other persistent identifier schemes include Digital Object Identifiers (DOIs), Life Sciences Identifiers (LSIDs) and INFO URIs. All persistent identification schemes provide unique identifiers for (possibly changing) virtual resources, but not all schemes provide curation opportunities.
A PURL might redirect to the new location or return content proxied from the new location.
73. Predicate
The middle term (the linkage, or "verb") in an RDF statement. It is the second part of the statement and gives the property which connects the subject of the statement to the object of the statement. For example, in the informal statement "Alice knows Bob", "knows" is the predicate which connects "Alice" (the subject of the statement) to "Bob" (the object of the statement).
The term predicate derives from predicate calculus. In RDF we use the terms predicate (for the role) and property (for the thing that plays that role) regardless of whether the value of the property is a simple literal or some other resource.
74. Provenance
Data related to where, when and how information was acquired. Provenance refers to the sources of information, such as entities and processes, involved in producing or delivering an artifact. The PROV Ontology [PROV-O], which expresses a provenance data model in OWL2, is used to represent and interchange provenance information generated in different systems and under different contexts.
75. Protocol
A set of instructions for transferring data from one computer to another over a network. A protocol standard defines both message formats and the rules for sending and receiving those messages. One of the most common Internet protocols is the Hypertext Transfer Protocol (HTTP).
71. Public Sector Information
Public Sector Information, also called public information, is the information created by a government in the course of governing. In most democracies, such information can be made available to people in due course of time.
77. Quad Store
A colloquial phrase for an RDF database that stores RDF triples plus an additional element of information, often used to collect statements into groups.
This notion has been clarified and standardized in SPARQL in the form of RDF Datasets.
78. Query
In the context of Linked Data, a query is the programmatic retrieval of resources and their relationships from the Web of Data. Using the SPARQL language, developers issue queries based on (triple) patterns.
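The notion of a triple pattern can be illustrated without a SPARQL engine: a pattern is a triple in which some positions are variables, and matching it against data yields one set of variable bindings per matching triple. The sketch below is a toy matcher written for illustration, not SPARQL itself; the `?name` variable convention mirrors SPARQL, while the function name and sample data are hypothetical.

```python
def match(pattern, triples):
    """Match one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in triples:
        binding = {}
        for wanted, actual in zip(pattern, triple):
            if wanted.startswith("?"):
                # a repeated variable must bind to the same value each time
                if binding.get(wanted, actual) != actual:
                    break
                binding[wanted] = actual
            elif wanted != actual:
                break
        else:
            results.append(binding)  # every position matched
    return results


data = [
    ("ex:Alice", "foaf:knows", "ex:Bob"),
    ("ex:Bob", "foaf:knows", "ex:Carol"),
    ("ex:Alice", "foaf:name", "Alice"),
]
for row in match(("?who", "foaf:knows", "?whom"), data):
    print(row)
```

Each printed dictionary corresponds to one row of a query result set, with one column per variable.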
SPARQL queries provide one or more patterns against such relationships. To get results, the query engine retrieves a response matching the requested query, returning a query result set. Results may be returned in a table format, for example, which can be used to build complex mashups and visualizations.
79. R2RML
R2RML (RDB to RDF Mapping Language) is a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice.
80. Raw Data
Raw data refers to machine-readable files from the wilderness released without any specific effort to make them applicable to a particular application. The advantage of raw data is that it can be reused in multiple applications created by multiple communities, but this typically requires additional scripts or programs to process the data.
83. RDF database
A type of database designed specifically to store and retrieve RDF information. May be implemented as a triple store, quad store or other type.
84. RDF-JSON
A serialization of RDF triples in JSON. A concrete syntax in JavaScript Object Notation (JSON) [RFC4627] for RDF as defined in the RDF Concepts and Abstract Syntax [RDF-CONCEPTS] W3C Recommendation. An RDF-JSON document serializes a set of RDF triples as a series of nested data structures. See also JSON-LD.
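The nested layout can be sketched in Python as follows. The subject-to-predicate-to-object-list shape and the "type" and "value" keys reflect the RDF/JSON draft as commonly described; treat those details as assumptions and check the specification before relying on them. The example URIs and the helper name are hypothetical.

```python
import json
from collections import defaultdict


def triples_to_rdf_json(triples):
    """Group (subject, predicate, object) triples into nested dictionaries.

    Each object is a (kind, value) pair, where kind is "uri" or "literal".
    """
    doc = defaultdict(lambda: defaultdict(list))
    for subject, predicate, (kind, value) in triples:
        doc[subject][predicate].append({"type": kind, "value": value})
    # Convert to plain dicts so the result serializes and compares cleanly.
    return {s: dict(po) for s, po in doc.items()}


triples = [
    ("http://example.org/about", "http://purl.org/dc/terms/title",
     ("literal", "Anna's Homepage")),
]
rdf_json = triples_to_rdf_json(triples)
print(json.dumps(rdf_json, indent=2))
```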
85. RDF Schema
The simplest RDF vocabulary description language. It provides much less descriptive capability than the Simple Knowledge Organization System (SKOS) or the Web Ontology Language (OWL). A standard of the W3C [RDFS].
86. RDF/XML
An RDF syntax encoded in XML. A standard of the W3C. [RDFXML]
87. Representational State Transfer (REST)
An architectural style for information systems used on the Web. It explains some of the Web's key features, such as extreme scalability and robustness to change. REST is the foundation of the World Wide Web and the dominant Web service design model.
88. Request
Request refers to a stage in the HTTP protocol. A request message from a client to a server includes, within the first line of that message, the method to be applied to the resource, the identifier of the resource, and the protocol version in use.
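A minimal Python sketch of reading those three parts from the first line of a request message is shown below. The sample message and the names `RequestLine` and `parse_request_line` are illustrative assumptions; a real HTTP parser handles many more cases.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str   # the method to be applied to the resource
    target: str   # the identifier of the resource
    version: str  # the protocol version in use


def parse_request_line(message: str) -> RequestLine:
    """Split the first line of an HTTP request into its three parts."""
    first_line = message.split("\r\n", 1)[0]
    method, target, version = first_line.split(" ")
    return RequestLine(method, target, version)


raw = "GET /glossary HTTP/1.1\r\nHost: example.org\r\n\r\n"
request_line = parse_request_line(raw)
```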
See also RFC 2616bis for an HTTP Request.
89. Resource
In an RDF context, a resource can be anything that an RDF graph describes. A resource can be addressed by a Uniform Resource Identifier (URI).
90. Resource Description Framework (RDF)
A family of international standards for data interchange on the Web produced by W3C. Resource Description Framework (RDF) is based on the idea of identifying things using Web identifiers or HTTP URIs, and describing resources in terms of simple properties and property values. See also [RDF 1.1 Concepts and Abstract Syntax].
91. Resource Description Framework in Attributes (RDFa)
An RDF syntax encoded in HTML documents. RDFa provides a set of markup attributes to augment the visual information on the Web with machine-readable hints. It is a standard of the World Wide Web Consortium. [RDFa-PRIMER]
92. Response
An action by a server taken as the result of a request by a client. In HTTP, a response provides a resource representation to the calling client. See also [RFC2616] bis for an HTTP Response message.
94. REST API
An API implemented using HTTP and the principles of REST to allow actions on Web resources. The most common actions are to create, retrieve, update and delete resources. See also Representational State Transfer (REST).
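The four common actions are often mapped onto HTTP methods. The Python sketch below uses an in-memory stand-in for a server to show one conventional mapping. The mapping, the status codes chosen, the paths and the `ResourceStore` class are illustrative assumptions, and letting the client pick the path on creation is a simplification.

```python
# Conventional (not mandatory) mapping of actions to HTTP methods.
ACTION_TO_METHOD = {
    "create": "POST",
    "retrieve": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


class ResourceStore:
    """In-memory stand-in for a server's resources, keyed by path."""

    def __init__(self):
        self._resources = {}

    def handle(self, method, path, body=None):
        """Return an (HTTP status code, representation) pair."""
        if method == "GET":
            if path in self._resources:
                return 200, self._resources[path]
            return 404, None
        if method in ("POST", "PUT"):
            existed = path in self._resources
            self._resources[path] = body
            return (200 if existed else 201), body
        if method == "DELETE":
            if path not in self._resources:
                return 404, None
            del self._resources[path]
            return 204, None
        return 405, None  # method not supported by this sketch


store = ResourceStore()
created = store.handle(ACTION_TO_METHOD["create"], "/terms/triple", "An RDF statement")
fetched = store.handle(ACTION_TO_METHOD["retrieve"], "/terms/triple")
deleted = store.handle(ACTION_TO_METHOD["delete"], "/terms/triple")
missing = store.handle(ACTION_TO_METHOD["retrieve"], "/terms/triple")
```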
95. Schema
Schema refers to a data model that represents the relationships between a set of concepts. Some types of schemas include relational database schemas (which define how data is stored and retrieved), taxonomies and ontologies.
96. Semantic Technologies
The broad set of technologies that relate to the extraction, representation, storage, retrieval and analysis of machine-readable information.
The Semantic Web standards are a subset of semantic technologies and techniques.
97. Semantic Web
An evolution or part of the World Wide Web that consists of machine-readable data in RDF and an ability to query that information in standard ways (e.g. via SPARQL).
98. Semantic Web Standards
Standards of the World Wide Web Consortium (W3C) relating to the Semantic Web, including RDF [RDF], RDFa [RDFa-PRIMER], SKOS [SKOS-REFERENCE], OWL [OWL2] and SPARQL [SPARQL].
99. Semantic Web Search Engine
A search engine capable of making use of semantic technologies to model its knowledge base and deliver content.
100. Sesame
Sesame is an Open Source Software implementation of a Semantic Web development framework. It supports the storage, retrieval and analysis of RDF information. See also [Open RDF].
101. Skolemization
Skolemization is a process whereby some RDF databases and other systems implementing the SPARQL query language automatically assign URIs to blank nodes so that they are more easily operated upon.
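A minimal Python sketch of the idea follows: every blank node label is replaced by a freshly minted URI, and the same label always maps to the same URI. The `_:` label convention, the example.com authority, the `/.well-known/genid/` path and the function name are assumptions made for illustration; actual systems choose their own URI scheme.

```python
import uuid

GENID_BASE = "http://example.com/.well-known/genid/"  # hypothetical authority


def skolemize(triples, base=GENID_BASE):
    """Replace blank node labels ("_:x") with freshly minted URIs."""
    minted = {}

    def replace(term):
        if isinstance(term, str) and term.startswith("_:"):
            if term not in minted:
                minted[term] = base + uuid.uuid4().hex
            return minted[term]
        return term

    return [tuple(replace(term) for term in triple) for triple in triples]


triples = [
    ("_:b0", "http://xmlns.com/foaf/0.1/name", '"Alice"'),
    ("http://example.com/bob", "http://xmlns.com/foaf/0.1/knows", "_:b0"),
]
skolemized = skolemize(triples)
```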
102. Simple Knowledge Organisation System
Simple Knowledge Organisation System (SKOS) [SKOS-REFERENCE] is a vocabulary description language for RDF designed for representing traditional knowledge organization systems such as enterprise taxonomies in RDF.
103. Sindice
Sindice is a search engine for Linked Data. It offers search and querying capabilities across the data it knows about, as well as specialized APIs and tools for presenting Linked Data summaries. See also http://sindice.com.
104. SPARQL
SPARQL Protocol and RDF Query Language (SPARQL) defines a query language for RDF data, analogous to the Structured Query Language (SQL) for relational databases. A family of standards of the World Wide Web Consortium. See also http://www.w3.org/TR/sparql11-overview/.
105. SPARQL endpoint
A service that accepts SPARQL queries and returns answers to them as SPARQL result sets. It is a best practice for dataset providers to give the URL of their SPARQL endpoint to allow access to their data programmatically or through a Web interface. A list of the status of some endpoints is available at http://labs.mondeca.com/sparqlEndpointsStatus/
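Programmatic access to an endpoint usually means sending the query over HTTP. The Python sketch below builds, but does not send, such a request using only the standard library. The endpoint URL is hypothetical, and passing the query in a `query` parameter of a GET request while asking for JSON results is an assumption based on common SPARQL protocol usage; individual endpoints may differ.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def build_query_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_query_request(
    ENDPOINT, "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access.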
106. Subject
The subject is the first part of an RDF statement. A subject in the context of a triple <?s ?p ?o> refers to who or what the RDF statement is about.
107. Taxonomy
Taxonomy is a formal representation of relationships between items in a hierarchical structure. Also see Ontology.
108. Term
An entry in a controlled vocabulary, schema, Taxonomy or Ontology.
109. Triple
An RDF statement, consisting of two things (a "subject" and an "object") and a relationship between them (a verb, or "predicate"). This subject-predicate-object triple forms the smallest possible RDF graph (although most RDF graphs consist of many such statements).
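To make this concrete, the Python sketch below models each triple as a 3-tuple and a graph as a set of such tuples, then selects triples with a simple pattern in which `None` acts as a wildcard. The prefixed names, the sample data and the `match` helper are hypothetical; this is not how any particular RDF library represents graphs.

```python
def match(graph, pattern):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        triple
        for triple in graph
        if all(p is None or p == t for p, t in zip(pattern, triple))
    )


# A graph is modelled as a set of (subject, predicate, object) tuples.
graph = {
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
}

names = match(graph, (None, "foaf:name", None))
about_alice = match(graph, ("ex:alice", None, None))
```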
110. Triple store
A colloquial phrase for an RDF database that stores RDF triples.
111. Tuple
A mathematical term referring to an ordered list of elements. RDF statements are 3-tuples: ordered lists of three elements.
112. Turtle
An RDF serialization format designed to be easier to read than others such as RDF/XML. The term "Turtle" was derived from Terse RDF Triple Language. Turtle allows an RDF graph to be written in a compact and natural text form, with abbreviations for common usage patterns and datatypes. Turtle [TURTLE-TR] provides levels of compatibility with the existing N-Triples format, as well as the triple pattern syntax of the SPARQL W3C Recommendation.
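Two of those abbreviations, namespace prefixes and the `;` that repeats a subject, can be sketched in Python as follows. This is a deliberately simplified writer, not a conforming Turtle serializer: the example.org URIs and function names are hypothetical, any object not starting with "http://" is treated as a plain string literal, and local names are not checked for validity.

```python
import json

PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def abbreviate(uri):
    """Use a prefixed name when a known namespace matches, else <uri>."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return "<" + uri + ">"


def format_object(obj):
    """Assumption: anything not starting with http:// is a string literal."""
    if obj.startswith("http://"):
        return abbreviate(obj)
    return json.dumps(obj)  # double-quoted, with basic escaping


def to_turtle(triples):
    lines = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    lines.append("")
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))
    for s, pairs in by_subject.items():
        # ";" lets several predicate-object pairs share one subject.
        body = " ;\n    ".join(
            f"{abbreviate(p)} {format_object(o)}" for p, o in pairs
        )
        lines.append(f"{abbreviate(s)} {body} .")
    return "\n".join(lines)


turtle_text = to_turtle([
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob"),
])
print(turtle_text)
```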
118. Vocabulary
A collection of "terms" for a particular purpose. Vocabularies can range from simple, such as the widely used RDF Schema, FOAF and Dublin Core Metadata Element Set, to complex vocabularies with thousands of terms, such as those used in healthcare to describe symptoms, diseases and treatments. Vocabularies play a very important role in Linked Data, specifically to help with data integration. The use of this term overlaps with Ontology.
119. Vocabulary alignment
The process of analyzing multiple vocabularies to determine terms that are common across them and to record those relationships.
120. VoID
Vocabulary of Interlinked Datasets, an RDF Schema vocabulary for expressing metadata about RDF datasets and a standard of the World Wide Web Consortium. VoID is intended as a bridge between the publishers and users of RDF data, with applications ranging from data discovery to cataloging and archiving of datasets. VoID can be used to express general metadata based on Dublin Core, access metadata, structural metadata, and links between datasets. [VOID-GUIDE]
121. Web 2.0
Web 2.0 is a colloquial description of the part of the World Wide Web that implements social networking, blogs, user comments and ratings and related human-centered activities.
122. Web 3.0
Web 3.0 is a colloquial description of the part of the World Wide Web that implements machine-readable data and the ability to perform distributed queries and analysis on that data. It is considered synonymous with the phrases "Semantic Web" and "The Web of Data".
123. Web of Data
A subset of the World Wide Web which contains Linked Data.
124. Web of Documents
The original, or traditional, World Wide Web in which published resources were nearly always documents.
125. Web Ontology Language (OWL)
OWL is a family of knowledge representation and vocabulary description languages for authoring ontologies, based on RDF and standardized by the W3C [OWL2].
Standardized variants include OWL Full, OWL DL (for "description logic") and OWL Lite.
126. World Wide Web Consortium (W3C)
An international community that develops and promotes protocols and guidelines for the long-term growth of the Web. W3C's standards define key parts of the World Wide Web, including Web Design, Web Architecture and the Semantic Web. See also W3C Mission.
127. eXtensible Hypertext Markup Language (XHTML)
eXtensible Hypertext Markup Language (XHTML) is a family of versions of HTML based on XML and standardized by the W3C [XHTML1].
128. eXtensible Markup Language (XML)
XML [XML] is a specification for creating structured textual computer documents. It is a subset of SGML that enables such documents to be served, received and processed on the Web in the same way as HTML documents. There are many thousands of XML formats, including XHTML. It is part of a family of standards from the W3C.
129. XML Schema
XML Schemas provide a means for defining the structure, content and semantics of XML documents as defined in [XMLS-SCHEMA0].
A. Acknowledgments
The editors are grateful to David Wood for contributing the initial glossary terms from Linking Government Data (Springer 2011). The editors also wish to thank members of the Government Linked Data Working Group, with special thanks to the reviewers and contributors: Thomas Baker, Hadley Beeman, Richard Cyganiak, Michael Hausenblas, Benedikt Kaempgen, James McKinney, Marios Meimaris, Jindrich Mynarz, Michael Pendleton and Dave Reynolds, who diligently iterated the W3C Linked Data Glossary in order to create a foundation of terms upon which to discuss and better describe the Web of Data. Thank you!
B. References
B.1 Normative references
- [OWL2]
- OWL 2 Web Ontology Language Document Overview, W3C OWL Working Group, 27 October 2009. W3C Recommendation. URL: http://www.w3.org/TR/owl2-overview/
- [RDF]
- RDF/XML Syntax Specification (Revised), Dave Beckett (ed.), 10 February 2004, W3C Recommendation. URL: http://www.w3.org/TR/REC-rdf-syntax/
- [RDF-CONCEPTS]
- Resource Description Framework (RDF): Concepts and Abstract Syntax, Graham Klyne, Jeremy J. Carroll, 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/
- [RDFS]
- RDF Vocabulary Description Language 1.0: RDF Schema, Dan Brickley, R.V. Guha (eds), 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/rdf-schema/
- [RDFa-PRIMER]
- RDFa Primer, Ben Adida, Ivan Herman, Manu Sporny, 07 June 2012. W3C Note. URL: http://www.w3.org/TR/2012/NOTE-rdfa-primer-20120607/
- [RFC2616]
- Hypertext Transfer Protocol -- HTTP/1.1, R. Fielding; et al. June 1999. Internet RFC 2616. URL: http://www.w3.org/Protocols/rfc2616/rfc2616.html.
- [RFC3986]
- Uniform Resource Identifier (URI): Generic Syntax, Berners-Lee, et al. January 2005. Internet RFC 3986. URL: http://tools.ietf.org/html/rfc3986.
- [RFC4627]
- The application/json Media Type for JavaScript Object Notation (JSON), D. Crockford, July 2006. Network Working Group. URL: http://www.ietf.org/rfc/rfc4627.txt
- [SKOS-REFERENCE]
- SKOS: Simple Knowledge Organization System Reference, Sean Bechhofer, Alistair Miles (eds), 18 August 2009, W3C Recommendation. URL: http://www.w3.org/TR/2009/REC-skos-reference-20090818/
- [SPARQL]
- SPARQL Query Language for RDF, Eric Prud'hommeaux, Andy Seaborne, 15 January 2008. W3C Recommendation. URL: http://www.w3.org/TR/rdf-sparql-query/
- [TURTLE-TR]
- Turtle: Terse RDF Triple Language, Eric Prud'hommeaux, Gavin Carothers, 19 February 2013. W3C Candidate Recommendation. URL: http://www.w3.org/TR/2013/CR-turtle-20130219/
- [XHTML1]
- XHTML 1.0 The Extensible HyperText Markup Language (Second Edition), Steven Pemberton, Daniel Auster, et al., 26 January 2000. W3C Recommendation. URL: http://www.w3.org/TR/xhtml1/
- [XML]
- Extensible Markup Language (XML) 1.0 (Fifth Edition), Tim Bray, Jean Paoli, C.M. Sperberg-McQueen, Eve Maler, François Yergeau, 26 November 2008. W3C Recommendation. URL: http://www.w3.org/TR/REC-xml/
- [XMLS-SCHEMA0]
- XML Schema Part 0: Primer Second Edition, David C. Fallside, Priscilla Walmsley (eds), 28 October 2004, W3C Recommendation. URL: http://www.w3.org/TR/xmlschema-0/