13 April 2005

HTTP URIs versus URNs

Dave Orchard has an interesting discussion of the advantages of using straightforward HTTP URIs (actual web references) over the more idealistic URNs (abstract names for a resource that may map to an HTTP URI): Dave Orchard's Blog: Why HTTP uris are better than urns and even id: uris for identifiers


Why HTTP uris are better than urns and even id: uris for identifiers

When creating a URI-based identifier, perhaps the most important decision is which URI scheme to use. Two of the most common are the http: and urn: schemes. A common reason given for using URNs for identifiers, such as namespace names, is that an http: identifier appears to humans as a location and hence dereferenceable. Another common reason is to come up with an identifier that is location-independent or that is "movable" from one location to another.

URIs have context
The first argument, that http: URIs are "locations", is based upon an incomplete understanding of the use of URIs. Any data type exists in a context, and URIs are no exception. The context defines the use of a URI, and includes both social and technical context. A URI on the side of a van conveys the social meaning that it can be typed into a browser and some good stuff will show up in the window. Other contexts for the use of URIs include namespace names, references to documents, and identifiers for *things*. It is never the case that a URI is simply "found" without a context. The key point is that every use of a URI for an identifier has a context.

The use of URIs in namespace names is enlightening. Imagine two scenarios, one with a urn: and another with an http: URI. The namespace specification defines a context, which roughly speaking says that a namespace name SHOULD NOT be considered dereferenceable. Any software component that is written assuming that a namespace name MUST be dereferenceable is violating the namespace specification, i.e. the context. It may be that a namespace owner has guaranteed that they will provide a document at the namespace name, but such a guarantee covers only a subset of all namespace names. Clearly generic XML software should not be written to assume dereferenceability of namespace names.
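
The point about generic XML software can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the two namespace names (one http:, one urn:) are made-up example.org values, and namespace_of is a hypothetical helper, not part of any library. The standard xml.etree.ElementTree parser treats the namespace name as an opaque string and never tries to fetch it, so both scenarios behave identically.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names, chosen only for illustration.
HTTP_NS = "http://example.org/ns/orders"
URN_NS = "urn:example:orders"


def make_doc(ns):
    """Build a tiny XML document whose default namespace is `ns`."""
    return f'<order xmlns="{ns}"><item>widget</item></order>'


def namespace_of(element):
    """Return the namespace name of an element, or None if it has none.

    ElementTree stores qualified names as "{namespace}local", so the
    namespace name is just the text between the braces. No network
    access is involved at any point.
    """
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


http_root = ET.fromstring(make_doc(HTTP_NS))
urn_root = ET.fromstring(make_doc(URN_NS))

# Both namespace names are compared as plain strings; neither is dereferenced.
print(namespace_of(http_root) == HTTP_NS)
print(namespace_of(urn_root) == URN_NS)

# Element lookup works the same way regardless of the URI scheme.
http_item = http_root.find(f"{{{HTTP_NS}}}item")
urn_item = urn_root.find(f"{{{URN_NS}}}item")
print(http_item.text, urn_item.text)
```

In this sketch the http: namespace name is exactly as "location-free" as the urn: one: the code only ever does string comparison on it, which is the context the namespace specification defines.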

A human reading an XML document that uses an unfamiliar namespace name will naturally want to understand more about the namespace. This is why the TAG recommends providing a document at a namespace name that provides both human- and machine-readable information.


Posted by mofoghlu at April 13, 2005 10:07 AM