URI Schemes in XML Namespace Names

Author: Tim Bray

Abstract

This document discusses issues regarding the selection of URI Schemes in URIs used as XML Namespace Names, with particular attention to the urn: and http: schemes.

1. Introduction

The Namespaces in XML Recommendation prescribes that:

Subsequent discussion has established a consensus that absolute, not relative, URIs are appropriate for use in namespace names.

In practice, two URI schemes are overwhelmingly selected for use in namespace names: http: and urn:. For convenience, this document refers to these two choices respectively as "URLs" and "URNs."
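
As a rough illustration of the distinction drawn above, the following Python sketch sorts candidate namespace names by URI scheme. The function name classify_namespace_name and its return labels are invented here for illustration and do not come from any specification; the sketch relies only on urllib.parse.urlsplit from the standard library.

```python
from urllib.parse import urlsplit


def classify_namespace_name(name: str) -> str:
    """Label a candidate namespace name by its URI scheme.

    The labels follow this document's shorthand: "URL" for http:
    and "URN" for urn:. The "relative" and "other" labels are
    assumptions added for this sketch.
    """
    scheme = urlsplit(name).scheme  # urlsplit lowercases the scheme
    if not scheme:
        return "relative"  # no scheme, so not an absolute URI
    if scheme == "http":
        return "URL"
    if scheme == "urn":
        return "URN"
    return "other"


url_kind = classify_namespace_name("http://example.com/ns/a38")
urn_kind = classify_namespace_name("urn:example:ns:a38")
relative_kind = classify_namespace_name("ns/a38")
```

A check of this kind says nothing about whether a name is a good choice; it only separates the two families discussed in the rest of this document and flags relative references.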

2. Stringiness

In implemented software, the most common use of namespace names, and the only one actually blessed by Namespaces in XML, is simply as character strings that help to identify and disambiguate identifiers.

Consider two URIs which could be selected for use as a namespace name: http://example.com/ns/a38 and urn:example:ns:a38. In terms of the intended and most common use of namespace names, there is exactly zero difference between the two; neither is better or worse than the other.
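
The point can be seen in a small Python sketch using the standard library's xml.etree.ElementTree parser, which exposes a namespaced element name as the namespace name and local name joined into a single string. The helper name expanded_root_name, the element name item, and the prefixes are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

URL_NS = "http://example.com/ns/a38"
URN_NS = "urn:example:ns:a38"


def expanded_root_name(namespace_name: str, prefix: str = "a") -> str:
    """Parse a one-element document and return the root's expanded name.

    ElementTree reports the name as "{namespace-name}local-name";
    the parser never attempts to dereference the namespace name.
    """
    doc = f'<{prefix}:item xmlns:{prefix}="{namespace_name}"/>'
    return ET.fromstring(doc).tag


url_name = expanded_root_name(URL_NS)
urn_name = expanded_root_name(URN_NS)

# The prefix is irrelevant; only the namespace name string matters.
same_under_other_prefix = expanded_root_name(URN_NS, prefix="b") == urn_name
```

In both cases the parser treats the namespace name as an opaque string, so the URL and the URN serve equally well to tell one item element apart from another.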

3. Stability and Persistence

It is an explicit design goal of URNs that they offer persistence; this is supported by the URN namespace registration scheme. This is said to make URNs suitable for use as identifiers, as distinguished from locators. This would seem to be highly consistent with the goal that namespace names exhibit persistence.

On the other hand, treated purely as strings, all URIs have equal persistence: assuming that a normative specification says some URI identifies a namespace, and that deployed software implements that identification function, it makes no difference whether the URI is a URN or a URL.

Persistence and URLs

It is clear that the resource representations accessed by dereferencing a URL are subject to change both de jure and de facto. If the correct functioning of namespace names depends on these representations, then persistence is impaired. However, the specified use of namespace names does not involve any use of resource representations.

Given the DNS-rootedness of URLs, it is always possible for the authority over a namespace name to change. It could be seen as impairing the persistence of namespace names which are URLs rooted at http://www.w3.org if, at some point, the domain w3.org were to become owned by the Waxen Wolf & Weasel Corporation.

On the other hand, the inclusion of the year of publication in namespace names (as is now practiced by the W3C) decreases the likelihood that at some future point Waxen Wolf & Weasel will wish to re-use that URL.

And as previously noted, the correct function of a namespace name as an identifying character string would not in the slightest be impaired in the Waxen Wolf & Weasel scenario.

4. Use of Representations

While the content of resource representations does not and cannot serve any role in the primary identification function of namespace names, it may in practical terms increase the usefulness of namespaces.

The most obvious use of such a representation would be for documentation purposes; any widely-used namespace will have human-readable documentation, likely including its normative specification as well as other reference and tutorial material.

A somewhat less obvious use of resource representations would be to aid in automated discovery of machine-usable resources that are related to some namespace. Examples would include schemas, stylesheets, and executable code. RDDL is a proposal for combining the human-documentation and related-resource-lookup functions.

It should be noted that any such usage of resource representations is ancillary to the defined function of namespace names, and any system which depends on such retrieval for its function is operating outside the scope of specified interoperable behavior.

In scenarios where the discovery of human- or machine-readable material related to a namespace is apt to be important or even useful, URLs offer advantages over URNs in that they come equipped with a well-defined network protocol to achieve this, which is widely implemented and available on essentially every network-connected computer in the world.

While it is possible in principle to use a URN to retrieve network resources, the software and tools for doing this are not widely distributed, and this approach is not really competitive with the ubiquity and straightforwardness of retrieval by URL.

5. Registration of URNs

A point which may be significant in some contexts is that URNs are not entirely free. A URN must either be in an existing URN namespace, or a new URN namespace must be registered for the purpose. The registration requirements for new namespaces are a matter of record, and there is a body of experience as to the requirements and costs.

Existing URN namespaces may have their own registration procedures of varying costs and complexity.
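
To make concrete which part of a URN is subject to registration, the following Python sketch splits a URN into its namespace identifier and the namespace-specific remainder, assuming the basic "urn:" NID ":" NSS layout of URN syntax. The function name urn_namespace_id is invented for this illustration, and the sketch does not validate the characters allowed in either part.

```python
def urn_namespace_id(urn: str) -> str:
    """Return the namespace identifier (NID) of a URN, lowercased.

    Assumes the layout urn:<NID>:<NSS>. The leading "urn" and the
    NID are compared case-insensitively. This is a simplified
    sketch, not a full syntax check.
    """
    parts = urn.split(":", 2)
    if (
        len(parts) != 3
        or parts[0].lower() != "urn"
        or not parts[1]
        or not parts[2]
    ):
        raise ValueError(f"not a URN: {urn!r}")
    return parts[1].lower()


nid = urn_namespace_id("urn:example:ns:a38")
```

For urn:example:ns:a38 the identifier is example; it is this identifier that must correspond to an existing URN namespace or be registered as a new one, while the remainder is assigned under that namespace's own procedures.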

Acknowledgements

Acknowledgement is due to the participants in the www-tag mailing list, in particular Paul Prescod, for clarifying comments on this subject.