URN functional spec

Karen R. Sollins (sollins@lcs.mit.edu)
Tue, 8 Feb 94 19:29:40 -0500

Date: Tue, 8 Feb 94 19:29:40 -0500
Message-Id: <9402090029.AA10197@zippy.lcs.mit.edu>
From: Karen R. Sollins <sollins@lcs.mit.edu>
To: uri@bunyip.com
Subject: URN functional spec

Attached is the revised URN functional spec, taking into consideration
the comments we received back last time around. Please send comments
and any further discussion soon, so that we can get this moving out of
our working group.
Thanks,
Karen
PS: If you want other versions of this (with all the fancy fonts and
formatting) they are available by anonymous ftp in
ana.lcs.mit.edu:pub/uri/urn-func-spec.xxx

=================

Specification of Uniform Resource Names

Karen R. Sollins and Larry Masinter
February 8, 1994

DRAFT

Presented here are the requirements and functional specification for
_uniform resource names_ (URNs) within the overall architecture of
_Uniform Resource Identification_. In order to build applications
in the most general case, the user must be able to discover and
identify the information, objects, or what we will call in this
architecture resources, on which the application is to operate. As
the network and its interconnectivity grow, the ability to make use of
remote, perhaps independently managed, resources will become more and
more important. This activity of discovering and utilizing resources
can be broken down into those activities where one of the primary
constraints is human utility and facility and those in which human
involvement is small or nonexistent. Human naming must have such
characteristics as being both mnemonic and short. Humans, in contrast
with computers, are good at heuristic disambiguation and at coping
with wide variability in structure. In order for computer- and network-based
systems to support global naming and access to resources that have
perhaps an indeterminate lifetime, the flexibility and attendant
unreliability of human-friendly names should be translated into a
naming infrastructure more appropriate for the underlying support
system. It is this underlying support system that the Internet
Information Infrastructure Architecture (IIIA) is addressing.

Within the IIIA, several sorts of information about resources are
specified and divided among different sorts of structures, along
functional lines. In order to access information, one must be able to
discover or identify the particular information desired, and determine
both how and where it might be used or accessed. The partitioning of
the functionality in this architecture is into _uniform resource names_
(URN), _uniform resource characteristics_ (URC), and _uniform resource
locators_ (URL). A URN identifies a resource or unit of information.
It may identify, for example, intellectual content, a particular
presentation of intellectual content, or whatever a name assignment
authority determines is a uniquely namable entity. A URL identifies
the location of, or a container for, an instance of a resource
identified by a URN. The resource identified by a URN may reside in one
or more locations at any given time, and may move. Of course, not all
resources will move during their lifetimes, and not all resources,
although identifiable and identified by a URN, will be instantiated at
any given time. As such, a URL identifies a place where a resource
may reside, or a container, as distinct from the resource itself
identified by the URN. A URC is a set of meta-level information about a
resource. Examples of such meta-information are: owner, encoding,
access restrictions (perhaps for particular instances), and location.
With this in mind, we can make the following statement:

The purpose or function of a URN is to provide a globally unique,
persistent identifier used both for recognition and, often, for
access to the resource or to its characteristics.

There are two kinds of requirements on URNs: requirements on the
functional capabilities of URNs, and requirements on the encoding of
URNs. Specifically, this leads to the following list of requirements
for the functional capabilities of URNs:

* Global scope: A URN is a name with global scope which does not
imply a location. It has the same meaning everywhere.

* Global uniqueness: The same name will never be assigned to two
different resources, no matter how separated the objects are.

* Persistence: It is intended that the lifetime of a URN be
permanent. That is, the URN will be globally unique forever, and
may well be used as a reference to a resource well beyond the
lifetime of the resource it identifies or of any naming authority
involved in the assignment of its name.

* Sameness: There exists a mechanism for asking whether two resources
are the same, based on their URNs without going to the naming
authority. This is a simple equality test on the URNs, which
requires a canonicalization of the strings.

* Scalability: URNs can be assigned to any resource that might
conceivably be available on the network, for hundreds of years.

* Grandfathering: The scheme must permit the grandfathering of
existing systems. For example, ISBN numbers, ISO public
identifiers, UPC product codes and the like are naming schemes
which should be allowed to be embedded within the URN system.

* Extensibility: Any scheme for URNs must permit future extensions to
the scheme.

* Caching: Any scheme for which caching is appropriate should not
impede caching of resources named by it.

* Independence: It is solely the responsibility of a name issuing
authority to determine the conditions under which it will issue a
name.

* Resolution: A URN will not impede resolution (translation into a
URL, q.v.). To be more specific, for URNs that have corresponding
URLs, there must be some feasible mechanism to translate a URN to a
URL.
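
The Sameness requirement above can be illustrated with a minimal Python
sketch. This draft does not define a URN syntax or a canonical form, so
the "authority:local-name" strings and the canonicalization rules used
here (trim surrounding whitespace, fold case) are assumptions chosen
only for illustration.

```python
def canonicalize(urn: str) -> str:
    """Return a hypothetical canonical form of a URN string.

    Assumption: canonicalization trims surrounding whitespace and folds
    case. The draft only says that some canonicalization is required.
    """
    return urn.strip().lower()


def same_resource(urn_a: str, urn_b: str) -> bool:
    """Equality test on canonicalized URNs; no naming authority is consulted."""
    return canonicalize(urn_a) == canonicalize(urn_b)
```

With these assumed rules, " Example-Authority:Item-1" and
"example-authority:item-1" compare as the same name, while
"example-authority:item-2" does not. The comparison is purely local,
which is the point of the requirement.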

In addition, the following are requirements for URNs as they are
encoded:

* Human transcribability: For a URN to be human transcribable it must
not be too long, but may be in a limited alphabet. It is intended
that a URN can be typed on a keyboard, although explicit semantics
accessible to the human are discouraged.

* Transport friendliness: A URN can be transported unmodified in
the common Internet protocols, such as TCP, SMTP, FTP, Telnet, etc.,
as well as on printed paper.

* Machine consumption: A URN can be parsed by a computer.
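
As a rough illustration of the encoding requirements above, the
following Python sketch checks a candidate URN against a limited
alphabet and a length bound. Both ASSUMED_ALPHABET and
ASSUMED_MAX_LENGTH are placeholders invented for this example; the
draft fixes neither.

```python
import string

# Assumptions for illustration only: the draft fixes neither an alphabet
# nor a maximum length.
ASSUMED_ALPHABET = frozenset(string.ascii_letters + string.digits + "-.:/")
ASSUMED_MAX_LENGTH = 80


def is_transcribable(urn: str) -> bool:
    """Check a URN against the assumed alphabet and length bound."""
    if not 0 < len(urn) <= ASSUMED_MAX_LENGTH:
        return False
    return all(ch in ASSUMED_ALPHABET for ch in urn)
```

A check of this kind addresses only transcribability and transport;
it says nothing about whether the name has actually been assigned.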

The URN specification is a description of a scheme that meets these
requirements. Some design decisions are a direct result of these
requirements:

* To satisfy the requirements of uniqueness and
scalability, name assignment is delegated to naming authorities,
who may then assign names directly or delegate that authority to
sub-authorities. Uniqueness is guaranteed by requiring each naming
authority to guarantee the uniqueness of the names it assigns. The
names of the naming authorities themselves are persistent and globally
unique, by means of a central registry.

* Naming authorities are encouraged, but not required, to support
scalable naming. Scalability implies that a scheme for devising names
may be scalable both at its terminators and within its structure.

* A naming authority is discouraged from designing a naming scheme
which is not scalable, but is permitted to do so.

* Naming authorities should guarantee that somewhere, somehow, there
is a mapping (or the potential for a mapping) to one or more URLs.
The naming authority itself need not provide the mapping from URN
to URL.

* For URNs to be transcribable and transported in mail, it is necessary
to limit the character set usable in URNs, although there is not yet
consensus on what the limit might be.
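
The delegation and resolution decisions above can be sketched in Python
as follows. Everything here is hypothetical: the class names, the "."
and ":" separators, and the in-memory sets stand in for whatever a real
registry, authority, or resolution service would use. The sketch only
shows how per-authority uniqueness plus a central registry of authority
names yields global uniqueness, and that the URN-to-URL mapping can
live outside the naming authority.

```python
def _check_label(label: str) -> None:
    """Assumed rule: authority labels are non-empty and avoid the separators."""
    if not label or "." in label or ":" in label:
        raise ValueError(f"invalid authority label: {label!r}")


class NamingAuthority:
    """Hypothetical authority that never reuses anything it has handed out."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._used = set()  # local names and sub-authority labels, kept forever

    def _claim(self, token: str) -> None:
        if token in self._used:
            raise ValueError(f"{token!r} was already issued by {self.name}")
        self._used.add(token)

    def assign(self, local_name: str) -> str:
        """Issue a new URN; a previously issued local name is refused."""
        self._claim(local_name)
        return f"{self.name}:{local_name}"

    def delegate(self, label: str) -> "NamingAuthority":
        """Create a sub-authority with its own, disjoint name space."""
        _check_label(label)
        self._claim(label)
        return NamingAuthority(f"{self.name}.{label}")


class CentralRegistry:
    """Hypothetical registry keeping top-level authority names unique."""

    def __init__(self) -> None:
        self._names = set()

    def register(self, name: str) -> NamingAuthority:
        _check_label(name)
        if name in self._names:
            raise ValueError(f"authority {name!r} is already registered")
        self._names.add(name)
        return NamingAuthority(name)


class Resolver:
    """Hypothetical URN-to-URL mapping maintained apart from any authority."""

    def __init__(self) -> None:
        self._locations = {}

    def add_location(self, urn: str, url: str) -> None:
        self._locations.setdefault(urn, []).append(url)

    def resolve(self, urn: str) -> list:
        # An empty list means the resource is not currently instantiated.
        return list(self._locations.get(urn, []))
```

Because issued names are never removed from an authority's record, a
name stays unique even after the resource it identifies has gone away,
which is the persistence property the functional requirements ask for.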

In assigning names, a name assignment authority must abide by the
preceding constraints, as well as defining its own criteria for
determining the necessity or indication of a new name assignment.