To: uri@bunyip.com
Subject: DRAFT urn requirements section
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <93Nov29.112216pst.2732@golden.parc.xerox.com>
Date: Mon, 29 Nov 1993 11:22:13 PST
This is a draft of the URN requirement section that Karen Sollins and
I are drafting. My apologies for the delay in getting this out; all
of the fault is mine. Karen was quite prompt in writing the first
draft of this, and I've been slow at making what were supposed to be
minor revisions.
If you note typos or have questions about wording, I suggest you send
them just to the two of us and not the whole list. Some of the issues
are unresolved, however, and it would be good to have some open
discussion of them. These are indicated with << these kinds of
brackets >>.
I'll try to capture comments and controversies and fold them back into
the draft.
================================================================
Specification of Uniform Resource Names
Karen R. Sollins and Larry Masinter
Mon Nov 29 10:29:37 PST 1993
DRAFT
Presented here are the requirements and functional specification for
_uniform resource names_ (URNs) within the overall architecture of
_Uniform Resource Identification_. In order to build applications in
the most general case, the user must be able to discover and identify
the information, objects, or what we will call in this architecture
resources, on which the application is to operate. As the network and
its interconnectivity grow, the ability to make use of remote, perhaps
independently managed, resources will become more and more important.
This activity of discovering and utilizing resources can be broken
down into those activities where one of the primary constraints is
human utility and facility and those in which human involvement is
small or nonexistent. Names meant for humans must be both mnemonic
and short. Humans, in contrast with computers, are good at heuristic
disambiguation and tolerate wide variability in structure. In order
for computer and network based systems to support global naming and
access to resources that have a perhaps indeterminate lifetime, the
flexibility and attendant unreliability of human-friendly names
should be translated into a naming infrastructure more appropriate
for the underlying support system. It is this underlying support
system that the Internet Information Infrastructure Architecture
(IIIA) is addressing.
Within the IIIA, several sorts of information about resources are
specified and divided among different sorts of structures, along
functional lines. In order to access information, one must be able to
discover or identify the particular information desired, and to
determine both how and where it might be used or accessed. The
partitioning of the functionality in this architecture is into
_uniform resource names_ (URN), _uniform resource characteristics_
(URC), and _uniform resource locators_ (URL).
The purpose or function of a URN is to provide a globally unique,
persistent identifier used for recognition and, often, for access to
the resource or to its characteristics.
There are two kinds of requirements on URNs: requirements on the
functional capabilities of URNs, and requirements on the encoding of
URNs. Specifically, this leads to the following list of requirements
for URNs' functional capabilities:
* Global scope: A URN is a name with global scope which does not
imply a location. It has the same meaning everywhere.
* Global uniqueness: The same name will never be assigned to two
different resources, no matter how separated the objects are.
* Persistence: It is intended that the lifetime of a URN is
permanent. That is, the URN will be globally unique forever, and
may well be used as a reference to a resource well beyond the
lifetime of the resource it identifies or of any naming authority
involved in the assignment of its name.
* Sameness: There exists a mechanism for asking whether two resources
are the same, based on their URNs without going to the naming
authority.
<<DILEMMA: Does this mean
URN1 = URN2 (string compare, but requires a canonicalization to achieve)
or
f(URN1) = g(URN2) (where f and g are known)?
No consensus was reached on the issue of uniqueness / sameness;
however, the following points were made:
- This issue arises out of the idea of "intellectual content".
- It was agreed that the naming authority makes decisions regarding
the intellectual content of the document.
- If an object is moved out of the domain of the naming authority, do
we assign a new URN for it? If we make new copies in new formats, do
we assign a new URN? We need to have the URN atomically bound into
the resource.
This is still up for discussion. >>
* Resolution: A URN will not impede resolution (translation into a
URL, q.v.). To be more specific, for URNs that have corresponding
URLs, there must be some feasible mechanism to translate a URN to a
URL. This may need further discussion on the list.
* Scalability: URNs can be assigned to any resource that might
conceivably be available on the network, for hundreds of years.
* Grandfathering: The scheme must permit the grandfathering of
existing systems. For example, ISBN numbers, ISO public
identifiers, UPC product codes and the like are naming schemes
which should be allowed to be embedded within the URN system.
* Extensibility: Any scheme for URNs must permit future extensions to
the scheme.
* Caching: No scheme should impede caching of those resources it
names for which caching is appropriate.
<<This is still up for discussion. Under what circumstances is this
true? Can you tell from the URN whether the named resource is
cachable? >>
* Independence: It is solely the responsibility of a name issuing
authority to determine the conditions under which it will issue a
name.
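The two readings of the sameness requirement raised above can be
sketched in a few lines of Python. This is only an illustration: the
canonicalization rule (trim surrounding whitespace, fold case) and the
function names are assumptions made here, since the draft defines
neither a canonical form nor the functions f and g. What the sketch
does show is that both readings can be evaluated locally, without
going to the naming authority.

```python
def canonicalize(urn: str) -> str:
    """Hypothetical canonical form: trim surrounding whitespace, fold case.

    The draft does not define a canonicalization; this rule is assumed
    purely for illustration.
    """
    return urn.strip().casefold()


def same_by_string_compare(urn1: str, urn2: str) -> bool:
    """Reading 1: URN1 = URN2, compared as strings after canonicalization."""
    return canonicalize(urn1) == canonicalize(urn2)


def same_by_mapping(urn1: str, urn2: str, f, g) -> bool:
    """Reading 2: f(URN1) = g(URN2), where f and g are known functions."""
    return f(urn1) == g(urn2)
```

Under the first reading, all of the agreement has to be built into a
single canonical form; under the second, each naming scheme could
supply its own known function.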
In addition, the following are requirements for URNs as they are encoded:
* Human transcribability: For a URN to be human transcribable it must
not be too long, but may be in a limited alphabet. It is intended
that a URN can be typed on a keyboard, although explicit semantics
accessible to the human are discouraged.
* Transport friendliness: A URN can be transported unmodified in the
common Internet protocols, such as TCP, SMTP, FTP, Telnet, etc., as
well as on printed paper.
* Machine consumption: A URN can be parsed by a computer.
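As a rough illustration of how the encoding requirements above might
be checked mechanically, the Python sketch below tests a candidate
name against a length limit and a restricted alphabet. Both the limit
of 64 characters and the allowed character set are placeholders
assumed here; the draft sets neither, and the name "encoding_problems"
is likewise hypothetical.

```python
import string
from typing import List

# Both constants are assumptions for illustration; the draft fixes neither.
MAX_LENGTH = 64
ALLOWED = frozenset(string.ascii_letters + string.digits + "-.:/")


def encoding_problems(urn: str) -> List[str]:
    """Return a list of reasons the candidate fails the assumed limits."""
    problems = []
    if not urn:
        problems.append("empty")
    if len(urn) > MAX_LENGTH:
        problems.append("longer than %d characters" % MAX_LENGTH)
    # Characters outside the restricted alphabet may not survive
    # transcription by hand or transport through mail.
    bad = sorted(set(urn) - ALLOWED)
    if bad:
        problems.append("characters outside the allowed set: "
                        + " ".join(repr(c) for c in bad))
    return problems
```

An empty result means the candidate passes these assumed limits; it
says nothing about whether the name has actually been assigned.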
The URN specification is a description of a scheme that meets these
requirements. Some design decisions are a direct result of these
requirements:
* hierarchical design: To satisfy the requirements of uniqueness and
scalability, name assignment is delegated to naming authorities,
who may then assign names directly or delegate that authority to
sub-authorities. Uniqueness is guaranteed by requiring each naming
authority to guarantee uniqueness. The names of the naming
authorities themselves are persistent and globally unique, by means
of a central registry.
Naming authorities are encouraged, but not required, to support
scalable naming. Scalable here means that a scheme for devising
names can grow both at its terminators and within its structure.
A naming authority is permitted to design a naming scheme that is
not scalable, though this is discouraged.
Naming authorities should guarantee that somewhere, somehow, there
is a mapping (or the potential for a mapping) to one or more URLs.
The naming authority itself need not provide the mapping from URN
to URL.
* length limit: There is some debate over whether URNs should have an
explicit limit of length, in order to meet the requirement of
transcribability.
* character set limitation: For URNs to be transcribable and
transported in mail, it is necessary to limit the character set
usable in URNs, although there is not yet consensus on what the limit
might be.
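The delegation described under hierarchical design can be sketched as
a small Python class. This is a minimal illustration under stated
assumptions, not a proposed syntax: the class name "NamingAuthority",
its methods, and the use of "/" to join the names along the delegation
path are all invented here. The point is only that global uniqueness
follows when each authority refuses to issue the same local name
twice.

```python
class NamingAuthority:
    """Hypothetical naming authority that assigns names or delegates."""

    def __init__(self, path=()):
        self.path = tuple(path)   # names from the registry down to here
        self._delegated = {}      # sub-authority name -> NamingAuthority
        self._assigned = set()    # local names already issued

    def _check_unused(self, name):
        # "/" is the assumed separator, so it may not appear inside a name.
        if not name or "/" in name:
            raise ValueError("invalid name: %r" % (name,))
        if name in self._delegated or name in self._assigned:
            raise ValueError("already in use: %r" % (name,))

    def delegate(self, name):
        """Create a sub-authority under a name unique within this one."""
        self._check_unused(name)
        sub = NamingAuthority(self.path + (name,))
        self._delegated[name] = sub
        return sub

    def assign(self, local_name):
        """Issue a new name; each local name is issued at most once."""
        self._check_unused(local_name)
        self._assigned.add(local_name)
        return "/".join(self.path + (local_name,))


# The central registry is modeled as the root authority.
registry = NamingAuthority()
press = registry.delegate("example-press")
journals = press.delegate("journals")
name = journals.assign("vol1")
```

Here "example-press" and "journals" are made-up authorities. Assigned
and delegated names share one namespace within each authority, so no
two delegation paths can produce the same full name.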