Date: Mon, 7 Jun 93 16:52:50 +0200
From: Tim Berners-Lee <timbl@www3.cern.ch>
Message-Id: <9306071452.AA05893@www3.cern.ch>
To: e-krol@uiuc.edu (Ed Krol)
Subject: Re: Suggest meaning for URN
Ed:
>Now it seems that 1. yes what we got is a symbolic
>reference and 2. a table lookup based on that symbol is
>conceptually correct but it doesn't scale well.
> ... So what I think is
>happening is that there are two parts to a URN.
...
> when in doubt recurse.
Looks like we have a "lookup" operator (say "/"), where
a/b means "Contact a and ask him what he makes of b".
Like iana/isbn/123456789X. Note two interesting ways
of parsing this:
A: The opaque case:
iana/(isbn/123456789X)
Ask IANA about this isbn/123456789X. The result from
IANA may be a reference to a server, say
128.141.2.3/123456789X, in which case
we have found an ISBN server and we go use it.
But the iana server _could_ have given us a URL for
the document straight off.
Advantage:
Information hiding in second part gives flexibility.
B: The visible case:
(iana/isbn)/123456789X
Ask IANA what ISBN is. The result may be that ISBN is a
server at 128.141.2.3 which can handle 123456789X. The client
then uses its knowledge about (iana/isbn) to resolve the
third part which it can see.
Disadvantage: The client has to know more.
Advantage:
One can cache the intermediate result.
This makes the caching work much better.
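
A minimal sketch of the two parsings, assuming "/" is the lookup
operator and that names are plain strings. The function names
parse_opaque and parse_visible are hypothetical, chosen only for
illustration:

```python
def parse_opaque(name):
    """Case A: a/(b/c). Hand everything after the first "/" to `a` unread."""
    head, _, rest = name.partition("/")
    return head, rest


def parse_visible(name):
    """Case B: (a/b)/c. The client resolves the prefix, then applies
    the last part itself."""
    prefix, _, last = name.rpartition("/")
    return prefix, last


print(parse_opaque("iana/isbn/123456789X"))   # ('iana', 'isbn/123456789X')
print(parse_visible("iana/isbn/123456789X"))  # ('iana/isbn', '123456789X')
```

In case A the client never looks inside the second element; in case B
the prefix is a name in its own right, so its resolution can be kept.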
Taking it further, suppose I am reading
iana/isbn/123456789X/ch4_mangles/fig4
and I get a reference to
iana/isbn/123456789X/apxA_worzles/table6
then this is the sort of link I would expect to be
able to resolve fast. I can in case B, as I
will have cached the internet address for the
ISBN authority, and the server address for the book,
so I can go straight to that server and ask for
the new table.
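
The caching argument for case B can be sketched as follows. This is an
illustration under stated assumptions, not a proposed protocol: the
DIRECTORY table stands in for network lookups, every address in it
other than 128.141.2.3 is made up, and ask, resolve, queries and cache
are hypothetical names.

```python
# Hypothetical directory: (server, name) -> whatever that server returns.
# Only 128.141.2.3 comes from the message; the rest is invented.
DIRECTORY = {
    ("iana", "isbn"): "128.141.2.3",
    ("128.141.2.3", "123456789X"): "book-server",
    ("book-server", "ch4_mangles"): "ch4-server",
    ("book-server", "apxA_worzles"): "apxA-server",
    ("ch4-server", "fig4"): "<figure 4>",
    ("apxA-server", "table6"): "<table 6>",
}

queries = []  # log of simulated network round trips
cache = {}    # resolved prefix -> result of resolving it


def ask(server, name):
    """Simulate contacting `server` and asking what it makes of `name`."""
    queries.append((server, name))
    return DIRECTORY[(server, name)]


def resolve(urn):
    """Resolve left to right, (a/b)/c style, caching every prefix."""
    parts = urn.split("/")
    # Start from the longest prefix that is already cached, if any.
    start, where = 1, parts[0]
    for i in range(len(parts), 1, -1):
        prefix = "/".join(parts[:i])
        if prefix in cache:
            start, where = i, cache[prefix]
            break
    for i in range(start, len(parts)):
        where = ask(where, parts[i])
        cache["/".join(parts[:i + 1])] = where
    return where


resolve("iana/isbn/123456789X/ch4_mangles/fig4")
first = len(queries)
resolve("iana/isbn/123456789X/apxA_worzles/table6")
print(first, len(queries) - first)  # 4 2
```

The second reference costs two lookups instead of four, because the
prefix iana/isbn/123456789X is already cached and the client goes
straight to the book's server.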
>Now what we have is an optimization problem. And here it gets
>stickier.
>There are two obvious ways to help this out and one which I think is
>useless:
>
>1. Caching. I'm not convinced that the number of repeat hits on a
> resource will ever be high enough regardless of size of cache to
> make this viable.
Method B solves this largely I think.
>2. Adding a "possible URL to the URN". So the above becomes a
> URN (to the max) ::= URN(sub S) URN(sub R) URL?
> Where the URL is tried first and if it succeeds then the whole
> lookup process is bypassed. Fast, but the problem is that it
> is possible that over time the URL may point to a reasonable
> resource but not the resource that the URN refers to (file
> contents get changed).
This is begging the question. Sure, we can include
URLs if the URN mechanism can't deliver. But let's
try to get it to deliver first.
>3. I don't think adding additional structure to the URN(sub s) helps
> much. I think the number of servers would be quite manageable in
> a flat space and the added structure will only speed things up by
> a small delta.
Yes, I agree that a very flat structure (URNs started off as
2, now seem to be 3, levels) is fine because we can make
the fan-out at each level large like 10**5. But the caching
argument is an important point in scaling. So I am
suggesting that the structure be VISIBLE, not DEEP.
This visibility conflicts with the ways URNs were going
in Columbus.
What do you think?
Tim