To: miked@cerf.net
In-Reply-To: miked@cerf.net's message of Sat, 6 Aug 1994 18:32:32 -0700 <94Aug6.183241pdt.2760@golden.parc.xerox.com>
Subject: Re: URL Comments
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <94Aug6.201248pdt.2760@golden.parc.xerox.com>
Date: Sat, 6 Aug 1994 20:12:41 PDT
> 1. (2.2) This section seems to categorize characters into legal,
> illegal, and unsafe.
No, it categorizes characters into safe, reserved, unsafe, and
illegal, although there is no real distinction between 'unsafe' and
'illegal'.
> Why can't characters be either legal or not ?
Because the situation is more complicated. Some legal characters have
reserved meaning in some situations, and are only legal when used for
that meaning.
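To make the categories concrete, here is a minimal sketch of a
classifier and escaper. It assumes the character sets as they appear in
the BNF of RFC 1738 (the draft under discussion may differ), it folds
'illegal' into 'unsafe' since there is no real distinction, and it
assumes UTF-8 when escaping non-ASCII text. The names classify and
escape are hypothetical, not part of any specification.

```python
import string

# Assumption: these sets follow the BNF published in RFC 1738; the draft
# under discussion may have drawn the lines slightly differently.
SAFE = set(string.ascii_letters + string.digits + "$-_.+" + "!*'(),")
RESERVED = set(";/?:@&=")


def classify(ch):
    """Return 'safe', 'reserved', or 'unsafe' for a single character."""
    if ch in SAFE:
        return "safe"
    if ch in RESERVED:
        return "reserved"
    # Everything else (space, <, >, ", #, %, {, }, non-ASCII, ...) must
    # be escaped; 'illegal' characters are treated the same way here.
    return "unsafe"


def escape(text, keep_reserved=False):
    """Percent-escape text for use inside a URL.

    With keep_reserved=True, reserved characters are left alone, which
    is only correct when they are being used for their reserved meaning.
    """
    out = []
    for ch in text:
        kind = classify(ch)
        if kind == "safe" or (kind == "reserved" and keep_reserved):
            out.append(ch)
        else:
            # Assumption: UTF-8 for anything outside ASCII.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)
```

The keep_reserved flag is the point of the reply above: a "/" is legal
as a delimiter but has to be escaped when it is meant as data.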
> The prudent implementer would surely escape any characters left in
> doubt; hence might as well spec it that way.
We have discovered that a specification is significantly different
from 'advice to prudent implementors'. Prudent implementors may well
accept syntax that is in fact illegal, and might well avoid emitting
syntax that is legal but possibly confusing to buggy implementations
or readers.
> 2. (3.2.2) The use of / in directory names is interesting (to a UNIX
> and DOS person). Assuming there are systems that allow such a syntax,
> is the problem that (as in your example) it could result in a double
> slash, or is it the ambiguity of using it as both a directory name
> delimiter and part of a name ?
I'm not sure what you mean by 'assuming there are systems that allow
such a syntax'. The specification is careful to map the FTP URLs into
a series of commands that are in the FTP protocol, independent of what
is on the other side of your FTP server. I'm not sure what you think
might be ambiguous. Perhaps you could elaborate.
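As an illustration of the mapping, here is a rough sketch that turns an
FTP URL into the command sequence a client would issue. It follows the
interpretation given in RFC 1738 section 3.2.2 (each path segment is
unescaped and sent as its own CWD, the last segment is retrieved); the
function name ftp_commands is hypothetical, and login, ";type=d"
listings, and error handling are left out.

```python
from urllib.parse import unquote, urlsplit


def ftp_commands(url):
    """Return the FTP commands implied by the url-path of an ftp: URL."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL: %r" % url)

    # Drop only the single "/" that separates the host from the url-path.
    path = parts.path[1:] if parts.path.startswith("/") else parts.path
    path, _, typecode = path.partition(";type=")

    # Split on the literal "/" first, then unescape each segment, so an
    # escaped "%2F" stays inside its segment instead of delimiting one.
    segments = [unquote(seg) for seg in path.split("/")]
    *directories, name = segments

    commands = ["CWD " + d for d in directories]
    if typecode:
        commands.append("TYPE " + typecode.upper())
    commands.append("RETR " + name)
    return commands
```

Under these assumptions, ftp://myname@host.dom/%2Fetc/motd gives
"CWD /etc" then "RETR motd", while ftp://myname@host.dom//etc/motd
gives "CWD " with a null argument, then "CWD etc", then "RETR motd".
Neither depends on what file system sits behind the server.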
> I had always assumed that everything after the hostname and subsequent
> / was machine specific. That is, the path would make sense to whatever
> machine was receiving it and there was not a forced use of / as a
> directory delimiter.
Your assumption is common, but not universally held, and, in fact, not
at all universally implemented.
> Overall, FTP is still fuzzy. Perhaps the (future ?) specification of
> the algorithm for GET for FTP would clarify it.
If you can give an example of either a URL whose interpretation is
unclear, or an FTP-accessible file whose URL isn't well defined by this
specification, please give it.
> 3. (3.4.3) typo in the first sentence - "are have".
Thank you. It's been hard to get anyone to review the Gopher+ section
carefully.
> 4. (3.4.x) Interesting that gopher+ attribute encoding is specified.
> In my model of the protocol world, this is equivalent to "HEAD", and
> thus another "virtual method", as opposed to a URL encoding. However
> I'm easy, but following this URL model, then I would propose encoding
> equivalent things for FTP file/directory attributes, and the 822
> attributes of HTTP HEAD, etc. The point is that the model of
> URL-encoding some protocol's attributes and not others seems
> inconsistent. I personally would prefer the "virtual method" of HEAD
> to accomplish this task instead of protocol-specific URL attribute
> encoding.
While there is some appearance of inconsistency, the type attributes
in the Gopher URLs are necessary to determine the type of the Gopher
item and to complete its access, while this is generally not
necessary for FTP URLs, where the type and other information can be
inferred or obtained by other means. In any case, this is not a
consistency that is particularly important to pursue.
> 5. (3.6, 3.7) Is the form
> ....newsgroup/n1[-n2]
> now illegal ? (where n1 and n2 are article number ranges as used in libwww).
There are many forms used in libwww that are not part of this
specification. The problem with news:group/n1[-n2] is that while group
names and message IDs are not tied to a particular news server,
message numbers are. Thus, while a web browser might know that
"news:group/n1[-n2]" might refer to a particular news server, you
can't reliably put such a URL in your hotlist.
> 6. (3.6, 3.7) Apparently omitted is the form:
> nntp://[host[:port]]/message-id
> There are NNTP efficiencies that would make this very useful. Is there a
> reason it was OK in news but not here ?
I'm not sure, but I believe the point was that 'news:' URLs are
adequate for the case when you have a message-ID.
> 7. (3.6, 3.7) This whole news thing seems messy. I'm sure I've
> missed out on much useful discussion on how news and nntp URL's
> evolved, but could someone provide in a nutshell why the following
> won't work ?
> news://[host[:port]]/*
> news://[host[:port]]/newsgroup[/n1-n2]
> news://[host[:port]]/message-id
> Use whatever protocol is accepted at the local site....
Surely you can't supply a host and port and then 'use whatever
protocol is supported'.
I could imagine an implementation that allowed these, but, as it
stands, 'news' is used for URLs that are site independent, and 'nntp'
for those that specify a particular host and port.
It would be reasonable for a 'news:newsgroup' URL to resolve to a URL
that specified nntp://host/group/n1[-n2], though.
I don't know how to answer your question more fully.
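For what it's worth, the resolution step mentioned above could be
sketched as follows. This is only an illustration: the function name
resolve_news_group and its parameters are hypothetical, the host is
whatever news server the local site has configured, and the article
numbers are assumed to have been obtained from that same server.

```python
def resolve_news_group(news_url, host, first, last=None, port=119):
    """Turn a site-independent news:group URL into a host-specific one.

    The result is only meaningful for the given host, because article
    numbers are assigned per server.
    """
    scheme, _, group = news_url.partition(":")
    if scheme.lower() != "news" or not group:
        raise ValueError("not a news: URL: %r" % news_url)
    if "@" in group or group == "*":
        # A message-ID or "all groups" has no article-number form.
        raise ValueError("not a newsgroup name: %r" % group)

    # 119 is the standard NNTP port, so it can be left implicit.
    authority = host if port == 119 else "%s:%d" % (host, port)
    articles = str(first) if last is None else "%d-%d" % (first, last)
    return "nntp://%s/%s/%s" % (authority, group, articles)
```

The resolved URL is fine for immediate use by a browser, but for the
reason given under point 5 it is the 'news:' form, not the resolved
one, that belongs in a hotlist.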
> 8. (3.8) What is the purpose/function of the telnet trailing slash ?
It's widely accepted, but has no particular purpose. There is some
possibility that extensions to telnet will allow additional syntax
after the slash.
> 9. (3.11) Is there a reference implementation of prospero ? Just
> curious why this protocol is specified and (many) other ones are not.
I think which protocols are specified is historical. prospero has been
in the URL document for a long time. I'm not certain whether there is
a reference implementation of prospero.
> ... an algorithm for accessing of resources within that scheme ...
> a. Do such algorithms exist for the currently-defined URLs ? If so,
> where (other than libwww source) ?
I think this is a legitimate question, and I don't want to give a flip
answer. I think the answer is:
yes, in this specification, for ftp, telnet, gopher, mailto, file
not really for http, news, nntp, wais, prospero
> b. Any specification of an algorithm implies a set of functions or
> what could be called "virtual methods" that operate on the URL.
> Are such functions defined ? If not, then I would propose GET, POST,
> and HEAD as a start.
I think this is an interesting request, and quite possible to do,
although I don't think it's required in this RFC, which primarily
attempts to describe the syntax of URLs.
> 12. (4) Any interest in including:
> cso://[host[:port]][/arg=value[;arg=value]]
I think this is reasonable for inclusion in a separate document which
you use to register the 'cso' scheme.