Re: Draft URL document, for last call to be proposed standard RFC

John A. Kunze (jak@violet.berkeley.edu)
Wed, 10 Aug 1994 19:06:12 -0700

Date: Wed, 10 Aug 1994 19:06:12 -0700
From: jak@violet.berkeley.edu (John A. Kunze)
Message-Id: <199408110206.TAA28479@violet.berkeley.edu>
To: masinter@parc.xerox.com
Subject: Re: Draft URL document, for last call to be proposed standard RFC

> From: Larry Masinter <masinter@parc.xerox.com>
> Date: Wed, 10 Aug 1994 11:17:45 PDT
>
> Re "on the Internet" being an 'untrue' qualifier:
> I think that "on the Internet" isn't a phrase that has a precise
> definition here.

I think that the standard needs to be precise. Even if this were just
user documentation, I'd be uncomfortable with this much imprecision.

> Even in the counterexample you
> cite (the file: scheme), the location of the file is given as an
> Internet host name (FQDN). The 'email' address is an Internet email
> address, and not an X.400 email address.

While I restricted my counter-examples to currently registered URL schemes,
at one time we were considering locators for resources that were not on any
network (e.g., non-electronic things like bound books). The present wording
suggests that now we don't envision extending URLs into the non-networked
world at all.

> re encoding hyphens:
> I think it would be a terrible mistake to require hyphens to be
> encoded merely because typographers might mis-typeset them.

I wouldn't use the word "merely" to describe the phenomenon of publishers
doing what they've done for centuries. And whether we call it typesetting
or mis-typesetting, I suspect they will do whatever they please and leave
the technologists to clean up the mess.

Unless I'm really mistaken, training publishers to do what we want
is a lost cause. Right or wrong, it then becomes our responsibility
to avoid inflicting the consequences on users.

> I'm willing to put something in the appendix about hyphens.

That would be very helpful too.
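
To make the hyphen concern concrete, here is a minimal sketch in Python. It
is an illustration only, not anything from the draft. It assumes that every
hyphen belonging to the URL is written as %2D, so that any literal hyphen
found in printed text can be treated as one the typesetter added at a line
break. The function names and the sample path are hypothetical.

```python
from urllib.parse import unquote


def encode_hyphens(path: str) -> str:
    """Write every literal hyphen in its percent-encoded form, %2D."""
    return path.replace("-", "%2D")


def recover_from_print(printed: str) -> str:
    """Rebuild a path from typeset text.

    Assumption: the path was published with encode_hyphens(), so any
    literal hyphen or whitespace seen here was introduced in typesetting.
    """
    joined = "".join(printed.split())   # drop line breaks and spaces
    joined = joined.replace("-", "")    # drop typesetter-added hyphens
    return unquote(joined)              # turn %2D back into "-"


original = "/pub/url-draft-03.txt"      # hypothetical path
published = encode_hyphens(original)

# Simulate a typesetter hyphenating the path across two lines.
typeset = published[:15] + "-\n" + published[15:]
assert recover_from_print(typeset) == original
```

Without the encoding, there is no way to tell from the printed page which
hyphens belong to the URL and which were added at a line break.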

> You know,
> the reason why it doesn't identify "whitespace" as tabs, newlines,
> carriage returns, etc. is that the white space of a piece of paper
> isn't necessarily any of those.

As far as whitespace is concerned, anticipating what users may enter in
transcribing from paper is certainly worthwhile, but I was thinking more
about cutting and pasting from a display that breaks a URL across line
or page boundaries. The whitespace characters introduced by this method
(a common one, I think) are very real.
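
As a rough sketch of the cut-and-paste case, the Python below removes the
whitespace that a display introduces when it breaks a URL across lines. It
assumes that any whitespace meant to be part of a URL is encoded, so that
every literal space, tab, newline, or carriage return in the pasted text is
an artifact. The function name and the example URL are hypothetical.

```python
def strip_pasted_whitespace(pasted: str) -> str:
    """Remove all whitespace from a URL pasted from a display.

    str.split() with no arguments splits on any run of whitespace
    (spaces, tabs, newlines, carriage returns, form feeds), so joining
    the pieces drops every such character.
    """
    return "".join(pasted.split())


# A URL broken across two display lines, with indentation on the second.
pasted = "http://www.example.com/some/long/\r\n    path/file.html"
assert strip_pasted_whitespace(pasted) == (
    "http://www.example.com/some/long/path/file.html"
)
```

This only works if the standard says which whitespace characters a reader
of URLs should ignore, which is why naming them matters.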

Otherwise the document is looking good. Thanks for all the work.

-John