Message-Id: <9403281805.AA06912@ulua.hal.com>
To: timbl@www0.cern.ch
Subject: Re: URL spec changes
In-Reply-To: Your message of "Mon, 28 Mar 1994 16:13:50 +0200."
<9403281413.AA00427@ptpc00.cern.ch>
Date: Mon, 28 Mar 1994 12:05:33 -0600
From: "Daniel W. Connolly" <connolly@hal.com>
In message <9403281413.AA00427@ptpc00.cern.ch>, Tim Berners-Lee writes:
>
>> <<================================================================
>> While the protocol determines the interpretation of the path,
>> generally, the slash "/" denotes a level in a hierarchical structure.
>> ================================================================>>
>
>Ok.
Ack!! Thtptpt!! Barf! Big Lose.
So it is decided: a URL is just a scheme and an opaque string (over a
limited character set) called the path, and the interpretation of the
path is determined by the scheme.
This means, for example, that changing %2F to / or space to %20 in URLs
with unrecognized schemes is not well-defined. Resolving relative HREFs
is dependent on the scheme. Reducing a URL to canonical form may work
differently for different schemes.
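A minimal sketch of why blanket %2F decoding is unsafe, assuming a scheme
that treats "/" as a hierarchy separator and "%2F" as a literal slash
inside a single name. The helper names unescape_all and count_slashes are
made up for this illustration; they are not libwww functions.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not hex. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode every %XX escape in `in` into `out`.
 * `out` must have room for strlen(in) + 1 bytes. Malformed escapes
 * are copied through unchanged. */
static void unescape_all(const char *in, char *out)
{
    while (*in) {
        int hi, lo;
        if (in[0] == '%'
            && (hi = hex_value((unsigned char)in[1])) >= 0
            && (lo = hex_value((unsigned char)in[2])) >= 0) {
            *out++ = (char)(hi * 16 + lo);
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}

/* Hypothetical helper: number of "/" separators in a path. */
static size_t count_slashes(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (*s == '/')
            n++;
    return n;
}
```

For the path "a%2Fb/c", count_slashes reports one separator before
unescape_all and two after it, so the decoding has changed the hierarchy
rather than just the spelling.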
Does the WWW team plan to implement these changes? They're certainly
incompatible with the code I've seen. If one looks at HTParse.c in
libwww, it would appear that one can resolve relative HREFs and extract
any of the following from an arbitrary URL, regardless of scheme:
#define PARSE_ACCESS 16
#define PARSE_HOST 8
#define PARSE_PATH 4
#define PARSE_ANCHOR 2
#define PARSE_PUNCTUATION 1
#define PARSE_ALL 31
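A scheme-blind split of this kind might look roughly like the following.
This is a sketch under the assumption that every URL follows the
scheme://host/path#anchor convention; it is not the HTParse.c code, and
struct url_parts and split_url are hypothetical names. Each part is
reported as a pointer/length pair into the original string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: a NULL pointer means the part is absent. */
struct url_parts {
    const char *access; size_t access_len;   /* scheme, e.g. "http" */
    const char *host;   size_t host_len;
    const char *path;   size_t path_len;
    const char *anchor; size_t anchor_len;   /* text after the first '#' */
};

/* Split `url` without consulting the scheme at all. */
static void split_url(const char *url, struct url_parts *p)
{
    struct url_parts zero = {0};
    const char *end = url + strlen(url);
    const char *hash = strchr(url, '#');
    const char *rest = url;
    const char *colon;

    *p = zero;

    /* Everything after the first '#' is taken to be the anchor. */
    if (hash) {
        p->anchor = hash + 1;
        p->anchor_len = strlen(hash + 1);
        end = hash;
    }

    /* A ':' ends the scheme only if no '/' comes before it. */
    colon = memchr(url, ':', (size_t)(end - url));
    if (colon && !memchr(url, '/', (size_t)(colon - url))) {
        p->access = url;
        p->access_len = (size_t)(colon - url);
        rest = colon + 1;
    }

    /* "//" introduces a host, which runs to the next '/' or the end. */
    if (end - rest >= 2 && rest[0] == '/' && rest[1] == '/') {
        const char *slash;
        rest += 2;
        slash = memchr(rest, '/', (size_t)(end - rest));
        if (!slash)
            slash = end;
        p->host = rest;
        p->host_len = (size_t)(slash - rest);
        rest = slash;
    }

    p->path = rest;
    p->path_len = (size_t)(end - rest);
}
```

The point of the sketch is that nothing in it looks at the scheme name,
which is only valid if the path syntax is common to all schemes.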
Perhaps we are creating a new beast altogether. The HTML HREF attribute
is not exactly a URL; it's either:
* a URL whose scheme adheres to the %2F and scheme://host/path?search
conventions, or
* a relative address, or
* a URL (as above) plus a fragment identifier
The WWW architecture should at least be sure that the above cases are
consistently distinguishable. For example:
HREF="ftp://host/file#x/yz"
Is that (1) illegal, (2) file "yz" in directory "file#x", or (3) fragment
"x/yz" in "file"?
I strongly oppose watering down the URL spec like this.
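For the ftp://host/file#x/yz example, readings (2) and (3) correspond to
two different, equally simple pieces of code. The sketch below is only an
illustration; fragment_of and last_segment are hypothetical names, not
functions from any existing library.

```c
#include <stddef.h>
#include <string.h>

/* Reading (3): the first '#' ends the URL and the rest is a fragment.
 * Returns a pointer to the fragment, or NULL if there is no '#'. */
static const char *fragment_of(const char *href)
{
    const char *hash = strchr(href, '#');
    return hash ? hash + 1 : NULL;
}

/* Reading (2): '#' is ordinary path data, so the file name is simply
 * whatever follows the last '/'. */
static const char *last_segment(const char *href)
{
    const char *slash = strrchr(href, '/');
    return slash ? slash + 1 : href;
}
```

Applied to the same string, fragment_of yields "x/yz" while last_segment
yields "yz"; a client cannot know which to use unless the spec says so.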
Dan