Date: Fri, 25 Mar 94 13:51:10 +0100
From: Tim Berners-Lee <timbl@ptpc00.cern.ch>
Message-Id: <9403251251.AA15177@ptpc00.cern.ch>
To: "Daniel W. Connolly" <connolly@hal.com>
Subject: Re: FTP syntax
> Date: Wed, 23 Mar 1994 09:46:00 -0600
> From: "Daniel W. Connolly" <connolly@hal.com>
> A question and a suggestion: Is there a consensus on whether
> <fpath> is opaque or can be parsed into pathname components?
> i.e. must a client interpret
> ftp://host/dir1/dir2/file
> as
> RETR /dir1/dir2/file
> or can it do something like
> CWD /dir1
> CWD dir2
> RETR file
My feeling was that people were much happier with the
/ representing a CWD, given that %2F can represent / within a file
name if needed, and anyway that guessing unix syntax is going to
allow a shortcut in a large number of cases. That is what I have put
in the spec, no objections so far.
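
To make the "/ means CWD" reading concrete, here is a minimal sketch in
Python. The function name ftp_commands is hypothetical, and dropping the
leading slash and issuing relative CWDs is an assumption of this sketch,
not something the spec text above settles. The one point it illustrates
is the order of operations: split on the literal / first, then decode each
piece, so that %2F survives as part of a single name.

```python
from urllib.parse import unquote

def ftp_commands(fpath):
    """Translate an ftp URL path into FTP commands, one CWD per '/'.

    Hypothetical helper: the leading '/' is dropped and every CWD is
    relative, which is an assumption of this sketch.
    """
    # Split on the literal '/' BEFORE decoding, so an encoded %2F
    # stays inside one name instead of becoming a path separator.
    segments = [unquote(s) for s in fpath.lstrip("/").split("/")]
    *dirs, name = segments
    return [f"CWD {d}" for d in dirs] + [f"RETR {name}"]

# ftp_commands("/dir1/dir2/file") -> ['CWD dir1', 'CWD dir2', 'RETR file']
# ftp_commands("/a%2Fb/c")        -> ['CWD a/b', 'RETR c']
```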
> Suggestion for mode information:
> ftp://host/dir1/dir2/file;mode=bin
Yes... I like it. When linking it to the FTP spec, I realise we
are using the wrong terminology though. Transfer *mode* is something
which can be (stream, block, etc)... what we are talking about is
FTP's *type*, which can be (A, I).
I feel as editor that terms should be consistent with whatever
we quote, so for FTP I would be very happy to consider
> ftp://host/dir1/dir2/file;type=a
I would even propose it. I think the syntax is much better.
Comments please. Cc me directly, as the list sometimes seems to take
5 hours to turn round.
> When I was exploring lots of possibilities in the test suite, I found
> several cases that motivated leaving * as a data character -- it might
> have special meaning within a scheme, but not across all schemes. I vote
> to keep it in the unreserved set.
The problem is that if it has a special non-opaque meaning
for *any* scheme, we have to reserve it now, or URL
encoders will lose the difference between * and %2A.
For your scheme above, we have the same problem with ;
in that if I specify a file whose name is really "foo.com;1"
as "foo.com%3B1" then any URI gateway/encoder is free to
decode it to foo.com;1 which screws the FTP URL syntax.
(I chose ;1 as it is valid VMS, but I could have used
foo.com;type=a).
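
A small Python sketch of that failure, under the assumption that ; is
NOT reserved and so an intermediary feels free to decode %3B. The
function name split_type and its behaviour are hypothetical, chosen only
to show how the literal name and the parameter form become
indistinguishable once a gateway has decoded the URL.

```python
from urllib.parse import unquote

def split_type(fpath):
    """Split a trailing ';type=x' off an FTP path, then decode the name.

    Hypothetical helper for illustration; not taken from any spec.
    """
    name, _, param = fpath.partition(";")
    ftype = param[len("type="):] if param.startswith("type=") else None
    return unquote(name), ftype

# A file whose real name is "foo.com;type=a", written with ; escaped:
literal = "foo.com%3Btype=a"

# Parsed as written, the name is preserved and no type is seen:
# split_type(literal)          -> ('foo.com;type=a', None)

# After a gateway decodes %3B, the name is silently truncated:
# split_type(unquote(literal)) -> ('foo.com', 'a')
```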
> For the test suite, I took the set of "data" or "unreserved" characters
> from the isAllowed[] table in HTParse.c. It says the only chars that
> mean the same when escaped and unescaped are:
>
> *-.0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
That table will be updated to the safe set when the URL
spec stabilises. It doesn't reflect the current spec.
In Amsterdam there was a very vocal group
insisting that almost anything be allowed, hardly anything
be reserved. Space even went in, to come out at Houston.
Your proposal is nice and will be useful for other things,
and in fact is used in the WAIS encoding, so I would suggest
then that ; and = be reserved for encoding attribute value
pairs in whatever scheme. But as the complexity of URLs
is finite and bounded (unlike URCs) the option of
using single character special delimiters (like ! : @ /)
is valid too. WG & net to decide. Editor to listen and write.
[What is missing from the net is a clapometer which is basically
why people still meet.]
Tim