Date: Thu, 25 Nov 93 12:18:48 CST
Message-Id: <9311251818.AA23506@boombox.micro.umn.edu>
From: "Mark P. McCahill" <mpm@boombox.micro.umn.edu>
To: uri@bunyip.com
Subject: a better ftp URL
Many people (including me) have complained that the ftp URL in the draft from
last summer needs to have more information so that an ftp client could reliably
de-reference the URL. I haven't heard any concrete proposals for fixing this
URL, so I'm proposing something.
The idea is that by including an access type, the client does not have to
interpret the path to guess which items are documents, which are
directories, and which need to be fetched in binary mode. Instead, documents,
directories, and ascii or binary transfer mode are explicitly specified. This
means that we don't need any hand waving about somehow mapping all file
names to Unix-style file names, and so can easily accommodate ftp servers running
on systems other than Unix.
I'm posting this right before being away from the net for a week so I won't
be around to discuss this for a while, but it is way past time to start
wrapping up the loose ends on the URL draft.
Here is a proposal/re-write of the ftp discussion in the URL draft.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
FTP
The ftp: prefix indicates a file or directory referenced on the
file system of the given host. The FTP protocol is used to retrieve
the resource. If the port number is specified it gives the port of
the FTP server. When no port number is specified you should use the
default FTP port number.
If a username and password are specified these are to be used to
authenticate with the FTP server. The default, if no username
or password is specified, is to use the anonymous ftp convention
(the user name "anonymous" with the user's mail address as the password).
The access-type is a one-character code that specifies whether the
item being referred to is a file or a directory and the access mode
(ascii, binary, tenex) to be used for retrieval.
Access types defined are:
    a    file to retrieve in ascii mode
    b    file to retrieve in binary mode
    t    file to retrieve in tenex mode
    d    a directory
The path immediately follows the access-type. By interpreting the access
type, an FTP client knows how to use the path and what ftp commands to
issue. For instance, if the access type is binary file, the FTP client
should issue the "binary" command before sending a get command.
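As a rough sketch of the paragraph above, the access type can be turned
into FTP protocol commands with a small lookup. The function and table
names are made up for illustration. The mapping of the user-level
"binary" and "tenex" commands to TYPE I and TYPE L 8, and of "get" to
RETR, follows common FTP client behaviour. Using LIST for a directory is
an assumption; a client might just as well CWD into it first.

```python
# Hypothetical mapping from the proposal's access-type codes to FTP
# protocol commands. The names here are illustrative, not from the draft.
TRANSFER_TYPE = {
    "a": "TYPE A",    # ascii
    "b": "TYPE I",    # binary (image)
    "t": "TYPE L 8",  # tenex (local byte size 8)
}


def ftp_commands(access_type, path):
    """Return the protocol commands a client might send for one item."""
    if access_type == "d":
        # A directory: ask for a listing rather than a file transfer
        # (assumption: the client wants the listing).
        return ["LIST " + path]
    if access_type not in TRANSFER_TYPE:
        raise ValueError("unknown access type: %r" % (access_type,))
    # A file: set the transfer type first, then retrieve.
    return [TRANSFER_TYPE[access_type], "RETR " + path]
```

For a binary file, ftp_commands("b", "/pub/foobar") gives the type
command followed by the retrieve, with no guessing based on the file name.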
url:ftp://[user [:password] @] host [:port]/access-type path
Examples:
The url for a binary file residing on the machine egghead.edu
in /pub/foobar is:
url:ftp://egghead.edu/b/pub/foobar
The directory :hard disk:slow-progress on the machine
snail.committee.com would have this url:
url:ftp://snail.committee.com/d:hard disk:slow-progress
The file debating-society which is to be retrieved in ascii using
the username nirvana with the password in-utero from the machine
seattle.edu on port 1234 would have this url:
url:ftp://nirvana:in-utero@seattle.edu:1234/adebating-society
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Notes:
Making the access-type explicit requires only one character because FTP
doesn't have that many different access types. By putting the character
immediately before the path we can avoid using a separator, since we always know
that the first character is the access-type.
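To show how little parsing this takes, here is a minimal sketch of a
parser for the proposed form. It assumes the conventional
user:password@host:port layout for the login part (with "@" and ":" as
delimiters) and assumes 21 as the default FTP port. The function name and
the returned tuple layout are hypothetical.

```python
def parse_ftp_url(url):
    """Split a proposed-style ftp URL into its parts.

    Returns (user, password, host, port, access_type, path).
    user and password are None when the anonymous convention applies.
    """
    prefix = "url:ftp://"
    if not url.startswith(prefix):
        raise ValueError("not an ftp URL: %r" % (url,))
    rest = url[len(prefix):]

    # Everything up to the first "/" is the login and host part.
    authority, slash, item = rest.partition("/")
    if not slash or not item:
        raise ValueError("missing access-type")

    user = password = None
    if "@" in authority:
        login, _, authority = authority.rpartition("@")
        user, colon, pw = login.partition(":")
        password = pw if colon else None

    host, colon, port_text = authority.partition(":")
    port = int(port_text) if colon else 21  # assumed default port

    # No separator needed: the first character is always the access type.
    access_type, path = item[0], item[1:]
    if access_type not in ("a", "b", "t", "d"):
        raise ValueError("unknown access type: %r" % (access_type,))
    return user, password, host, port, access_type, path
```

Note that the path is taken as-is after the first character, so a
non-Unix path such as ":hard disk:slow-progress" passes through without
any attempt to map it onto Unix-style names.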
To go for maximum backward compatibility, you could put the access type onto
the end of the path and use something like an encoded <tab> character (%09) as
a separator. If you also say that the default, when no access-type is specified,
is an ascii document, you have something that is sorta-backward compatible. The
trouble with this is that you have to escape the separator character if it occurs
in the path. For ease of parsing it seems most attractive to always have an
access-type specified and to avoid using a separator, so you don't have to escape
the separator in the path.
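For comparison, the backward-compatible alternative could be sketched
like this. The function name is hypothetical, and the sketch assumes the
access type is whatever follows the last %09 in the item.

```python
SEP = "%09"  # encoded <tab>, used as the separator in this alternative


def split_suffix_form(item):
    """Split 'path%09x' into (access_type, path).

    When no separator is present, default to an ascii document.
    """
    path, sep, access_type = item.rpartition(SEP)
    if not sep:
        return "a", item
    return access_type, path
```

Here split_suffix_form("pub/foobar") falls back to ascii, and
split_suffix_form("pub/foobar%09b") picks up the binary type. The catch
shows up with a file whose own name contains a tab followed by "b": sent
unescaped as "odd%09b" it is read as the file "odd" in binary mode, which
is exactly the escaping problem that the prefix form avoids.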
Mark P. McCahill
gopherspace engineer/University of Minnesota
mpm@boombox.micro.umn.edu
612 625 1300 (voice) 612 625 6817 (fax)