From: stripes@uunet.uu.net (Josh Osborne)
Message-Id: <9402152233.AAwdkk09898@rodan.UU.NET>
Subject: Re: [connolly@hal.com: Re: Identifying scripts by file extension?]
To: masinter@parc.xerox.com (Larry Masinter)
Date: Tue, 15 Feb 1994 17:33:29 -0500 (EST)
In-Reply-To: <94Feb15.104551pst.2732@golden.parc.xerox.com> from "Larry Masinter" at Feb 15, 94 10:45:40 am
[...]
> Currently, the only accepted way to designate a script gateway in a
> URL is to use some magic prefix such as "/cgi-bin/myscript/" to the
> path. Unfortunately, this co-exists very poorly with the rules
> for forming full URL's from relative ones (see below). What is the
> general opinion on using the file extension to identify gateways or
> scripts?
I don't see why we should attempt to figure out which URL's point to
something that will be executed and return data, and which URL's point
to 'local handles' (file names, message IDs...). The only possible
reason is to decide 'what to cache', but if you are fetching things with
a protocol that doesn't have an expire-time (or TTL) on data, then you
shouldn't cache any of it. If it does have an expire-time/TTL, the client
should not override it just because it's 'a script' (for example, the output
of a script that checks the temperature outside can be cached for a while;
the temperature isn't going to be noticeably different in 5 minutes!).
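To make the caching rule above concrete, here is a minimal sketch in Python. It assumes the protocol hands the client an optional TTL in seconds along with each response; the class name TTLCache, its methods, and the example URL are all hypothetical, made up for illustration. Note that the cache never looks at the URL to guess whether it names a script.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Cache a response only when the protocol supplied an expire-time/TTL."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        # The clock is injectable so the behaviour can be checked without waiting.
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def store(self, url: str, body: bytes, ttl_seconds: Optional[float]) -> bool:
        # No TTL from the protocol: don't cache any of it.
        if ttl_seconds is None or ttl_seconds <= 0:
            return False
        # The URL is treated as an opaque key; "script or not" is never asked.
        self._entries[url] = (self._clock() + ttl_seconds, body)
        return True

    def lookup(self, url: str) -> Optional[bytes]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._entries[url]
            return None
        return body
```

With this sketch, a temperature script that answers with a 300 second TTL is served from the cache for five minutes, while anything fetched over a protocol with no TTL is refetched every time.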
[...now on to the real problem...]
>Sorry... I haven't been around for a while... I find this interesting.
>I barked a lot _long_ ago about the fact that the URL spec said both
> (1) a url is of the form scheme:string
>where string is opaque, and
> (2) a url is of the form scheme://host/dir/dir/dir/file
> or just /dir/dir/dir/file
> or ../dir/file
> or ../dir/file#id
> or just #idd
> blah blah blah
>
>The grammar in the URL spec is highly ambiguous. For example, how
>does one parse the following?
>
> news:lkjlsdf#lksjdf@hal.com
A proposal: make '/' a separator; leading ..'s have more-or-less Unix
semantics, while any ..'s found later in the URL need not have Unix semantics.
URL's that don't currently start with /, and might start with ../, should be
changed to start with a / (news:/lkjlsdf#lksjdf@hal.com). That way a
client can detect cache hits (foo:../bar == foo:/baz/bar), and URL's
can stay (mostly) opaque.
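A rough sketch of how a client might apply this proposal, in Python. The function names resolve and same_resource, and the base URL foo:/baz/qux/index, are assumptions for illustration only; the proposal does not say how the base is obtained. Only '/' and leading ..'s are interpreted; everything else, including '#', '@', and any later ..'s, is left untouched.

```python
def resolve(base: str, ref: str) -> str:
    """Resolve a scheme-less reference against a base of the form scheme:/path."""
    scheme, sep, base_path = base.partition(":")
    if not sep or not base_path.startswith("/"):
        raise ValueError("base must look like scheme:/path")
    # A reference that already starts with '/' replaces the whole path.
    if ref.startswith("/"):
        return f"{scheme}:{ref}"
    # Directory of the base: every segment before the last '/'.
    segments = base_path.split("/")[1:-1]
    rest = ref
    # Only *leading* '..' segments get Unix-like treatment.
    while rest == ".." or rest.startswith("../"):
        if segments:
            segments.pop()  # at the root, '..' stays at the root, as on Unix
        rest = rest[3:]
    # Whatever remains is opaque, later '..' segments included.
    return scheme + ":/" + "/".join(segments + [rest])


def same_resource(base: str, a: str, b: str) -> bool:
    """Cache-hit test: two references match if they resolve to the same URL."""
    return resolve(base, a) == resolve(base, b)
```

For example, with the assumed base foo:/baz/qux/index, the references ../bar and /baz/bar resolve to the same string, so the client can treat the second fetch as a cache hit without understanding anything else about the foo: scheme.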
[...]
>I will again assert that we should use the SGML parser to do
>whatever parsing is going to be done on the client side, and make the
>results opaque to the client, thereby allowing the server to use _any_
>string it wants to encode info. We should also _allow_ a link to
>contain content-type information. (How else do I link to a postscript
>file on an ftp archive? By file extension? Come on!)
The problem with this is that URL's can be used in non-SGML environments.
(Where else are they used? Well, primarily bar napkins, or the equivalent,
but who is to say URL's won't outlive SGML? At least on the Internet.)
[...lots of suggestions and problems deleted...]
>But it's VERY important that we standardize on which parts of a URL
>are opaque, and which are not. The current strategy is breaking down.
Yes, it is. We should keep in mind:
* we want to encourage client caching
* we want to be able to use URL's to point to random types of info available
  through non-typed protocols (like FTP)
* we don't want to tie URL's too closely to other standards (SGML)
* we would like to preserve existing practice as much as possible, or
  provide a migration path. (This is, IMHO, the least important objective;
  assuming the Web is growing 400% every six months, a drastic change that
  makes existing use obsolete will, a year from now, have affected only one
  eighth of the Web population... really pissing them off, and possibly
  stopping growth of the Web for a while, though.)