Logical Protocols Was:Re: URL requrements: Structure in string

hallam@alws.cern.ch
Thu, 14 Apr 1994 11:50:09 +0200

Message-Id: <9404140950.AA04535@dxmint.cern.ch>

Well I certainly complained!

I think that the lack of hierarchical structure in the gopher URL is not
a minor deficiency but a very major one. From the standpoint of hypertext
it does not matter that much since relative URLs can always be
flattened out and the user will be none the wiser.

But there is more to URLs than just hypertext. As an engineering solution
to the problem of being able to read files from any source rather than just
the local filestore, the idea of extending the naming system throughout the
Internet is the best one. But the naming scheme is much more useful than that.

Consider

#include <stdio.h>
#include <http://info.cern.ch/headers/html_dtd.h>

Consider

cd news:soc.*
ls
cd soc.culture.british
ls
cd article@whatever

OK so I cheated in the above since news URLs are not hierarchical (yet).
But the point is that you can do a lot of very nice stuff if you regard http
as giving access to a typed filestore. You can do even nicer stuff if you
include ftp, wais, nntp and mail. It would seem to be a nice idea to be able
to do nice stuff with gopher too. But we absolutely have to have hierarchical
URLs.
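
To make the "typed filestore" idea concrete, here is a minimal Python sketch of
a cd command whose working directory is a URL. The class name UrlShell and the
"dtd" subdirectory are invented for illustration; the resolution itself is
done by the standard urllib.parse.urljoin, which only works because the URL
is hierarchical.

```python
from urllib.parse import urljoin


class UrlShell:
    """Toy shell whose working directory is a URL (illustrative only)."""

    def __init__(self, start):
        # Keep a trailing slash so the URL is treated as a directory.
        self.cwd = start if start.endswith("/") else start + "/"

    def cd(self, target):
        # Resolve the target against the current URL, the way a shell
        # resolves a relative path against the current directory.
        if not target.endswith("/"):
            target += "/"
        self.cwd = urljoin(self.cwd, target)
        return self.cwd


shell = UrlShell("http://info.cern.ch/headers")
print(shell.cd("dtd"))  # descend into a (hypothetical) subdirectory
print(shell.cd(".."))   # climb back up the hierarchy
```

A flat, non-hierarchical URL gives cd nothing to resolve against, which is the
whole complaint.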

Hierarchy is even more powerful when coupled with logical names, which allow
us to create abstract specifiers. On every VMS system in the universe I can
read the system directory using $DIR sys$system regardless of the physical
layout of the disk. It would be nice to be able to reference things like
manuals in the same way.

One solution to the extension problem is to take any undefined protocol
identifier and translate it as an environment variable. So if we are using a
prospero gateway (having no native prospero support) we define

prospero :== "http://gateway.cern.ch:56/PROSPERO"

then the URL prospero:/an_item may be translated as

http://gateway.cern.ch:56/PROSPERO/an_item

The most obvious starting point for this would be news which is typically bound
to nntp but which can be bound to http or even a file system in some
circumstances:-

news :== nntp://news.cern.ch:119/
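
As a rough sketch of this translation step, the following Python function
looks up an unrecognised protocol identifier in a table of bindings (the
process environment by default) and splices the rest of the URL onto the
binding. The function name translate and the NATIVE set of built-in protocols
are assumptions made for the example, not part of any existing client.

```python
import os

# Protocols this hypothetical client handles natively (an assumption).
NATIVE = {"http", "ftp", "file"}


def translate(url, env=os.environ):
    """Rewrite 'scheme:rest' using a binding for scheme, if one exists."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.lower() in NATIVE:
        return url  # nothing to translate
    binding = env.get(scheme)
    if binding is None:
        return url  # undefined and unbound: leave the URL untouched
    # Join binding and remainder with exactly one slash between them.
    return binding.rstrip("/") + "/" + rest.lstrip("/")


bindings = {
    "prospero": "http://gateway.cern.ch:56/PROSPERO",
    "news": "nntp://news.cern.ch:119/",
}
print(translate("prospero:/an_item", bindings))
# http://gateway.cern.ch:56/PROSPERO/an_item
print(translate("news:soc.culture.british", bindings))
# nntp://news.cern.ch:119/soc.culture.british
```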

If we have a resource that is not a proper protocol it is useful to distinguish
it so as to avoid collisions. Since the $ symbol is currently unused I suggest
it be applied for this purpose with the following semantics:

For the string $logical:

If the symbol $logical is defined then use it.
Otherwise take the symbol logical, translate it, and treat the result as a
file specification in the native filing system form, converting it into a
URL of the form file://whatever/file.
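
The two rules might be sketched in Python as below. This is one reading of
the semantics, not a definitive one: the table argument stands in for whatever
symbol table the process has (environment variables, VMS logical name tables),
and the name resolve_logical and the sample entries are invented for the
example.

```python
from pathlib import Path


def resolve_logical(name, table):
    """Resolve a '$logical' name using the two rules described above."""
    if not name.startswith("$"):
        raise ValueError("expected a name of the form $logical")
    # Rule 1: a definition for the symbol '$logical' itself wins.
    if name in table:
        return table[name]
    # Rule 2: otherwise translate the plain symbol 'logical', treat the
    # result as a native file specification and express it as a file: URL.
    native = table.get(name[1:])
    if native is None:
        raise KeyError(name)
    return Path(native).absolute().as_uri()


# Hypothetical symbol table for illustration.
table = {
    "$news": "nntp://news.cern.ch:119/",
    "manuals": "/usr/share/doc",
}
print(resolve_logical("$news", table))     # taken directly from the table
print(resolve_logical("$manuals", table))  # converted to a file: URL
```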

The latter system is most useful on VMS, where logical names are part of the
file system anyway, so it is common to refer to logical names, for example:

$ show log sys$examples
"SYS$EXAMPLES" = "SYS$SYSROOT:[SYSHLP.EXAMPLES]" (LNM$SYSTEM_TABLE)
$ dir sys$examples

Directory SYS$COMMON:[SYSHLP.EXAMPLES]

ADDUSER.COM;1 ALIGN_FAULT_DEMO.C;1 ALPHA_LOGGER.C;1
AUDSRV_LISTENER.MAR;1 BACKUSER.COM;1 CLASS.C;1
....

So I would like to be able to refer to the URL $SYS$EXAMPLES:adduser.com
and get the file ADDUSER.COM;1 (NB VMS is caseless).

Of course such URLs are process specific: try them in another environment
and you get something else. Not much use, eh? Not so. What we gain is
aliasing of URLs, so that we can abstract the filestore in a process-specific
way. Remember that a URL is a LOCATOR; the best locator need not be the same
from every starting point. Yes, I know this starts to do other UR* tasks, but
so what? This can be implemented in very little time and will still be
massively useful even with URQ's, whois++++, CORBA and UJB's.

What I would like to see is to progress the URL idea as far as possible,
taking the hypertext as merely one application. It goes without saying that
there should be guidelines for writing hypertext such as:

1) In a hypertext document the form news should always be used in preference
to nntp://server:119/
2) In public hypertext documents logical redirections should only be used if:
a) the redirection is a standard form or
b) the server can provide a translation for the logical to make
it transparent in some manner.

In the case of hypertext we might specify the `logicals' in HTML. Or they may
be attached in content type information, or one might have to contact the
server to get the translation it thinks best for public use.

BTW this scheme solves a major headache at present: a public document that
links to private data. How to stop private links being followed? If we have a
logical protocol $private we can have the two bindings:-

$private :== NULL:// public use
$private :== HTTP://secret.server.ch/ private use

In the document the links have the form:-

<A href=$private:/a_secret_doc>This is private</A>

The NULL protocol should always bring up a response `not found' in reply
to any URL.
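
A small Python sketch of how the two bindings behave. The names PUBLIC,
PRIVATE, expand and fetch are made up for the example, and fetch only pretends
to dispatch; the one behaviour taken from the proposal is that NULL answers
`not found' for everything.

```python
# The two bindings from the text, one table per audience.
PUBLIC = {"$private": "NULL://"}
PRIVATE = {"$private": "HTTP://secret.server.ch/"}


def expand(href, bindings):
    """Replace a leading logical protocol with its binding, if bound."""
    logical, sep, rest = href.partition(":")
    target = bindings.get(logical)
    if not sep or target is None:
        return href
    # Bindings are assumed to end in '/', so drop the leading one here.
    return target + rest.lstrip("/")


def fetch(url):
    """Stand-in for a client: NULL always answers 'not found'."""
    scheme = url.partition(":")[0].lower()
    if scheme == "null":
        return "not found"
    return "would fetch " + url  # a real client would dispatch here


link = "$private:/a_secret_doc"
print(fetch(expand(link, PUBLIC)))
# not found
print(fetch(expand(link, PRIVATE)))
# would fetch HTTP://secret.server.ch/a_secret_doc
```

A browser that knows a link resolves to NULL could also simply decline to
render it as an anchor.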

Many people currently apply this scheme using FILE, which has the
disadvantage that text is labeled as an anchor when it cannot be followed
- bush league stuff.

Anyway, this is why I can't understand why gopher wants to be separate. If
they want to do this then so be it; it's just that this stuff will all
collapse when a gopher string is fed in. There is no practical way of
treating gopher differently in such a scheme, so all that will happen is
that pseudo-hierarchical protocols will break.

Phill Hallam-Baker