URLs, URIs, and references

Larry Masinter (masinter@parc.xerox.com)
Sun, 23 May 1993 22:06:19 PDT

To: uri@bunyip.com
In-Reply-To: winograd@interval.com's message of Sat, 22 May 1993 16:50:15 -0700 <93May22.165019pdt.2741@golden.parc.xerox.com>
Subject: URLs, URIs, and references
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <93May23.220629pdt.2741@golden.parc.xerox.com>
Date: Sun, 23 May 1993 22:06:19 PDT

Maybe it would avoid some of the politics if, instead of saying 'URLs
should contain types', we said:

references to data should be typed

or perhaps, in a more object oriented way:

references to data should support not only the 'fetch
this thing' operation, but also an operation of 'what is
the (content-) type of this thing'.
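As a rough sketch of the object-oriented phrasing above, a reference could expose two operations rather than one. The names `TypedReference`, `InMemoryReference`, and `describe` are hypothetical, chosen only for illustration; nothing here is an existing API.

```python
from dataclasses import dataclass
from typing import Protocol


class TypedReference(Protocol):
    """A reference that supports both 'fetch' and 'what is your type'."""

    def fetch(self) -> bytes: ...

    def content_type(self) -> str: ...


@dataclass
class InMemoryReference:
    """Toy reference backed by bytes held in memory (illustration only)."""

    data: bytes
    declared_type: str

    def fetch(self) -> bytes:
        return self.data

    def content_type(self) -> str:
        return self.declared_type


def describe(ref: TypedReference) -> str:
    """Ask the reference for its type, then fetch it and report the size."""
    return f"{ref.content_type()} ({len(ref.fetch())} bytes)"


ref = InMemoryReference(b"hello", "text/plain")
print(describe(ref))
```

The point of the sketch is only that the type question is answered by the reference itself, not by whichever protocol happens to carry it.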

Current URL syntax contains an implicit method for discovering the
types of most references:

http urls are presumed to be HTML in http1, and have a
mechanism for discovering/negotiating type in http2.

gopher urls contain a little code at the beginning which
indicates the type.

telnet urls are implicitly 'interactive telnet sessions',
but no subtypes are allowed (e.g., things like negotiating
terminal types are left to be done after you've connected).

ftp urls aren't really typed, but most programs assume
they are by looking at the end part of the file name and
guessing the file type by the extension.
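The per-scheme conventions listed above could be collected into a single lookup, roughly as follows. This is a sketch under assumptions: the gopher table is partial, the label used for a gopher menu and the `TELNET_SESSION` constant are made up for illustration, and the host names are placeholders.

```python
import mimetypes
from typing import Optional
from urllib.parse import urlsplit

# Partial, illustrative mapping from gopher item-type codes to content types.
GOPHER_TYPES = {
    "0": "text/plain",                 # plain text file
    "1": "application/x-gopher-menu",  # made-up label for a menu/directory
    "9": "application/octet-stream",   # binary file
    "g": "image/gif",                  # GIF image
}

# Not a real content type; just a label for "this is an interactive session".
TELNET_SESSION = "interactive-telnet-session"


def implied_type(url: str) -> Optional[str]:
    """Return the type each scheme's ad-hoc convention implies, if any."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme == "http":
        # Presumed HTML, as described for the first version of http.
        return "text/html"
    if scheme == "gopher":
        # The type code is the first character after the leading slash.
        return GOPHER_TYPES.get(parts.path[1:2])
    if scheme == "telnet":
        return TELNET_SESSION
    if scheme == "ftp":
        # No real typing: guess from the file name extension.
        guessed, _encoding = mimetypes.guess_type(parts.path)
        return guessed
    return None


print(implied_type("gopher://gopher.example.org/0/docs/readme"))
print(implied_type("ftp://ftp.example.org/pub/readme.txt"))
print(implied_type("ftp://ftp.example.org/pub/recording"))
```

The last call shows the gap: an FTP file without a recognisable extension has no implied type at all.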

Maybe this adhoc-ery is OK, but it bothers me a little, since every
protocol has its own ad-hoc way of specifying type, and there's no way
to supply type information along with a reference, e.g., 'ftp this
file and treat it as audio even though it isn't named something.au'.

So, (a) if I *wanted* to supply type information along with an FTP
reference, how could I do it? Could we somehow allow auxiliary type
information in the URL syntax? And (b) perhaps this is a more
reasonable way to specify type codes for gopher data? E.g., go ahead and
expand the gopher 0 1 W I S etc. types into their corresponding MIME
content-types?

I think I'd go along with (a) without (b): leave the current ad-hoc
typing mechanisms in place, but at least allow something else.
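To make option (a) concrete, here is one possible shape for it. The `;content-type=` suffix is purely hypothetical, invented for this sketch and not part of any URL specification. An explicit type wins when present; otherwise the existing ad-hoc guess is left in place.

```python
import mimetypes
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical marker for auxiliary type information; not a real URL syntax.
MARKER = ";content-type="


def split_typed_reference(ref: str) -> Tuple[str, Optional[str]]:
    """Split a reference into (plain URL, declared type or None)."""
    base, sep, declared = ref.rpartition(MARKER)
    if not sep:
        # Marker absent: rpartition puts the whole string in the last slot.
        return ref, None
    return base, declared


def effective_type(ref: str) -> Optional[str]:
    """Prefer the declared type; fall back to guessing from the file name."""
    url, declared = split_typed_reference(ref)
    if declared:
        return declared
    guessed, _encoding = mimetypes.guess_type(urlsplit(url).path)
    return guessed


# 'ftp this file and treat it as audio even though it isn't named something.au'
print(effective_type("ftp://ftp.example.org/pub/recording;content-type=audio/basic"))
print(effective_type("ftp://ftp.example.org/pub/readme.txt"))
```

Any real proposal would have to settle how such a suffix interacts with characters already meaningful in each scheme; the sketch sidesteps that by splitting on the last occurrence of the marker.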

================================================================
Well, this was 'in-reply-to' Terry's message, but I haven't really
replied to it. The only problem with associating 'dates' with
references is that there are several different dates you might really
be interested in:

(a) the date the referenced item was created... e.g., the write date
of the FTP archive, gopher entry, etc. This isn't as good as a
URN, but does allow some amount of caching. Some URLs don't have
'creation date', e.g., telnet urls, or automatically created ones.

(b) the date the reference itself was created... e.g., each time a
menu or search returns a set of URLs, timestamp them then. At least
you'll know when the data *was* valid.

(c) the date the reference expires. This would be nice -- you'd know
when you shouldn't bother following a link any more. Unfortunately,
there's usually no way to know this!

In Terry's example, you really wanted to know (c): at what point do
file references in FAQ files become obsolete? Knowing (b) -- the date
of the FAQ file -- could give you a hint as to *why* the referenced
data wasn't there anymore, but it doesn't really keep you from
looking.
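The three dates (a), (b), and (c) can be kept apart by giving each its own field. The sketch below uses hypothetical names (`DatedReference`, `worth_following`, `reference_age_days`) and a placeholder URL, and assumes any of the three dates may simply be unknown.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DatedReference:
    url: str
    item_created: Optional[datetime] = None       # (a) write date of the item
    reference_created: Optional[datetime] = None  # (b) when the reference was made
    expires: Optional[datetime] = None            # (c) usually unknown


def worth_following(ref: DatedReference, now: datetime) -> bool:
    """Only (c) can tell you not to bother; without it you still have to look."""
    return ref.expires is None or now < ref.expires


def reference_age_days(ref: DatedReference, now: datetime) -> Optional[int]:
    """(b) gives a hint about staleness, nothing more."""
    if ref.reference_created is None:
        return None
    return (now - ref.reference_created).days


# A file reference taken from a FAQ: only the FAQ's own date, (b), is known.
faq_entry = DatedReference(
    url="ftp://ftp.example.org/pub/tools.tar",
    reference_created=datetime(1993, 1, 10, tzinfo=timezone.utc),
)
now = datetime(1993, 5, 23, tzinfo=timezone.utc)
print(worth_following(faq_entry, now), reference_age_days(faq_entry, now))
```

With only (b) available, the reference is still 'worth following' no matter how old it is, which is the limitation described above.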

Anyway, I'm a little more wary of supplying dates with URLs because of
the ambiguity of what the date represents, and what you expect to do
with it.