Message-Id: <9404300556.AA14076@expresso.bunyip.com>
From: Peter Deutsch <peterd@bunyip.com>
Date: Sat, 30 Apr 1994 01:56:22 -0400
In-Reply-To: Mitra's message as of Apr 29, 18:44
To: Mitra <mitra@pandora.sf.ca.us>
Subject: Possible Cool idea...
And the quest for implementation detail continues...
[ Mitra wrote: ]
> On Fri, 29 Apr 1994, Peter Deutsch wrote:
. . .
> > . . . I think you will either have agreed upon
> > this out of band, or have a mechanism for looking it up on
> > the fly (eg. a subroutine that understands the TXT record
> > technique, or whatever).
>
> Peter - that's the problem, there is no out-of-band unless we define a
> protocol for it. The client receives a URN, it needs a deterministic set
> of steps to go through to find a set of URLs neither of your alternatives
> "agreed upon this out of band" or "looking it up on the fly" are sets of
> action a client can take - I think, that my message gave an example of
> what a client would have to go through to do this.
By "out of band" I'm merely trying to say "the URI group
has chosen a single protocol" or "there's an entry in the
/etc/services file" or something like that. Of course, it
has to be deterministic to be able to code it but that's
implementation, not architecture.
Architecturally, if we do agree on the protocol a priori
there is no network probe to find it (which I gather is
what you want, yes?). If we don't, then I claim that there
need be at most a single probe to find it, assuming it is
not cached or otherwise available and the mechanism we
come up with puts this info in a sensible place. That's
all I care about, although I understand your need for more
detail before you can code (so see my new proposal below...).
. . .
> > /* ------------------------------------------------------------*\
> >
> > -- subroutine to dereference a URN --
> >
> > Assumes the following routines:
> >
> > resolver_protocol_to_use() - returns ID of protocol to use
> > for this URN server, or ERROR
>
> How - this involves network connections unless the URN contains the
> protocol, (which breaks other rules, like URN's having a longevity greater
> than the systems to deliver them). . . .
Remember, what we're talking about here is choosing the
protocol to use when calling up a server for a specific
URN->URL conversion, not selecting the server for this
conversion (which seems to be pretty well handled by your
proposal to use DNS in assigning subnaming authorities).
Choosing a protocol for a particular authority is
presumably something we do less often than the URN->URL
conversion itself and is a better candidate for either
registration or caching, since presumably it won't change
all that often. In any case, it is a separate task and can
be spec'ed separately.
*** Here's a possible cool idea... ***
We could register a "URN protocol selection" port and then
define that a URN server must respond to any packets coming
in on that port with the names (in standard URL format) of
all available protocols for that server, one per line.
This would eliminate this particular issue once and for
all, and still give us some flexibility to experiment with
protocols at this point. If we use UDP for this, it
shouldn't be all that expensive an operation and such a
query need be done only once for each server (modulo TTLs
on each protocol, which URLs don't currently carry. Pity...)
Anyways, this technique allows the recipient to send the
query in any one of the formats supported by the URN
resolver. Presumably the exact form of a query in each
case would still need to be specified but for WHOIS++ it
would be trivial (just send "URN=<whatever>" and you're
done). And if ever we do get agreement on a single
protocol, we simply skip this step in the future.
*** end of possible cool idea ***
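
A minimal sketch of the client side of this idea, in Python.
Only the use of UDP and the "one URL per line" reply come from
the proposal above. The port number, the empty probe datagram,
the 4096-byte reply limit and the function names are all
assumptions made for illustration; nothing has been registered
or agreed on the list.

```python
import socket
from urllib.parse import urlsplit

# Hypothetical port number; no "URN protocol selection" port is registered.
URN_SELECT_PORT = 4999


def parse_protocol_reply(reply: bytes) -> list[tuple[str, str]]:
    """Turn a reply of one URL per line into (scheme, url) pairs."""
    offers = []
    for line in reply.decode("ascii", errors="replace").splitlines():
        url = line.strip()
        if not url:
            continue
        scheme = urlsplit(url).scheme.lower()
        if scheme:  # skip lines that do not look like URLs
            offers.append((scheme, url))
    return offers


def probe_protocols(host: str, timeout: float = 2.0) -> list[tuple[str, str]]:
    """Send one datagram to the selection port and parse the single reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # Assumption: any datagram, even an empty one, triggers the reply.
        sock.sendto(b"", (host, URN_SELECT_PORT))
        reply, _ = sock.recvfrom(4096)
    return parse_protocol_reply(reply)
```

A client would call probe_protocols() once per server and keep
the result; parse_protocol_reply() is split out so the reply
format can be exercised without a network.
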
There are obviously details missing here, but the proposal
does address the "choose a protocol" part of the problem
and it could be written up in a single night (but not by me
and not tonight. Let's get some reaction to the idea
first...)
In practice, this all means that I'd rather fetch the name
of the protocol to use for the actual query from the net,
along with the appropriate IP address, because this allows
me to change either of these later without changing URNs
or worrying about grandfathering. If there _is_ a TTL
associated with the protocol part I'd be
more than happy since I could then safely cache the info
with confidence, but that is icing on the cake and we need
not require it at this point.
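
To make the caching point concrete, here is a rough sketch of a
cache that would honour such a TTL if one existed. It is purely
hypothetical: URLs carry no TTL today, so the ttl_seconds value,
the class name and its methods are assumptions, and where the
TTL would come from is left open.

```python
import time


class ProtocolCache:
    """Remember each server's protocol list until its TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the expiry logic can be tested
        self._entries = {}   # host -> (expiry time, list of protocol URLs)

    def put(self, host, urls, ttl_seconds):
        """Store the protocol URLs for host for ttl_seconds."""
        self._entries[host] = (self._clock() + ttl_seconds, list(urls))

    def get(self, host):
        """Return the cached list, or None if it is missing or expired."""
        entry = self._entries.get(host)
        if entry is None:
            return None
        expiry, urls = entry
        if self._clock() >= expiry:
            del self._entries[host]  # stale: force a fresh probe
            return None
        return urls
```

A None from get() is the signal to probe the server again, so
an expired entry costs one extra query and nothing more.
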
In any case, if we go with these two ideas (a UDP-based
URN protocol selection port and a DNS-based naming
authority) then if at some point the protocol changes or a
server goes away it doesn't break the URN, only the link
that made it resolvable, since you then wouldn't know what
language to speak or where to go to ask the question at
that point. Still, since resolution was never guaranteed,
only desirable, this is only sad, not the end of the
world. And if you _are_ fetching the protocol and address
from the net, then at least you know where things break
and can take some steps to fix it.
I think we're tending towards an architecture here that
has the properties we need and I don't currently see any
insurmountable obstacles to building it. If you do, please
keep posting and perhaps one of us will see the light...
> . . . Also - how do you decide to use TXT
> records or gethostbyname.
We agree on this mailing list on what technique we're going
to use and strike repeatedly with baseball bats anyone who
disagrees with us... ;-)
Seriously, we need to keep in mind that we have two tasks
here. If you're asking about how to choose a protocol to
speak to the URN server, that should be decided by
something like a vote on this list for a single protocol,
a dedicated URN port serving URLs of protocols and servers,
a structured DNS TXT record, or even having the server
print the URLs of the servers it supports when you first
connect to it. Put simply, we just agree on how to tell the
client what to use and be done with it.
Then, to find the IP address of the URN server to use, you
use gethostbyname() on your cleverly constructed naming
authority part of the URN (or on the host part of that URL
I gave you, if you use my URN port idea). At that point
you have all you need, so you send the opaque string in the
specified format and wait for the response. The protocol
used tells you how to interpret the result and you're
done. If ever the world votes on a single protocol, stop
using the first part and it all still works, only faster.
And if you can't handle the specified protocols you will
always have the option of either voting here for a single
protocol (in which case the problem of determining which
to use goes away) or of sending the URN to a proxy server
(which you think will be too slow, but it _could_ be an
option for some when all else fails).
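
The decision described above (speak a protocol the server
offers, else fall back to a proxy, else give up) could be
sketched as follows. The SUPPORTED list, the function names
and the (mode, scheme, address) return shape are assumptions
for illustration. The resolver is passed in so that it can be
socket.gethostbyname in real use and a stub elsewhere.

```python
import socket
from urllib.parse import urlsplit

# Protocols this hypothetical client can speak itself.
SUPPORTED = ("whois++", "gopher")


def choose_offer(offered_urls, supported=SUPPORTED):
    """Return the first offered URL whose scheme we support, else None."""
    for url in offered_urls:
        if urlsplit(url).scheme.lower() in supported:
            return url
    return None


def plan_resolution(offered_urls, authority_host, proxy_host=None,
                    resolve=socket.gethostbyname):
    """Decide who to ask: the URN server directly, a proxy, or nobody.

    Returns (mode, scheme, ip_address); mode is "direct", "proxy" or "fail".
    """
    url = choose_offer(offered_urls)
    if url is not None:
        parts = urlsplit(url)
        # Use the host named in the offered URL, falling back to the
        # naming authority's own host name if the URL has none.
        host = parts.hostname or authority_host
        return ("direct", parts.scheme.lower(), resolve(host))
    if proxy_host is not None:
        return ("proxy", None, resolve(proxy_host))
    return ("fail", None, None)
```

If a single protocol is ever agreed, offered_urls becomes a
constant and only the resolve() call remains.
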
. . .
> Huh ? Where is the "dereference query" in the above stuff.
In either the "process_query()" routine or the
"proxy_srvr()" routine. The actual mechanism would depend
upon the protocol used, but if you assume WHOIS++ and
assume that you're processing it yourself then in
process_query() you would connect and then send in the
opaque string. The response would be a list of URLs
returned in template format. The process_query() routine
would parse this and put it into the appropriate URN
structure and you're done.
If you select "proxy_srvr()" then you call your proxy
server, using whatever protocol you use for this. Once you
connect to the proxy server, you feed it the URN and get
back either a list of URLs or an error code. If error,
return NULL, else parse the list, fill in the URN
structure and return.
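
As a rough illustration of the WHOIS++ case, the sketch below
builds the "URN=<whatever>" query and pulls URLs out of a
reply. The CRLF terminator, the "URL: <value>" line layout and
all function names are assumptions; the real WHOIS++ template
syntax is not reproduced here. The send_query argument stands
in for either the direct connection or the proxy call.

```python
def build_whoispp_query(opaque: str) -> bytes:
    """Format the opaque string as the one-line query 'URN=<opaque>'."""
    # Assumption: the query is terminated by CRLF.
    return f"URN={opaque}\r\n".encode("ascii")


def parse_url_lines(response: str) -> list[str]:
    """Collect the values of 'URL:' lines and ignore everything else.

    Assumption: each URL arrives on its own 'URL: <value>' line.
    """
    urls = []
    for line in response.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name.strip().upper() == "URL" and value.strip():
            urls.append(value.strip())
    return urls


def dereference(opaque, send_query):
    """Return the list of URLs for a URN, or None on error.

    send_query is a caller-supplied function (bytes -> str) that talks
    to either the URN server or a proxy; how it connects is left open.
    """
    try:
        response = send_query(build_whoispp_query(opaque))
    except OSError:
        return None
    urls = parse_url_lines(response)
    return urls or None
```

Returning None both for a network error and for an empty list
mirrors the "if error, return NULL" behaviour described above.
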
Does this help? Is this still looking too obtuse?
. . .
> Unfortunately - I dont believe proxy servers are adequate for this task
> at all, the extra time delays involved in going through a intermediate
> server, which has to go to the real server, are unacceptable in a network
> where these can take of the order of seconds - and that this isnt going
> to even return the document, only figure out where to get it and what
> choices you have.
Perhaps you're right that proxy servers are not
appropriate for _your_ application, but I think there
_will_ be cases where they work fine. I think there are
also a variety of engineering shortcuts we could look at
to speed response here. And of course, barring all else,
we could just get reasonable, agree upon a single protocol
now and eliminate the need for proxies at all. I just don't
think this is realistic at this point, given my experience
with the URL debate. I believe we want the WWW people
using these things and thus we'd better be prepared to
accommodate whatever they want to do completely unchanged.
Otherwise, we'll see something built in parallel, as is
now happening with URLs and Tim's URIs (bad Tim, no
doughnut... :-( ).
I hate doing this, but I would rather not have one thing
working for Tim's users and another for the rest of us, so
I think we should try to be as accommodating as possible
right now, at least until we've tried things out a bit. If
we're lucky and the one best protocol becomes obvious and
accepted by all, we can eliminate the protocol selection
step and the rest still works just the same. If not, then
whoever doesn't use the most popular protocol will be
paying an added price but it should still work. So, we go
ahead and experiment a bit without commitment to any one
solution at this point.
. . .
> > I assume that if we offer you a frontend onto a generic
> > service as part of "Son of Archie(tm)" presumably we can
> > address this particular concern. It may speak Gopher, HTTP
> > or WHOIS++ (or all three... :-) In any event, I don't see
> > this as something that every client needs to be worried
> > about in the long run. Either one protocol will be chosen,
> > or proxies will exist. I suspect we'll see if I'm being
> > too optimistic about this soon enough...
>
> If servers have to implement a bunch of protocols, then that is slightly
> better, but its a complete waste of time in my opinion.
Then don't do it yourself and don't use the feature
elsewhere. Personally, I don't really think we'll have all
that many protocols in the end, although I do suspect we'll
have a few to start off with and I think we should be
flexible at this point. I think we need to allow Darwinian
selection to work its magic here. Adding an optional
protocol selection step seems to do that while being
relatively painless to turn off in the future.
> I really believe that we are making a simple task unneccessarily complex
> at the protocol level. The smarts should be going into the backend
> resolution, not building clients with multiple protocols, caching, and
> proxy servers and all that.
The idea is to allow us some architectural flexibility so
we can experiment and see what works in practice without
alienating any part of the development community or
locking in too early to a bad choice. If what comes out of
this phase is a single protocol, without such frills as
caching or proxy servers, great. But I think we do need
multiple protocol support at least in the beginning, and
we do need multiple naming authority support. My
experience with the protocol zealots in the URL debate is
that they won't play if we don't give them their own space,
so I argue we should accommodate them. Can you see another
way to get everyone to agree?
> This is really a very simple process, in my SIMPLE scenario a client only
> needs to be able to call gethostbyname, and send a simple one-packet
> query to a server (which can be as smart/complex as it likes). I dont
> care which protocol we use, I'm committed to implementing it in
> everything I write, but not unless we can settle on ONE - I'd rather
> stick with URL's and gain in speed and client-size.
And with my new proposal you will still have this option,
once a single protocol is selected. Until then, there's a
bit more work to do, if you want to support more than one
protocol. If you choose not to do so, then just try the
one you support and see if it works. Programmers making
choices is what's going to determine the outcome of the
Darwinian selection process, anyways...
- peterd
--
-----------------------------------------------------------------------------
"What do thay got, a whole lot of sand? We got a hot crustacean band!
Each little clam here, know how to jam here! Under the Sea!"
-----------------------------------------------------------------------------