Message-Id: <9310171752.AA12558@rs042.scic.intel.com>
Date: Sun, 17 Oct 1993 10:48:31 -0800
To: Kevin Gamiel <kgamiel@vinca.cnidr.org>
From: kevin@scic.intel.com (Kevin Altis)
Subject: Re: The URN: wrapper and URLs...

At 11:45 AM 10/17/93 -0400, Kevin Gamiel wrote:
>> I don't see how prefixing with URL: makes automatic marking particularly
>> easier. Any application scanning through text looking for URLs or URNs is
>> going to have to make a lot of assumptions about where the URL actually
>> starts and ends without a wrapper like < and >. In your example
>
>We hope to have these wrappers...
Great, I think we need them in free text. I wasn't arguing against wrappers.
>> "URL:http://jhm.ccs.neu.edu:7043/" the scanning algorithm still needs to
>> parse the text through the "://" before it knows it has the beginning of an
>> URL, otherwise some text fragment like "valid forms are URL:http, URL:ftp,
>> etc." might foul up the parser; I'm sure we can come up with nastier
>> examples. It seems just as reasonable that the parser should know about
>> specific URL forms "http:, gopher:, ftp:..." rather than just "URL:". The
>
>I don't find this reasonable at all. I want to send a robot off to fetch
>every URL it can find in every text file on the net, regardless of the
>protocol field indicated by "http:, gopher:, ...", and hand the list off
>to a server whose job it is to fetch all those items and see if they have
>what I'm looking for. I shouldn't have to know _every_ protocol supported.
>Ok, bad example, but you get the point.
The parser doesn't necessarily "know" about specific URL forms; that was bad
wording on my part. What I meant was that adding a prefix of URL: doesn't
help in detecting the URLs, even for a human reader. What you have to look
for now is "://"; then you treat what's before the colon as the scheme type
(ftp, gopher, http, news not nntp) and what's after the "//" as the rest of
the locator. If we have some method of bracketing the URL, such as < and >,
which I'm in favor of, then the URL: prefix still doesn't do you any good,
since you'll still be looking for the brackets and the "://".
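
As a rough sketch of the scan described above (not code from this thread),
the following Python looks for "://" and splits each hit into a scheme and
the rest of the locator. The name find_urls and the rules for where a
scheme and a locator end are assumptions made for illustration only.

```python
import re

# Assumptions (not from the thread): a scheme is a run of ASCII letters, and
# a locator runs until whitespace, a double quote, or an angle bracket.
CANDIDATE = re.compile(r'\b([A-Za-z]+)://([^\s<>"]+)')

def find_urls(text):
    """Return (scheme, rest) for every 'scheme://rest' found in text."""
    return [(m.group(1).lower(), m.group(2)) for m in CANDIDATE.finditer(text)]

if __name__ == "__main__":
    # "URL:http" alone is not followed by "://", so nothing is reported.
    print(find_urls("valid forms are URL:http, URL:ftp, etc."))
    # The URL: prefix is simply skipped; the "://" does the real work.
    print(find_urls("try URL:http://jhm.ccs.neu.edu:7043/ today"))
    # Brackets give a clean end; a bare URL picks up the trailing period.
    print(find_urls("try <http://jhm.ccs.neu.edu:7043/>."))
    print(find_urls("try http://jhm.ccs.neu.edu:7043/."))
```

In this sketch the scan gives the same answer with or without the URL:
prefix, while the brackets are what keep sentence punctuation out of the
locator.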
Your server will then have to know about _every_ protocol supported in
order to fetch the items for the robot, and again the URL: prefix doesn't
help. You don't have to know every protocol, but the code doing the work
does, or it has to be able to pass the URL on to some other code that does.
Right now on the Web, the client software actually knows about more
protocols than any of the common servers.
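
To make the point above concrete, here is a minimal Python sketch of code
that either handles a scheme itself or hands it off. The handler functions,
the HANDLERS table, and dispatch are all hypothetical names; the handlers
only return a description instead of doing any network I/O.

```python
# Hypothetical stand-ins for real protocol code; nothing is fetched.
def fetch_http(rest):
    return "http fetch of " + rest

def fetch_ftp(rest):
    return "ftp fetch of " + rest

def fetch_gopher(rest):
    return "gopher fetch of " + rest

# The code doing the work needs one entry per scheme it can handle.
HANDLERS = {"http": fetch_http, "ftp": fetch_ftp, "gopher": fetch_gopher}

def dispatch(url):
    """Route a 'scheme://rest' locator to its handler, or return None."""
    # A URL: prefix carries no routing information, so it is just dropped.
    if url.startswith("URL:"):
        url = url[len("URL:"):]
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("no '://' found in " + repr(url))
    handler = HANDLERS.get(scheme.lower())
    if handler is None:
        return None  # unknown scheme: pass it on to some other code
    return handler(rest)

if __name__ == "__main__":
    print(dispatch("http://jhm.ccs.neu.edu:7043/"))
    print(dispatch("URL:http://jhm.ccs.neu.edu:7043/"))
    print(dispatch("xyz://jhm.ccs.neu.edu/"))  # "xyz" is a made-up scheme
```

Under these assumptions the prefixed and unprefixed forms route
identically, and an unknown scheme falls through regardless of the prefix.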
>I can't believe the traffic that prefixes are causing. One of the
>strongest arguments for "consensus" on URLs in Amsterdam was the fact that
>URLs needed to be easily human distinguishable "...from the back of an
>envelope". The very people who argued this have abandoned the argument a
>level higher.
>
>Besides, it's very simple. If there were no working URL code, would anyone
>object to having the URL: prefix? Heck no.
I would. Consider email addresses on the Internet. Do we have to put a
label in front of the email address for either a human or code to detect it?
The key element is the @ in name@domain. If there are labels such as To:,
From:, or Cc:, you can find the email addresses, but you can also find the
address in somebody's .signature or in the body of a message without
everyone writing email:kevin@ssd.intel.com.
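
The email analogy above can be sketched the same way: a scan keyed on the
@ finds addresses in headers, signatures, and body text alike. The pattern
below is a deliberately simplified assumption, far looser than the real
address grammar, and find_addresses is a hypothetical name.

```python
import re

# Simplified assumption: name@domain, where the domain has at least one dot.
# This is an illustration, not the full mail address syntax.
ADDRESS = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def find_addresses(text):
    """Return every name@domain candidate found in text, in order."""
    return ADDRESS.findall(text)

if __name__ == "__main__":
    message = (
        "To: Kevin Gamiel <kgamiel@vinca.cnidr.org>\n"
        "\n"
        "You can reach me at kevin@ssd.intel.com.\n"
    )
    print(find_addresses(message))
    # A label in front changes nothing about what the scan finds.
    print(find_addresses("email:kevin@ssd.intel.com"))
```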
Our current URLs are "easily human distinguishable", or at least as easy as
it gets without brackets. I'm arguing against the URL: prefix because I
don't think
it buys us anything except possibly some sense of balance with UR* labels
to come, but that's what we're debating. I don't think putting URL: in
front of existing URLs makes them any easier to spot. Right now, it appears
the extra URL: prefix is just that, an EXTRA PREFIX.
ka