To: uri@bunyip.com
Subject: diffs from IETF-URI-URL-05.txt
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <94Aug14.000301pdt.2761@golden.parc.xerox.com>
Date: Sun, 14 Aug 1994 00:02:56 PDT
I tried to be conservative in addressing the comments received.
The diffs are large, however:
* One section was reorganized; the wording changes were small,
but the resulting diffs are large.
* I tried to capitalize Gopher consistently
* I removed the reference to Cliff Lynch's paper, which he cannot
remember and which a number of us have had difficulty accessing.
However, this changed the numbers of all the references after it.
The possibly controversial parts were:
* I reorganized the section on safe, unsafe, and reserved characters to
try to avoid some of the confusion that it was (still) engendering.
* I shortened the acknowledgement section to remove some of the
blow-by-blow, while leaving in as many names as were there before.
There's a risk of leaving out appropriate credit in any kind of
acknowledgement; I hope no one feels slighted.
* I added a rule that a hyphen immediately before a line break should
be ignored when extracting a URL embedded in text or printed material.
Please read what it says before flaming.
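
To make the hyphen rule concrete, here is a rough Python sketch of how a reader of the new text might extract URLs from running prose. It is only an illustration: the function name extract_urls is hypothetical, and it assumes URLs are wrapped as <URL:...> the way the draft's own examples are.

```python
import re

# Assumption: URLs appear inside the "<URL:...>" wrapper used in the
# draft's examples. extract_urls is a hypothetical helper name.
_WRAPPED = re.compile(r"<URL:(.*?)>", re.DOTALL)

def extract_urls(text):
    """Return the URLs found in text, with line-breaking residue removed."""
    urls = []
    for match in _WRAPPED.finditer(text):
        body = match.group(1)
        # A hyphen directly followed by a line break is treated as a
        # typesetter's artifact and dropped along with the break.
        body = re.sub(r"-[ \t]*\r?\n", "", body)
        # All remaining whitespace (spaces, tabs, line breaks) is ignored.
        body = re.sub(r"\s+", "", body)
        urls.append(body)
    return urls

sample = "pick it up from <URL:ftp://ds.in-\nternic.net/rfc>."
print(extract_urls(sample))  # ['ftp://ds.internic.net/rfc']
```

A URL whose real text has a hyphen at the point where the line breaks would be damaged by this, which is why the draft also says that no whitespace should be introduced after a "-".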
=========================================================================
1d0
<
3,4c2,3
< draft-ietf-uri-url-05.txt L. Masinter
< Expires March 4, 1995 M. McCahill
---
> draft-ietf-uri-url-06.txt L. Masinter
> Expires March 13, 1995 M. McCahill
6c5
< August 4, 1994
---
> August 13, 1994
29c28
< This Internet Draft expires March 4, 1995.
---
> This Internet Draft expires March 13, 1995.
35c34
< access of resources on the Internet.
---
> access of resources via the Internet.
54c53
< for a resource available on the Internet. Just as there are many
---
> for a resource available via the Internet. Just as there are many
61,64c60,65
< The syntax is described in two parts. First, we give the syntax
< rules of a completely specified URL; second, we give the rules
< under which parts of the URL may be omitted in a well-defined
< context.
---
> URLs are used to `locate' resources, by providing an abstract
> identification of the resource location. Having located a
> resource, a system may perform a variety of operations on the
> resource, as might be characterized by such words as `access',
> `update', `replace', `find attributes'. In general, only the
> `access' method needs to be specified for any URL scheme.
72c73
< A the URL contains the name of the scheme being used (<scheme>)
---
> A URL contains the name of the scheme being used (<scheme>)
95,98d95
< In any circumstance, only printable ASCII characters are allowed:
< URLs may not contain space or other non-printable characters. These
< and the character "%" must always be encoded.
<
108,119c105,122
< Most characters mean the same thing when represented as themselves
< as when represented encoded; however, this is not true for reserved
< characters: encoding a reserved character for a particular scheme
< may change the semantics of a URL.
<
< There are a number of characters whose use in URLs is _unsafe_;
< characters can be unsafe for a number of reasons. The characters
< "<" and ">" are unsafe because they are used as the delimiters
< around URLs in free text; the quote mark (""") is used to delimit
< URLs in other systems. The character "#" is unsafe because it is
< used in World Wide Web and in other systems to delimit a URL from a
< fragment identifier that might follow it. Other characters are
---
> Usually, a URL has the same interpretation when a byte is
> represented by a character and when it is represented by its hex
> encoding. However, this is not true for reserved characters:
> encoding a reserved character for a particular scheme may change
> the semantics of a URL.
>
> In any circumstance, only printable ASCII characters are allowed in
> URLs: URLs may not contain space or other non-printable characters.
>
> There are a number of printable ASCII characters whose use in URLs
> is _unsafe_; characters can be unsafe for a number of reasons. The
> characters "<" and ">" are unsafe because they are used as the
> delimiters around URLs in free text; the quote mark (""") is used
> to delimit URLs in some systems. The character "#" is unsafe, and
> should always be encoded, because it is used in World Wide Web and
> in other systems to delimit a URL from a fragment/anchor identifier
> that might follow it. The character "%" is unsafe because it is
> used for encodings of other characters. Other characters are
121,127c124,131
< known to modify such characters. All unsafe characters should
< always be encoded within a URL. For example, the character "#"
< should always be encoded within URLs, even in systems that do not
< normally deal with fragment identifiers, so that if the URL is
< copied into another system that does use fragments it will not be
< necessary to change the URL encoding.
<
---
> known to modify such characters.
>
> All unsafe characters should always be encoded within a URL. For
> example, the character "#" should always be encoded within URLs,
> even in systems that do not normally deal with fragment or anchor
> identifiers, so that if the URL is copied into another system that
> does use them, it will not be necessary to change the URL encoding.
>
129,131c133,153
< reserved purposes, "$", "-", "_", ".", and "+" are safe and may be
< transmitted unencoded. Even so, safe characters _may_ be encoded
< within the scheme specific part of a URL.
---
> reserved purposes, "$", "-", "_", ".", and "+" may be used
> unencoded.
>
> On the other hand, even safe characters such as alphanumerics _may_
> be encoded, as long as they are not being used for a reserved
> purpose.
>
> 2.3 Hierarchical schemes and relative links
>
> In some cases, URLs are used to locate resources that contain
> pointers to _other_ resources. In some cases, those pointers are
> represented as _relative links_ where the expression of the
> location of the second resource is in terms of "in the same place
> as this one except with the following relative path". Relative
> links are not described in this document. However, the use of
> relative links depends on the original URL containing a
> hierarchical structure against which the relative link is based.
>
> Some URL schemes (such as the ftp, http, and file schemes) contain
> names that can be considered hierarchical; the components of the
> hierarchy are separated by "/".
195,198c217,220
< The (optional) port number to connect to. Most schemes
< designate protocols that have a default port number. Another
< port number may optionally be supplied, in decimal, separated
< from the host by a colon.
---
> The port number to connect to. Most schemes designate
> protocols that have a default port number. Another port number
> may optionally be supplied, in decimal, separated from the
> host by a colon. If the port is omitted, the colon is as well.
207,208c229,230
< The url-path is interpreted in a manner dependent on the scheme
< being used.
---
> The url-path syntax depends on the scheme being use, as does the
> manner in which it is interpreted.
220,222c242,246
< A user name and password may be supplied. If no user name or
< password is supplied and one is requested by the FTP server, the
< conventions for "anonymous" FTP are to be used, as follows:
---
> A user name and password may be supplied; they are used in the ftp
> "USER" and "PASS" commands when initially making the connection to
> the FTP server. If no user name or password is supplied and one is
> requested by the FTP server, the conventions for "anonymous" FTP
> are to be used, as follows:
266,269c290,299
< <URL:ftp://myname@host.dom/etc/motd> which would "CWD etc",
< relative to the default directory for "myname", or <URL:ftp:
< //myname@host.dom//etc/motd>, which would "CWD " with a null
< argument and then "RETR motd".
---
> <URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then
> "RETR motd"; the initial "CWD" might be executed relative to the
> default directory for "myname". On the other hand,
> <URL:ftp://myname@host.dom//etc/motd>, would "CWD " with a null
> argument, then "CWD etc", and then "RETR motd".
>
> FTP URLs may also be used for other operations; for example, it is
> possible to update a file on a remote file server, or infer
> information about it from the directory listings. The mechanism for
> doing so is not spelled out here.
328c358
< The base Gopher protocol is specified in RFC 1436 and supports
---
> The base Gopher protocol is described in RFC 1436 and supports
331c361
< protocol and is specified in [2]. Gopher+ supports associating
---
> protocol and is described in [2]. Gopher+ supports associating
356c386
< gopher selector strings are a sequence of 8-bit bytes which may
---
> Gopher selector strings are a sequence of 8-bit bytes which may
358,359c388,389
< clients specify which item to retrieve by sending the gopher
< selector string to a gopher server.
---
> clients specify which item to retrieve by sending the Gopher
> selector string to a Gopher server.
364c394
< Note that some gopher <selector> strings begin with a copy of the
---
> Note that some Gopher <selector> strings begin with a copy of the
366,368c396,398
< twice consecutively. The gopher selector string may be an empty
< string; this is how gopher clients refer to the top-level directory
< on a gopher server.
---
> twice consecutively. The Gopher selector string may be an empty
> string; this is how Gopher clients refer to the top-level directory
> on a Gopher server.
372c402
< If the URL refers to a search to be submitted to a gopher search
---
> If the URL refers to a search to be submitted to a Gopher search
374,376c404,406
< search string. To submit a search to a gopher search engine, the
< gopher client sends the selector string, a tab, and the search
< string to the gopher server.
---
> search string. To submit a search to a Gopher search engine, the
> Gopher client sends the <selector> string (after decoding), a tab,
> and the search string to the Gopher server.
380,381c410,411
< URLs for Gopher+ items are have a second encoded tab and a
< gopher+ string. Note that in this case, the %09<search> string must
---
> URLs for Gopher+ items have a second encoded tab (%09) and a
> Gopher+ string. Note that in this case, the %09<search> string must
390,402c420,422
< connect to the server and send the gopher selector, followed
< optionally by a tab and the search string, followed by a tab and
< the Gopher+ commands.
<
< More explicitly, if the Gopher+ URL refers to a Gopher search type
< (that is, if the Gopher type is 7), the client sends to the gopher
< server the gopher selector string, followed by a tab, followed the
< search string, followed by a tab, followed by the gopher+ commands.
<
< If the Gopher+ URL does _not_ refer to a Gopher search (when the
< Gopher type is not 7), the Gopher client sends to the server the
< gopher selector string, followed by a tab, followed by the gopher+
< commands.
---
> connect to the server and send the Gopher selector, followed
> optionally by a tab and the search string (if the <search> element
> is not empty), followed by a tab and the Gopher+ commands.
407c427
< Gopher+ items are tagged with either a "+" (denoting gopher+ items)
---
> Gopher+ items are tagged with either a "+" (denoting Gopher+ items)
421c441
< do this but depend on the "?" tag in the gopher+ item description
---
> do this but depend on the "?" tag in the Gopher+ item description
461c481
< The gopher+ string for a URL that refers to an item referenced by
---
> The gopher+_string for a URL that refers to an item referenced by
464c484
< The gopher+ string is of the form:
---
> The gopher+_string is of the form:
468c488
< To retrieve this item, the gopher client sends:
---
> To retrieve this item, the Gopher client sends:
476c496
< to the gopher server.
---
> to the Gopher server.
563,564c583,587
< interactive service. In practice, the <user> and <password>
< supplied are advisory only.
---
> interactive service. Remote interactive services vary widely in the
> means by which they allow remote logins; in practice, the <user>
> and <password> supplied are advisory only: clients accessing a
> telnet URL merely advise the user of the suggested username and
> password.
570c593
< described in [6]; the WAIS protocol is described in RFC 1625 [19].
---
> described in [6]; the WAIS protocol is described in RFC 1625 [18].
611c634
< directory path of the form <directory>/<directory>/<name>.
---
> directory path of the form <directory>/<directory>/.../<name>.
633c656
< is described elsewhere [16].
---
> is described elsewhere [15].
845,870c868,875
< particularly stimulated by articles by Clifford Lynch (1991),
< Brewster Kahle (1991) and Wengyik Yeong (1991b). Contributions from
< John Curran (NEARNET), Clifford Neuman (ISI) Ed Vielmetti (MSEN)
< and later the IETF URL BOF and URI working group have been
< incorporated into this issue of this paper.
<
< The draft url4 (Internet Draft 00) was generated from url3
< following discussion and overall approval of the URL working group
< on 29 March 1993. The paper url3 had been generated from udi2 in
< the light of discussion at the UDI BOF meeting at the Boston IETF
< in July 1992. Draft url4 was Internet Draft 00. Draft url5
< incorporated changes suggested by Clifford Neuman, and draft url6
< (ID 01) incorporated character group changes and a few other fixes
< defined by the IETF URI WG in submitting it as a proposed standard.
< URL7 (Internet Draft 02) incorporated changes introduced at the
< Amsterdam IETF and refined in net discussion.
<
< The draft 03 includes changes made at Houston in Nov 93, and on the
< net before Seattle March 1994. Draft 04 responded to various
< suggestions and remarks made since the Seattle March 1994 meeting,
< special thanks to Dan Connolly, Ned Freed, Roy Fielding, and Guido
< van Rossum for their careful readings and corrections. Draft 05
< makes a number of minor modifications suggested at or just before
< the Toronto July 1994 IETF meeting. This draft incorporates
< numerous revisions and edits as suggested by the active members of
< the IETF URI Working Group.
---
> particularly stimulated by articles by Clifford Lynch, Brewster
> Kahle [11] and Wengyik Yeong [19]. Contributions from John Curran,
> Clifford Neuman, Ed Vielmetti and later the IETF URL BOF and URI
> working group were incorporated.
>
> Most recently, careful readings and comments by Dan Connolly, Ned
> Freed, Roy Fielding, Guido van Rossum and many others have helped
> shape the current draft.
893,895c898
< In some cases, extra whitespace may need to be added to break long
< URLs across lines. The whitespace is ignored when extracting the
< URL. In the case where a fragment identifier is associated with a
---
> In the case where a fragment/anchor identifier is associated with a
899c902,918
< Examples
---
> In some cases, extra whitespace (spaces, linebreaks, tabs, etc.)
> may need to be added to break long URLs across lines. The
> whitespace is ignored when extracting the URL.
>
> Special caution must be used with regard to hyphens: because some
> typesetters and printers may erroneously introduce an extraneous
> hyphen at the end of line when breaking a line, no whitespace
> should be introduced after a "-" character. When extracting the URL
> from text or printed material, a hyphen followed by a line break
> may be ignored as well.
>
> Examples:
>
> Yes, Jim, I found it under <URL:ftp://info.cern.ch/pub/www/doc;
> type=d> but you can probably pick it up from <URL:ftp://ds.in-
> ternic.net/rfc>. Note the warning in <URL:http://ds.internic.
> net/instructions/overview.html#WARNING>.
901,905d919
< Yes, Jim, I found it under <URL:ftp://info.cern.ch/pub/www/doc
< ;type=d> but you can probably pick it up from <URL:ftp://ds.inter
< nic.net/rfc>. Note the warning in <URL:http://ds.internic.net/
< instructions/overview.html#WARNING>.
<
955,956c969,970
< Locators", to be published as RFC????. Available as an internet
< draft <URL:ftp://ds.internic.net/internet-drafts/
---
> Locators", to be published as an RFC. Available as an Internet
> Draft <URL:ftp://ds.internic.net/internet-drafts/
959,966c973,975
< [14] Lynch, C., (1991) Coalition for Networked Information.
< "Workshop on ID and Reference Structures for Networked
< Information", November 1991. See
< <URL:wais://quake.think.com/wais-discussion-archives?lynch>
<
< [15] Mockapetris, P. (1987) "Domain Names - Concepts and
< Facilities." RFC1034, USC-ISI,
< <URL:ftp://ds.internic.net/rfc/rfc1034.txt>
---
> [14] Mockapetris, P. (1987) "Domain Names - Concepts and
> Facilities." RFC 1034, USC-ISI, <URL:ftp://ds.internic.net/rfc/
> rfc1034.txt>
968c977
< [16] Neuman, B. Clifford, and Augart, Steven (1993). "The Prospero
---
> [15] Neuman, B. Clifford, and Augart, Steven (1993). "The Prospero
973c982
< [17] Postel, J. and Reynolds, J. (1985) "File Transfer Protocol
---
> [16] Postel, J. and Reynolds, J. (1985) "File Transfer Protocol
976c985
< [18] Sollins, K. and Masinter, L. (1994) "Requirements for Uniform
---
> [17] Sollins, K. and Masinter, L. (1994) "Requirements for Uniform
981c990
< [19] St. Pierre, M, et.al., (1994) "WAIS over Z39.50-1988", RFC1625
---
> [18] St. Pierre, M, et.al., (1994) "WAIS over Z39.50-1988", RFC1625
984c993
< [20] Yeong, W. (1991) "Towards Networked Information Retrieval",
---
> [19] Yeong, W. (1991) "Towards Networked Information Retrieval",
1018,1020d1026
<
<
<
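
The revised FTP wording in hunk 266,269c290,299 can be read as a simple mapping from the url-path to a command sequence. The Python sketch below follows that reading only; ftp_commands is a hypothetical name, and the handling is an assumption that ignores ";type=" suffixes, %-encoding, and a URL with no path at all.

```python
from urllib.parse import urlsplit

def ftp_commands(url):
    """Sketch: list the CWD/RETR commands implied by an ftp URL's path."""
    # Drop the single "/" that separates host[:port] from the url-path.
    url_path = urlsplit(url).path[1:]
    # Every segment but the last becomes a CWD; the last is retrieved.
    *directories, name = url_path.split("/")
    commands = ["CWD " + d for d in directories]
    commands.append("RETR " + name)
    return commands

print(ftp_commands("ftp://myname@host.dom/etc/motd"))
# ['CWD etc', 'RETR motd']
print(ftp_commands("ftp://myname@host.dom//etc/motd"))
# ['CWD ', 'CWD etc', 'RETR motd']
```

The second call shows the case the new text spells out: the doubled slash yields an empty first segment, hence a "CWD " with a null argument before "CWD etc".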