diffs from IETF-URI-URL-05.txt

Larry Masinter (masinter@parc.xerox.com)
Sun, 14 Aug 1994 00:02:56 PDT

To: uri@bunyip.com
Subject: diffs from IETF-URI-URL-05.txt
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <94Aug14.000301pdt.2761@golden.parc.xerox.com>
Date: Sun, 14 Aug 1994 00:02:56 PDT

I tried to be conservative in addressing the comments received.
The diffs are large, however:

* One section was reorganized; although the wording changes
were small, the resulting diffs are large.
* I tried to capitalize Gopher consistently.
* I removed the reference to Cliff Lynch's paper, which he cannot
remember and which a number of us have difficulty accessing.
Removing it renumbered every reference that followed it.

The possibly controversial parts were:

* I reorganized the section on safe, unsafe, and reserved characters to
try to avoid some of the confusion that it was (still) causing.
* I shortened the acknowledgement section to remove some of the
blow-by-blow, while leaving in as many names as were there before.
There's a risk of leaving out appropriate credit in any kind of
acknowledgement; I hope no one feels slighted.
* I added a rule that a hyphen immediately before a line break may be
ignored when a URL is extracted from text or printed material. Please
read what it says before flaming.
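
As a rough illustration of the hyphen rule, here is a minimal sketch in
Python. It assumes URLs are wrapped as <URL:...>, as in the draft's own
examples. The helper name extract_urls and the regular expressions are
hypothetical and do not come from the draft. The draft only says that a
hyphen followed by a line break "may" be ignored; this sketch always
drops it.

```python
import re

# Hypothetical helper, not part of the draft: finds URLs that appear in
# free text wrapped as <URL:...>, possibly broken across lines.
_WRAPPED = re.compile(r"<URL:(.*?)>", re.DOTALL)


def extract_urls(text):
    """Return the URLs found in text, with line-breaking residue removed."""
    urls = []
    for match in _WRAPPED.finditer(text):
        raw = match.group(1)
        # Assumption: a hyphen directly before a line break was introduced
        # by a typesetter, so it is dropped together with the break.
        raw = re.sub(r"-[ \t]*\r?\n\s*", "", raw)
        # Any remaining whitespace (spaces, tabs, line breaks) is ignored.
        urls.append(re.sub(r"\s+", "", raw))
    return urls
```

Under this rule a URL that genuinely contains a hyphen at the break
point would lose it, which is why the new text also says no whitespace
should be introduced after a "-" character.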

=========================================================================
1d0
<
3,4c2,3
< draft-ietf-uri-url-05.txt L. Masinter
< Expires March 4, 1995 M. McCahill
---
> draft-ietf-uri-url-06.txt                                    L. Masinter
> Expires March 13, 1995                                       M. McCahill
6c5
<                                                           August 4, 1994
---
>                                                          August 13, 1994
29c28
<      This Internet Draft expires March 4, 1995.
---
>      This Internet Draft expires March 13, 1995.
35c34
<    access of resources on the Internet.
---
>    access of resources via the Internet.
54c53
<    for a resource available on the Internet. Just as there are many
---
>    for a resource available via the Internet. Just as there are many
61,64c60,65
<    The syntax is described in two parts. First, we give the syntax
<    rules of a completely specified URL; second, we give the rules
<    under which parts of the URL may be omitted in a well-defined
<    context.
---
>    URLs are used to `locate' resources, by providing an abstract
>    identification of the resource location.  Having located a
>    resource, a system may perform a variety of operations on the
>    resource, as might be characterized by such words as `access',
>    `update', `replace', `find attributes'. In general, only the
>    `access' method needs to be specified for any URL scheme.
72c73
<    A the URL contains the name of the scheme being used (<scheme>)
---
>    A URL contains the name of the scheme being used (<scheme>)
95,98d95
<    In any circumstance, only printable ASCII characters are allowed:
<    URLs may not contain space or other non-printable characters. These
<    and the character "%" must always be encoded.
<    
108,119c105,122
<    Most characters mean the same thing when represented as themselves
<    as when represented encoded; however, this is not true for reserved
<    characters: encoding a reserved character for a particular scheme
<    may change the semantics of a URL.
< 
<    There are a number of characters whose use in URLs is _unsafe_;
<    characters can be unsafe for a number of reasons.  The characters
<    "<" and ">" are unsafe because they are used as the delimiters
<    around URLs in free text; the quote mark (""") is used to delimit
<    URLs in other systems. The character "#" is unsafe because it is
<    used in World Wide Web and in other systems to delimit a URL from a
<    fragment identifier that might follow it.  Other characters are
---
>    Usually, a URL has the same interpretation when a byte is
>    represented by a character and when it is represented by its hex
>    encoding. However, this is not true for reserved characters:
>    encoding a reserved character for a particular scheme may change
>    the semantics of a URL.
> 
>    In any circumstance, only printable ASCII characters are allowed in
>    URLs: URLs may not contain space or other non-printable characters.
> 
>    There are a number of printable ASCII characters whose use in URLs
>    is _unsafe_; characters can be unsafe for a number of reasons.  The
>    characters "<" and ">" are unsafe because they are used as the
>    delimiters around URLs in free text; the quote mark (""") is used
>    to delimit URLs in some systems.  The character "#" is unsafe, and
>    should always be encoded, because it is used in World Wide Web and
>    in other systems to delimit a URL from a fragment/anchor identifier
>    that might follow it.  The character "%" is unsafe because it is
>    used for encodings of other characters.  Other characters are
121,127c124,131
<    known to modify such characters. All unsafe characters should
<    always be encoded within a URL. For example, the character "#"
<    should always be encoded within URLs, even in systems that do not
<    normally deal with fragment identifiers, so that if the URL is
<    copied into another system that does use fragments it will not be
<    necessary to change the URL encoding.
<     
---
>    known to modify such characters.
> 
>    All unsafe characters should always be encoded within a URL. For
>    example, the character "#" should always be encoded within URLs,
>    even in systems that do not normally deal with fragment or anchor
>    identifiers, so that if the URL is copied into another system that
>    does use them, it will not be necessary to change the URL encoding.
> 
129,131c133,153
<    reserved purposes, "$", "-", "_", ".", and "+" are safe and may be
<    transmitted unencoded. Even so, safe characters _may_ be encoded
<    within the scheme specific part of a URL.
---
>    reserved purposes, "$", "-", "_", ".", and "+" may be used
>    unencoded.
> 
>    On the other hand, even safe characters such as alphanumerics _may_
>    be encoded, as long as they are not being used for a reserved
>    purpose.
> 
> 2.3 Hierarchical schemes and relative links
> 
>    In some cases, URLs are used to locate resources that contain
>    pointers to _other_ resources. In some cases, those pointers are
>    represented as _relative links_ where the expression of the
>    location of the second resource is in terms of "in the same place
>    as this one except with the following relative path". Relative
>    links are not described in this document. However, the use of
>    relative links depends on the original URL containing a
>    hierarchical structure against which the relative link is based.
> 
>    Some URL schemes (such as the ftp, http, and file schemes) contain
>    names that can be considered hierarchical; the components of the
>    hierarchy are separated by "/".
195,198c217,220
<         The (optional) port number to connect to. Most schemes
<         designate protocols that have a default port number. Another
<         port number may optionally be supplied, in decimal, separated
<         from the host by a colon.
---
>         The port number to connect to. Most schemes designate
>         protocols that have a default port number. Another port number
>         may optionally be supplied, in decimal, separated from the
>         host by a colon. If the port is omitted, the colon is as well.
207,208c229,230
<    The url-path is interpreted in a manner dependent on the scheme
<    being used.
---
>    The url-path syntax depends on the scheme being use, as does the
>    manner in which it is interpreted.
220,222c242,246
<    A user name and password may be supplied. If no user name or
<    password is supplied and one is requested by the FTP server, the
<    conventions for "anonymous" FTP are to be used, as follows:
---
>    A user name and password may be supplied; they are used in the ftp
>    "USER" and "PASS" commands when initially making the connection to
>    the FTP server.  If no user name or password is supplied and one is
>    requested by the FTP server, the conventions for "anonymous" FTP
>    are to be used, as follows:
266,269c290,299
<    <URL:ftp://myname@host.dom/etc/motd> which would "CWD etc",
<    relative to the default directory for "myname", or <URL:ftp:
<    //myname@host.dom//etc/motd>, which would "CWD " with a null
<    argument and then "RETR motd".
---
>    <URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then
>    "RETR motd"; the initial "CWD" might be executed relative to the
>    default directory for "myname". On the other hand,
>    <URL:ftp://myname@host.dom//etc/motd>, would "CWD " with a null
>    argument, then "CWD etc", and then "RETR motd".
> 
>    FTP URLs may also be used for other operations; for example, it is
>    possible to update a file on a remote file server, or infer
>    information about it from the directory listings. The mechanism for
>    doing so is not spelled out here.
328c358
<    The base Gopher protocol is specified in RFC 1436 and supports
---
>    The base Gopher protocol is described in RFC 1436 and supports
331c361
<    protocol and is specified in [2]. Gopher+ supports associating
---
>    protocol and is described in [2]. Gopher+ supports associating
356c386
<    gopher selector strings are a sequence of 8-bit bytes which may
---
>    Gopher selector strings are a sequence of 8-bit bytes which may
358,359c388,389
<    clients specify which item to retrieve by sending the gopher
<    selector string to a gopher server.
---
>    clients specify which item to retrieve by sending the Gopher
>    selector string to a Gopher server.
364c394
<    Note that some gopher <selector> strings begin with a copy of the
---
>    Note that some Gopher <selector> strings begin with a copy of the
366,368c396,398
<    twice consecutively. The gopher selector string may be an empty
<    string; this is how gopher clients refer to the top-level directory
<    on a gopher server.
---
>    twice consecutively. The Gopher selector string may be an empty
>    string; this is how Gopher clients refer to the top-level directory
>    on a Gopher server.
372c402
<    If the URL refers to a search to be submitted to a gopher search
---
>    If the URL refers to a search to be submitted to a Gopher search
374,376c404,406
<    search string. To submit a search to a gopher search engine, the
<    gopher client sends the selector string, a tab, and the search
<    string to the gopher server.
---
>    search string. To submit a search to a Gopher search engine, the
>    Gopher client sends the <selector> string (after decoding), a tab,
>    and the search string to the Gopher server. 
380,381c410,411
<    URLs for Gopher+ items are have a second encoded tab and a
<    gopher+ string. Note that in this case, the %09<search> string must
---
>    URLs for Gopher+ items have a second encoded tab (%09) and a
>    Gopher+ string. Note that in this case, the %09<search> string must
390,402c420,422
<    connect to the server and send the gopher selector, followed
<    optionally by a tab and the search string, followed by a tab and
<    the Gopher+ commands.
< 
<    More explicitly, if the Gopher+ URL refers to a Gopher search type
<    (that is, if the Gopher type is 7), the client sends to the gopher
<    server the gopher selector string, followed by a tab, followed the
<    search string, followed by a tab, followed by the gopher+ commands.
< 
<    If the Gopher+ URL does _not_ refer to a Gopher search (when the
<    Gopher type is not 7), the Gopher client sends to the server the
<    gopher selector string, followed by a tab, followed by the gopher+
<    commands.
---
>    connect to the server and send the Gopher selector, followed
>    optionally by a tab and the search string (if the <search> element
>    is not empty), followed by a tab and the Gopher+ commands.
407c427
<    Gopher+ items are tagged with either a "+" (denoting gopher+ items)
---
>    Gopher+ items are tagged with either a "+" (denoting Gopher+ items)
421c441
<    do this but depend on the "?" tag in the gopher+ item description
---
>    do this but depend on the "?" tag in the Gopher+ item description
461c481
<    The gopher+ string for a URL that refers to an item referenced by
---
>    The gopher+_string for a URL that refers to an item referenced by
464c484
<    The gopher+ string is of the form:
---
>    The gopher+_string is of the form:
468c488
<    To retrieve this item, the gopher client sends:
---
>    To retrieve this item, the Gopher client sends:
476c496
<    to the gopher server.
---
>    to the Gopher server.
563,564c583,587
<    interactive service. In practice, the <user> and <password>
<    supplied are advisory only.
---
>    interactive service. Remote interactive services vary widely in the
>    means by which they allow remote logins; in practice, the <user>
>    and <password> supplied are advisory only: clients accessing a
>    telnet URL merely advise the user of the suggested username and
>    password.
570c593
<    described in [6]; the WAIS protocol is described in RFC 1625 [19].
---
>    described in [6]; the WAIS protocol is described in RFC 1625 [18].
611c634
<    directory path of the form <directory>/<directory>/<name>.
---
>    directory path of the form <directory>/<directory>/.../<name>.
633c656
<    is described elsewhere [16].
---
>    is described elsewhere [15].
845,870c868,875
<    particularly stimulated by articles by Clifford Lynch (1991),
<    Brewster Kahle (1991) and Wengyik Yeong (1991b). Contributions from
<    John Curran (NEARNET), Clifford Neuman (ISI) Ed Vielmetti (MSEN)
<    and later the IETF URL BOF and URI working group have been
<    incorporated into this issue of this paper.
< 
<    The draft url4 (Internet Draft 00) was generated from url3
<    following discussion and overall approval of the URL working group
<    on 29 March 1993. The paper url3 had been generated from udi2 in
<    the light of discussion at the UDI BOF meeting at the Boston IETF
<    in July 1992. Draft url4 was Internet Draft 00. Draft url5
<    incorporated changes suggested by Clifford Neuman, and draft url6
<    (ID 01) incorporated character group changes and a few other fixes
<    defined by the IETF URI WG in submitting it as a proposed standard.
<    URL7 (Internet Draft 02) incorporated changes introduced at the
<    Amsterdam IETF and refined in net discussion.
< 
<    The draft 03 includes changes made at Houston in Nov 93, and on the
<    net before Seattle March 1994.  Draft 04 responded to various
<    suggestions and remarks made since the Seattle March 1994 meeting,
<    special thanks to Dan Connolly, Ned Freed, Roy Fielding, and Guido
<    van Rossum for their careful readings and corrections.  Draft 05
<    makes a number of minor modifications suggested at or just before
<    the Toronto July 1994 IETF meeting. This draft incorporates
<    numerous revisions and edits as suggested by the active members of
<    the IETF URI Working Group.
---
>    particularly stimulated by articles by Clifford Lynch, Brewster
>    Kahle [11] and Wengyik Yeong [19]. Contributions from John Curran,
>    Clifford Neuman, Ed Vielmetti and later the IETF URL BOF and URI
>    working group were incorporated.
> 
>    Most recently, careful readings and comments by Dan Connolly, Ned
>    Freed, Roy Fielding, Guido van Rossum and many others have helped
>    shape the current draft.
893,895c898
<    In some cases, extra whitespace may need to be added to break long
<    URLs across lines. The whitespace is ignored when extracting the
<    URL. In the case where a fragment identifier is associated with a
---
>    In the case where a fragment/anchor identifier is associated with a
899c902,918
<    Examples
---
>    In some cases, extra whitespace (spaces, linebreaks, tabs, etc.)
>    may need to be added to break long URLs across lines. The
>    whitespace is ignored when extracting the URL.
> 
>    Special caution must be used with regard to hyphens: because some
>    typesetters and printers may erroneously introduce an extraneous
>    hyphen at the end of line when breaking a line, no whitespace
>    should be introduced after a "-" character. When extracting the URL
>    from text or printed material, a hyphen followed by a line break
>    may be ignored as well.
> 
>    Examples:
> 
>       Yes, Jim, I found it under <URL:ftp://info.cern.ch/pub/www/doc;
>       type=d> but you can probably pick it up from <URL:ftp://ds.in-
>       ternic.net/rfc>.  Note the warning in <URL:http://ds.internic.
>       net/instructions/overview.html#WARNING>.
901,905d919
<    Yes, Jim, I found it under <URL:ftp://info.cern.ch/pub/www/doc
<    ;type=d> but you can probably pick it up from <URL:ftp://ds.inter
<    nic.net/rfc>.  Note the warning in <URL:http://ds.internic.net/
<    instructions/overview.html#WARNING>.
< 
955,956c969,970
<        Locators", to be published as RFC????. Available as an internet
<        draft <URL:ftp://ds.internic.net/internet-drafts/
---
>        Locators", to be published as an RFC. Available as an Internet
>        Draft <URL:ftp://ds.internic.net/internet-drafts/
959,966c973,975
<   [14] Lynch, C., (1991) Coalition for Networked Information.
<        "Workshop on ID and Reference Structures for Networked
<        Information", November 1991. See
<        <URL:wais://quake.think.com/wais-discussion-archives?lynch>
< 
<   [15] Mockapetris, P. (1987) "Domain Names - Concepts and 
<        Facilities." RFC1034, USC-ISI,
<        <URL:ftp://ds.internic.net/rfc/rfc1034.txt>
---
>   [14] Mockapetris, P. (1987) "Domain Names - Concepts and
>        Facilities." RFC 1034, USC-ISI, <URL:ftp://ds.internic.net/rfc/
>        rfc1034.txt>
968c977
<   [16] Neuman, B. Clifford, and Augart, Steven (1993). "The Prospero
---
>   [15] Neuman, B. Clifford, and Augart, Steven (1993). "The Prospero
973c982
<   [17] Postel, J. and Reynolds, J. (1985) "File Transfer Protocol 
---
>   [16] Postel, J. and Reynolds, J. (1985) "File Transfer Protocol 
976c985
<   [18] Sollins, K. and Masinter, L. (1994) "Requirements for Uniform
---
>   [17] Sollins, K. and Masinter, L. (1994) "Requirements for Uniform
981c990
<   [19] St. Pierre, M, et.al., (1994) "WAIS over Z39.50-1988", RFC1625
---
>   [18] St. Pierre, M, et.al., (1994) "WAIS over Z39.50-1988", RFC1625
984c993
<   [20] Yeong, W. (1991) "Towards Networked Information Retrieval",
---
>   [19] Yeong, W. (1991) "Towards Networked Information Retrieval",
1018,1020d1026
< 
< 
<
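
The reorganized text on unsafe characters in the diff above can be read
as a simple encoding rule. The following is a minimal sketch in Python,
under stated assumptions: the unsafe set contains only the characters
that the quoted text names explicitly ("<", ">", the quote mark, "#",
and "%"), plus space and anything outside printable ASCII. The draft
lists further unsafe characters that this diff does not show, so the set
is incomplete. The name encode_unsafe is hypothetical.

```python
# Characters the quoted draft text names as unsafe. Assumption: the full
# draft lists more, which are not visible in this diff.
UNSAFE = set('<>"#%')


def encode_unsafe(data: bytes) -> str:
    """Hex-encode unsafe bytes in raw (not yet encoded) data."""
    out = []
    for byte in data:
        ch = chr(byte)
        # Encode the named unsafe characters, space and control characters
        # (0x00-0x20), and everything from DEL upward (0x7F-0xFF) as "%"
        # followed by two hex digits.
        if ch in UNSAFE or byte <= 0x20 or byte >= 0x7F:
            out.append("%%%02X" % byte)
        else:
            out.append(ch)
    return "".join(out)
```

Because "%" is itself encoded, the function is meant for raw data only;
applying it to an already encoded URL would encode it twice. It also
does not handle reserved characters, whose treatment depends on the
scheme.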