Date: Mon, 28 Nov 1994 14:07:28 -0500
From: hudson@yough.ucs.umass.edu (Rick Hudson)
Message-Id: <9411281907.AA00654@yough.ucs.umass.edu>
To: jak@violet.berkeley.edu, uri@bunyip.com, z3950iw@nervm.nerdc.ufl.edu
Subject: URLs and Z39.50
> Uniform Resource Locators for Z39.50
> 1. Status of this Document
snip
> Distribution of this document is unlimited. Please send comments to
> jak@violet.berkeley.edu, or to the discussion lists uri@bunyip.com and
> z3950iw@nervm.nerdc.ufl.edu.
> 2. Introduction
> Z39.50 is an information retrieval protocol that does not fit neatly into a
> retrieval model designed primarily around the stateless fetch of data.
> Instead, it models a general user inquiry as a session-oriented,
> multi-step task, any step of which may be suspended temporarily while the server
> requests additional parameters from the client before continuing. Some,
> none, or all of these client/server interactions may require participation
> of the client user, depending only on the client software (the protocol
> itself makes no such requirements).
> On the other hand, retrieval of "well-known" data may be performed in a
> single step, that is, with a degenerate Z39.50 session consisting of
> exactly one protocol search request and response. Besides the basic search
> sub-service, there are several ancillary sub-services (e.g., Scan, Result
> Set Delete). Among the functions covered by combinations of the
> sub-services, two core functions emerge as appropriately handled by two
> separate URL schemes: the Session URL and the Retrieval URL.
> Using two schemes instead of one makes a critical distinction between a
> Z39.50 Session URL, which opens a client session -- leaving the user to
> close it -- and a Z39.50 Retrieval URL, which opens and closes a client
> session, then displays any retrieved results. Making this distinction at
> the scheme level allows the user interface to reflect it on to the user,
> but without actually requiring the user display formatter to parse
> otherwise opaque parts of the URL (consistent with current practice).
> 3. The Z39.50 Session URL
> The Z39.50 Session and Retrieval URLs follow the Common Internet Scheme
> Syntax as defined in RFC ???, "Uniform Resource Locators (URL)" [1]. The
> specific syntax for the Session URL is:
> z39.50s://host[:port] [/database[+database...] [?docid
> [&esn=elementset] [&rs=recordsyntax[+recordsyntax...]]]]
> This may be informally described as providing the mechanism to switch the
> user of the browser to a window in which a Z39.50 client is running.
> - Host is required.
> - Port is optional, and defaults to 210.
> - All other parameters are optional, however, if docid is present, then
>   database must be present.
> - The Z39.50 client will start a session to the specified host/port
>   (alternatively, it need not explicitly start a session, but may instead
>   utilize an already open session to the same host/port).
> - If docid is included, the client will perform the specified search (in
>   the same manner as for the retrieval URL, specified below).
> - If docid is not included, and other parameters (besides host/port) are
>   specified, the client may use those parameters as "hints". Various
>   clients may choose to treat them as requirements, or as preferences, or
>   ignore them.
> - In any case (whether a search is performed or not), the client will
>   leave the Z39.50 session open for the user, to do retrievals, new
>   searches, etc. This is the main distinction from the z39.50r URL.
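For concreteness, the Session URL rules quoted above can be sketched as a
small parser. This is only an illustration in Python: the names SessionURL
and parse_session_url are hypothetical, the example host and database names
are made up, and the sketch does no percent-decoding or other validation
beyond the rules listed in the draft.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DEFAULT_PORT = 210  # default port stated in the quoted draft


@dataclass
class SessionURL:
    """Parsed pieces of a z39.50s URL (hypothetical container type)."""
    host: str
    port: int = DEFAULT_PORT
    databases: List[str] = field(default_factory=list)
    docid: Optional[str] = None
    esn: Optional[str] = None
    record_syntaxes: List[str] = field(default_factory=list)


def parse_session_url(url: str) -> SessionURL:
    """Split a z39.50s URL according to the draft's informal syntax."""
    prefix = "z39.50s://"
    if not url.startswith(prefix):
        raise ValueError("not a z39.50s URL")
    rest = url[len(prefix):]

    # host[:port] comes first; everything after the first "/" is optional.
    hostport, _, path = rest.partition("/")
    host, _, port = hostport.partition(":")
    if not host:
        raise ValueError("host is required")
    parsed = SessionURL(host=host, port=int(port) if port else DEFAULT_PORT)

    # database[+database...] precedes the optional "?docid..." part.
    dbpart, _, query = path.partition("?")
    parsed.databases = [name for name in dbpart.split("+") if name]

    if query:
        # The first query item is the bare docid; the rest are key=value.
        docid, *params = query.split("&")
        parsed.docid = docid or None
        for param in params:
            key, _, value = param.partition("=")
            if key == "esn":
                parsed.esn = value
            elif key == "rs":
                parsed.record_syntaxes = value.split("+")

    # Rule from the draft: if docid is present, database must be present.
    if parsed.docid and not parsed.databases:
        raise ValueError("docid requires a database")
    return parsed
```

A call such as parse_session_url("z39.50s://example.org/books+maps?doc42")
would yield host "example.org", the default port, two databases, and the
docid "doc42". Whether a client treats the remaining parameters as hints or
as requirements is left open, as in the draft.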
The other option is to load up the URL with enough environment information
to recreate the server's Z39.50 environment. Similar things are being done today
to simulate browsing around relational databases. If each URL contains enough
information to recreate the environment, then the question is whether one can
do it efficiently. Software caches on the server side can greatly reduce the
normal-case cost. These caches are no more expensive than maintaining
connections and, as with other cache schemes, the replacement policy can vary.
If we take this approach, then we don't have to rely on every client being
modified to understand the new z39.50s/r URL schemes. Instead, all we need to
do is write a server program that generates the HTML with the URL holding the
environment information. Another program, called 'session', would have to be
able to recreate the environment, either by using the cache or by some more
expensive mechanism such as redoing the searches. One obvious advantage of such
an approach is that, since it is stateless as far as the client/server
interaction is concerned, one can go back to previous screens or hold URLs in
hotlists that will function in a reasonable fashion.
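As a rough sketch of the cache-or-recreate idea, here is how the 'session'
program might restore an environment from the URL parameters. Everything
named here (EnvironmentCache, restore_environment, the redo_searches
callback, and the least-recently-used replacement policy) is an assumption
made for illustration; any other replacement policy would do as well.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, Optional


class EnvironmentCache:
    """Tiny LRU cache mapping a cache_id to a recreated environment."""

    def __init__(self, capacity: int = 128) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, cache_id: str) -> Optional[Any]:
        if cache_id not in self._entries:
            return None
        # Mark the entry as most recently used.
        self._entries.move_to_end(cache_id)
        return self._entries[cache_id]

    def put(self, cache_id: str, environment: Any) -> None:
        self._entries[cache_id] = environment
        self._entries.move_to_end(cache_id)
        # Evict least recently used entries once over capacity.
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)


def restore_environment(
    params: Dict[str, str],
    cache: EnvironmentCache,
    redo_searches: Callable[[str], Any],
) -> Any:
    """Return the environment for a request, preferring the cache."""
    cache_id = params.get("cache_id")
    if cache_id is not None:
        cached = cache.get(cache_id)
        if cached is not None:
            return cached  # normal case: cheap cache hit
    # Expensive fallback: rebuild from the data carried in the URL itself.
    environment = redo_searches(params.get("environment", ""))
    if cache_id is not None:
        cache.put(cache_id, environment)
    return environment
```

Because the URL carries the environment data, a cache miss (for example, a
hotlist entry revisited days later) costs a redone search rather than an
error.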
This approach would require an HTTP server. The "time to market" of this
approach is much shorter than requiring clients to understand a new protocol.
A typical URL might look like this (note that this URL could be created
using a FORMS interface or by the 'session' program):
http://host[:port]/cgi-bin/z39.50/session[?cache_id=cache_id]
[&database=database[+database...]][&docid=docid]
[&esn=elementset][&rs=recordsyntax[+recordsyntax...]]
[&environment=environment_data]
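A minimal sketch of how the 'session' program (or a FORMS handler) might
assemble such a URL follows. The function name build_session_url and the
example values are hypothetical. The sketch assumes the parameters are
carried in a conventional query string introduced by "?", keeps "+" as the
literal separator between databases and record syntaxes as in the template,
and percent-encodes everything else.

```python
from typing import List, Optional, Sequence, Tuple
from urllib.parse import quote, urlencode


def build_session_url(
    host: str,
    port: Optional[int] = None,
    cache_id: Optional[str] = None,
    databases: Sequence[str] = (),
    docid: Optional[str] = None,
    esn: Optional[str] = None,
    record_syntaxes: Sequence[str] = (),
    environment: Optional[str] = None,
) -> str:
    """Build an http URL for the 'session' CGI program (illustrative only)."""
    pairs: List[Tuple[str, str]] = []
    if cache_id:
        pairs.append(("cache_id", cache_id))
    if databases:
        pairs.append(("database", "+".join(databases)))
    if docid:
        pairs.append(("docid", docid))
    if esn:
        pairs.append(("esn", esn))
    if record_syntaxes:
        pairs.append(("rs", "+".join(record_syntaxes)))
    if environment:
        pairs.append(("environment", environment))

    netloc = host if port is None else f"{host}:{port}"
    base = f"http://{netloc}/cgi-bin/z39.50/session"
    if not pairs:
        return base
    # quote() with safe="+" leaves the "+" separators intact and
    # percent-encodes reserved characters such as "&", "=", and spaces.
    return base + "?" + urlencode(pairs, safe="+", quote_via=quote)
```

Since "+" is kept literal here, a server that decodes the query with
ordinary form-decoding rules would see a space instead; the 'session'
program would have to split on whichever character its decoder produces.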
> 4. The Z39.50 Retrieval URL
This is even easier since you don't need to be able to recreate
environments.
Richard L. Hudson - Hudson@cs.umass.edu; (413) 545-1220;
Web Project - Office of Information Technologies
Lederle Graduate Research Center
University of Massachusetts Amherst, MA 01003