FPI and Hytime (and URI?)

Dirk Herr-Hoyman (hoymand@joe.uwex.edu)
Mon, 11 Oct 1993 08:00:44 -0500 (CDT)

From: Dirk Herr-Hoyman <hoymand@joe.uwex.edu>
Message-Id: <9310111300.AA12265@joe.uwex.edu>
Subject: FPI and Hytime (and URI?)
To: uri@bunyip.com
Date: Mon, 11 Oct 1993 08:00:44 -0500 (CDT)

While I don't want to detract from the wonderful coalescing discussion on
URNs, Tim B-L's Venn diagram brings up a question about FPIs.

 _______________________________________________________
|                                                       |
|   _______________          _______________            |
|  |  ftp:         |        |  urn:         |           |
|  |  gopher:      |        |  fpi: ?       |           |
|  |  http:        |        |               |           |
|  |  etc          |        |               |           |
|  |_______________|        |_______________|           |
|        URLs                     URNs                  |
|_______________________________________________________|
                          URIs

Perhaps not too coincidentally, a discussion about FPIs and their use in
HyTime is going on in the "Davenport Group" list. HyTime is the ISO
standard for hypertext in SGML.

What's particularly of interest, at least to me, is the syntax of FPIs.
Close to URLs, but not quite the same. And what adds gas to our fire
is the specification of version, format, etc. I don't have any strong
feelings about what the HyTime folks are doing, other than that at some
point our "train tracks must meet".

Forwarded message:
> Date: Sat, 9 Oct 1993 17:23:40 -0400
> From: "W. Eliot Kimber" <drmacro@vnet.IBM.COM>
> To: Multiple recipients of list <davenport@ora.com>
> Subject: Version Nos and ISBNs
>
> Ref: Note from MTBRYAN AT CHELTENHAM-HE.AC.UK (attached)
>
> > In all the messages going around about this subject this week not
> > one person has made reference to HyTime's concept of location ladders!
> >
> > While public identifiers based on ISBNs let you identify a particular
> > publication uniquely, they do not allow you to pick out a particular item
> > in the book referred to. For that you need a location ladder....
> >
> > All Davenport needs to do to cover versions is to standardize the form of the
> > details of the publication history. This becomes the second element of
> > the HyTime location ladder (the first being the ISBN number) and then
> > you can identify any subcomponent of the identified version by adding
> > further location qualifiers. As far as I can see this technique will be
> > as powerful as any existing referencing conventions, except that it does
> > not allow you to differentiate between issues in serial publications
> > which have been allocated ISSNs or documents that are only published electronically.
>
> I think I understand what you're getting at, but I'm not sure it solves
> the problem that needs solving. Certainly, given a set of objects
> addressed, any one of which might be the actual version we want to get,
> you could use HyTime location elements to locate the version information
> in each one and then use that information to determine which version to
> actually select. The way I would express this in HyTime terms would be
> to define a "version property" whose resolution is defined by the
> processor for a specific DTD or application architecture. This allows
> me to use fairly direct property queries to select among a set of
> choices, without having to worry about the details of how a version
> is encoded for a given DTD.
>
> However, I don't think this is the problem. If we agree that public
> identifiers are the method by which documents will be addressed,
> then in order to address a given document, you must know its FPI *at
> the time you create the reference to it*, in other words, at the
> time you create the entity declaration. If we further assume that, like
> ISBNs, it is necessary to create a new FPI for each different version
> or level (where version distinguishes form and level is release level)
> of a document, then there will never be any ambiguity about the version
> to which a given FPI (and therefore entity reference) refers. It may
> be the case, however, that as new versions are created, our links
> will go out of date. The problem then is how to enable references to
> objects by FPIs that are not precise but that will result in the location
> of objects that have the properties we want even if we don't know
> precisely which object that is at the time the reference is created?
>
> Because ISO 9070 does not define a syntax for defining the version of an
> object, there is no defined mechanism for creating a reference to an object
> that will allow selection among alternatives (e.g., latest version, nearest
> version, version that matches property value X, etc.) in an SGML context.
> I would very much like to see such a defined mechanism. ISO 9070 gives us
> (the SGML community) room to do it because it gives us complete freedom to
> define the structure of object descriptors (more or less). Thus we could
> define a standard for object descriptors that included a mechanism for
> specifying version information that could then be used in such a manner. I
> think we should do this.
>
> It doesn't have to be too complicated. Remembering that the
> overall syntax of an FPI (ignoring character sets for now) is:
>
> +//owner identifier//objecttype objectdescriptor//language
> -
>
> We can define the structure of "objectdescriptor" as, say:
>
> objectname, ("%", formname)?, ("%", digit*, (".", digit*)*)?
>
> Allowing the construction of FPIs of the form:
>
> +//ISBN 0-933186::IBM//DTD IBMIDDoc%1.0//EN
>
> This example refers to version 1.0 of the IBMIDDoc DTD. The
> 'formname' would be an application-defined keyword indicating
> the media or delivery form of the object, e.g., electronic, paperback,
> DynaText, WorldView, ascii, WWW, etc. The version number is a
> serial number incremented over time, so that larger version numbers
> are always newer versions.
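
[Aside: to make the proposed descriptor convention concrete, here is a
rough Python sketch of splitting such an FPI into its parts. The names
FPI and parse_fpi are my own invention, not from ISO 9070 or HyTime. The
grammar above does not say how to tell a lone "%" field apart, so the
sketch assumes a field made only of digits and dots is a version and
anything else is a form name. It also only handles FPIs that begin with
"+//" or "-//".]

```python
import re
from typing import NamedTuple, Optional


class FPI(NamedTuple):
    registered: bool            # True for "+//", False for "-//"
    owner: str
    object_type: str
    name: str
    form: Optional[str]
    version: Optional[str]
    language: str


# Assumption: a "%" field of digits and dots is a version, else a form name.
_VERSION = re.compile(r"\d+(\.\d+)*")


def parse_fpi(text: str) -> FPI:
    """Split '+//owner//type name%form%version//lang' into its fields."""
    prefix, owner, text_id, language = text.split("//")
    if prefix not in ("+", "-"):
        raise ValueError("expected FPI to start with '+//' or '-//'")
    object_type, descriptor = text_id.split(" ", 1)
    name, *fields = descriptor.split("%")
    form = version = None
    for field in fields:
        if _VERSION.fullmatch(field):
            version = field
        else:
            form = field
    return FPI(prefix == "+", owner, object_type, name, form, version, language)


fpi = parse_fpi("+//ISBN 0-933186::IBM//DTD IBMIDDoc%1.0//EN")
print(fpi.name, fpi.version)   # prints: IBMIDDoc 1.0
```
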
>
> Thus the hierarchy of identification by this convention is:
>
> 1. Object name refers to all versions and forms of a given logical
> object
>
> 2. Object name + version refers to all forms of that version of that
> object
>
> 3. Object name + version + form refers to a precise version in a
> precise form
>
> 4. Object name + form refers to all versions of a given form
>
> 5. Version numbers are fielded from left to right, with each
> dot-delimited field subdividing the fields to the left so that
> 1.1 is newer than 1.0 and 1.1.1 is newer than 1.1.0.
>
> For the purposes of version matching, 1.1 matches all instances of
> 1.1.x, 1 matches all instances of 1.x, etc.
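
[Aside: the ordering and matching rules in point 5 can be sketched in a
few lines of Python. The function names are mine. I am assuming each
dot-delimited field is compared as an integer, so 1.10 is newer than 1.9,
and that a version with extra fields (1.1.0) sorts after its prefix
(1.1); the message does not settle either point.]

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.1.0' into (1, 1, 0) so fields compare numerically."""
    return tuple(int(field) for field in text.split("."))


def is_newer(a: str, b: str) -> bool:
    """True if version a is newer than version b, field by field."""
    return parse_version(a) > parse_version(b)


def matches(pattern: str, version: str) -> bool:
    """True if pattern is a field-wise prefix of version (1.1 matches 1.1.x)."""
    wanted = parse_version(pattern)
    return parse_version(version)[:len(wanted)] == wanted


print(is_newer("1.1", "1.0"))      # prints: True
print(is_newer("1.1.1", "1.1.0"))  # prints: True
print(matches("1.1", "1.1.3"))     # prints: True
print(matches("1.1", "1.10"))      # prints: False
```
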
>
> Let us now assume that we have as part of our system an entity manager
> that given some or all of an FPI will return a list of objects whose
> FPIs match. This can be expressed in HyTime terms by defining the
> following properties: FPI, Owner, Type, ObjectName, Version, and Form
> and then using those properties in name queries against our system's set
> of available objects (which could include the entire WWW possibly) to
> get the list of objects. Thus, to locate the latest version of the
> IBMIDDoc Reference in any accessible form, I would do the query "first
> find all objects with Owner="ISBN 0-933186", type="DOCUMENT",
> objectname="IBMIDDoc Reference", then return from that set the one whose
> version property value is greatest". This could be done conveniently by
> defining a set of HyQ queries that take the property values to be
> matched as arguments, something like:
>
> <nameloc id=latest-ibmiddoc>
> <nmquery qdomain="local-document-database"
> fn ="Document_With_Latest_Version"
> args ="'ISBN 0-933186' 'IBMIDDoc Reference'">
> <!-- Args are "owner name" "object name" -->
> </nameloc>
>
> The actual definition of this query would specify "DOCUMENT" as
> the object type and do the inequality comparison on the version
> properties of the objects with owner="ISBN 0-933186" and ObjectName=
> "IBMIDDoc Reference".
>
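
[Aside: the query described above amounts to "filter on owner, type and
object name, then take the greatest version". A minimal Python sketch,
assuming the entity manager can hand us a flat list of records. The Entry
class, the sample catalog and its version numbers are all made up for
illustration; nothing here is actual HyQ or a real entity manager API.]

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    owner: str
    object_type: str
    object_name: str
    version: str
    form: str


def version_key(version: str) -> Tuple[int, ...]:
    """Compare dot-delimited versions numerically, field by field."""
    return tuple(int(field) for field in version.split("."))


def document_with_latest_version(
    catalog: Iterable[Entry], owner: str, object_name: str
) -> Optional[Entry]:
    """Return the DOCUMENT with the greatest version, or None if no match."""
    candidates = [
        e for e in catalog
        if e.owner == owner
        and e.object_type == "DOCUMENT"
        and e.object_name == object_name
    ]
    return max(candidates, key=lambda e: version_key(e.version), default=None)


# Hypothetical catalog contents, for illustration only.
CATALOG = [
    Entry("ISBN 0-933186", "DOCUMENT", "IBMIDDoc Reference", "1.0", "paperback"),
    Entry("ISBN 0-933186", "DOCUMENT", "IBMIDDoc Reference", "1.2", "ascii"),
    Entry("ISBN 0-933186", "DTD", "IBMIDDoc", "2.0", "ascii"),
]

latest = document_with_latest_version(CATALOG, "ISBN 0-933186", "IBMIDDoc Reference")
print(latest.version, latest.form)   # prints: 1.2 ascii
```
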
> We could also provide a notation for use in the entity declaration
> itself, relying on the entity manager to resolve the query for us, e.g.:
>
> <!ENTITY latest-ibmiddoc PUBLIC
> "query(UseQ(Document_With_Latest_Version
> 'ISBN 0-933186' 'IBMIDDoc Reference'))"
> >
>
> The entity "latest-ibmiddoc" would then be usable just like any
> other entity for cross-document linking (e.g., from the docorsub=
> attribute of Nameloc). I feel justified in using this query as
> the public identifier of the object because the query is against
> properties of objects' public identifiers, or properties that would
> be valid fields of public identifiers.
>
> The important thing to keep in mind, I think, is that the version and
> form of objects are properties of those objects, which means two things:
> First, HyTime has very useful ways of abstracting properties and therefore
> queries against them that allow hiding of the details of how those
> properties are encoded and resolved. Second, it means that since objects
> have these properties, objects should therefore be self describing in
> the sense that any object in the system should know what its version
> and form are. The ability to abstract the encoding and resolution of
> properties means that you don't have to standardize the encoding of
> a given property (e.g., we don't have to agree on the version and
> form markup used in our document types), as long as we agree on the need
> for a given property in general AND the definer of a given encoding
> method provides the method by which their encoding gets resolved (e.g.,
> IBMIDDoc applications provide an API by which they return the version
> of IBMIDDoc documents when asked "give me your version"). This is
> essentially the same as defining a high-level API, where in this
> case the communication protocol is HyTime queries referring to property names
> and the data returned are property values. We need only agree that there
> should exist for every document a property "version", define the HyTime
> property sets describing that property, and leave it up to the
> implementors of DTD- or architecture-specific processors to worry about the
> details of transforming "query(DOMTREE proploc(fpiprops[version]))"
> into "get content of Version within ControlInfo within DocTrackingInfo" or
> "get value of Version= attribute on document element" or "query object
> control database to get version number from FPI" or whatever.
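
[Aside: the "agree on the property, not the encoding" idea can be
sketched as a small registry of per-document-type resolvers. Everything
here is an assumption of mine for illustration: documents are modelled as
nested dicts, "AttrDoc" is an invented second document type, and
get_property stands in for whatever would actually resolve the HyTime
query.]

```python
from typing import Any, Callable, Dict

Document = Dict[str, Any]
VersionResolver = Callable[[Document], str]

# Each document type registers how its own "version" property is found.
RESOLVERS: Dict[str, VersionResolver] = {}


def register(doctype: str) -> Callable[[VersionResolver], VersionResolver]:
    def decorator(fn: VersionResolver) -> VersionResolver:
        RESOLVERS[doctype] = fn
        return fn
    return decorator


@register("IBMIDDoc")
def _ibmiddoc_version(doc: Document) -> str:
    # "content of Version within ControlInfo within DocTrackingInfo"
    return doc["DocTrackingInfo"]["ControlInfo"]["Version"]


@register("AttrDoc")  # hypothetical document type
def _attrdoc_version(doc: Document) -> str:
    # "value of Version= attribute on document element"
    return doc["attributes"]["version"]


def get_property(doc: Document, prop: str) -> str:
    """Callers ask for 'version' without knowing how it is encoded."""
    if prop != "version":
        raise KeyError(f"unsupported property: {prop}")
    return RESOLVERS[doc["doctype"]](doc)


a = {"doctype": "IBMIDDoc",
     "DocTrackingInfo": {"ControlInfo": {"Version": "1.0"}}}
b = {"doctype": "AttrDoc", "attributes": {"version": "2.3"}}
print(get_property(a, "version"), get_property(b, "version"))  # prints: 1.0 2.3
```
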
>
> --
> Eliot Kimber Internet: drmacro@vnet.ibm.com
> Dept E14/B500 IBMMAIL: USIB2DK9@IBMMAIL
> Network Programs Information Development Phone: 1-919-254-5160
> IBM Corporation
> Research Triangle Park, NC 27709
>

-- 
Dirk Herr-Hoyman                            | 
Internet Publishing Specialist              | 
Electronic Journal of Extension             | Follow your heart! 
  Project Coordinator                       | 
University of Wisconsin-Extension           | (to Florida...) 
hoymand@joe.uwex.edu                        |