Date: Thu, 21 Apr 1994 09:59:58 +0100 (BST)
From: "Jon P. Knight" <J.P.Knight@lut.ac.uk>
Subject: Re: Re urn2urc-00
To: rdaniel@acl.lanl.gov
In-Reply-To: <199404201747.LAA13061@idaknow.acl.lanl.gov>
Message-Id: <Pine.3.05.9404210955.B22786-c100000@suna>
On Wed, 20 Apr 1994 rdaniel@acl.lanl.gov wrote:
> I would like to be able to put the URNs of these annotations into
> the URC of the original source. This is so that if you find, say, an
> interesting research paper, you can easily find out about subsequent
> work in the field. These URNs take space to store, bandwidth to
> send to remote servers, and computational power to search.
Erm, this seems a bit dangerous to me if you mean that you'd like all
citations of a particular paper to be included as links in the original
paper's URC. Firstly this means that there is an ongoing maintenance
problem whereby URCs must be continually updated to have new links
added. Who'll do this? The author(s)? The publishers? Citation
services? National libraries? Secondly, imagine how big the URC for a
popular paper would get (something like Tanenbaum's Amoeba papers or
Birrell's RPC description would have a huge URC due to the number of other
papers that quote them). Surely I've misunderstood this?
> There are at least two ways around this "censorship" problem.
> One is that we invert the annotations relation so that URCs have a
> list of references rather than annotations. Finding the
> resources that annotate other resources is then a
> computationally expensive process on lots and lots of servers.
> Another approach is to have distributed URCs where no one entity
> has total control over all the content. We still have to consult
> multiple servers, but the URC fragments will presumably point
> to each other, reducing the space and complexity of the search.
>
> If URCs are distributed, obtaining URNs, URLs, and other URC
> fragments once you have a single URC fragment will not be so
> easy. I would assume that each URC fragment has the URN in it, but
> it will not have a full list of URLs and other URC fragments.
>
Ok, what about this: the URC for a document has a pointer to zero or one
``citation'' indexes. The citation index is the thing that is
updated with links to articles that cite the original document.
Citation indexes are under the control of a citation service chosen by
the owner of the URC. It's the job of the citation service to attempt to
grab all references to the original document. To make this job a bit
easier, the document URCs should have a ``references'' field that lists
the other information sources that this document cites. As these
citations are fixed by the document, the list in the URC should need
little or no updating (assuming that wildly different versions of the
same document with different references in them will probably have
different URNs and URCs).
This way we get a sort of doubly linked list in the URCs: reference
links point back to a fixed number of documents in the past, while
citation indexes point to a dynamically growing list of documents that
cite this document in the future. However, as the citation index is
kept separate from the other meta-information in the URC, we prevent the
size of very heavily cited documents' URCs growing too fast. The
document meta-information is only distributed to two places: the URC and
the citation index.
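As a rough sketch of the scheme above, the two pieces might look
something like this in Python. The class names, field names and example
URNs are all made up for illustration; none of them come from the
urn2urc draft, and this is not meant as a proposal for a wire format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class URC:
    """Hypothetical URC record: just the fields discussed above."""
    urn: str
    # URNs this document cites; fixed once the document is published.
    references: List[str] = field(default_factory=list)
    # Pointer to zero or one citation index, chosen by the URC's owner.
    citation_index: Optional[str] = None


class CitationService:
    """Hypothetical citation service holding the growing 'cited by' lists."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cited_by: Dict[str, Set[str]] = {}

    def register(self, urc: URC) -> None:
        # The owner of the URC picks this service as its citation index.
        urc.citation_index = self.name
        self.cited_by.setdefault(urc.urn, set())

    def harvest(self, urc: URC) -> None:
        # Use the fixed ``references'' list of a newer document to update
        # the index entries of the registered documents that it cites.
        for ref in urc.references:
            if ref in self.cited_by:
                self.cited_by[ref].add(urc.urn)


service = CitationService("urn:example:citation-service")
original = URC(urn="urn:example:paper-1")
service.register(original)

later = URC(urn="urn:example:paper-2", references=["urn:example:paper-1"])
service.harvest(later)
```

Note that harvesting the later paper only touches the citation service:
the original paper's URC stays the same size however often it is cited.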
Right, how did that sound? Or have I just said the same thing again with
different words?
Jon
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Jon Knight, Research Student in High Performance Networking and Distributed
Systems in the Department of _Computer_Studies_ at Loughborough University.
* Its not how big your share is, its how much you share that's important. *