MD5 and LIFNs (was: Misc Comments)

Alexander Dupuy (dupuy@smarts.com)
Sun, 17 Apr 1994 17:23:41 +0500

Date: Sun, 17 Apr 1994 17:23:41 +0500
From: dupuy@smarts.com (Alexander Dupuy)
Message-Id: <9404172123.AA22118@brainy.smarts.com>
To: uri@bunyip.com, jcurran@nic.near.net
Subject: MD5 and LIFNs (was: Misc Comments)

> o LIFN's for "byte-stream" identification is very important. Shouldn't
> it be possible to now define an "MD5" namespace authority via an
> informational RFC which specifies how to calculate the defacto name
> of any byte-stream?

While this seems like an interesting proposal, I see two problems with it.
The MD5 namespace is non-hierarchical, so a single namespace authority would
have to administer the MD5 names for every published resource in the world;
this is unlikely to scale well.

The second problem is that while it is extremely unlikely that any two given
files will share the same MD5 digest, as the number of files grows, the
chance that some pair of files will share the same MD5 digest increases
extremely quickly. This is a variant of the "Birthday paradox", the name for
the apparently paradoxical fact that in a group of only 23 people, the chance
that two of them will have the same birthday is better than 50%. Given the
moderate probability that some
two of a few millions of files will share the same MD5 digest, it seems an
inappropriate choice for a namespace.
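
To put numbers on the birthday argument, here is a minimal sketch of the usual
approximation P = 1 - exp(-n(n-1)/(2d)) for n items drawn from d equally likely
values. The function names are hypothetical. The sketch assumes that MD5
digests behave like uniformly random 128-bit values, and it covers accidental
collisions only, not deliberately constructed ones.

```python
import math


def collision_probability(n_items: int, space_size: int) -> float:
    """Approximate chance that at least two of n_items share a value.

    Uses the birthday approximation 1 - exp(-n(n-1)/(2d)), assuming all
    space_size values are equally likely (an assumption, not a measurement).
    """
    exponent = -n_items * (n_items - 1) / (2 * space_size)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


def items_for_probability(p: float, space_size: int) -> float:
    """Approximate number of items at which the collision chance reaches p."""
    return math.sqrt(2 * space_size * math.log(1 / (1 - p)))


if __name__ == "__main__":
    # Classic birthday case: 365 possible values.
    print(collision_probability(23, 365))
    # 128-bit digest space, with a collection of a few million files.
    print(collision_probability(5_000_000, 2**128))
    # Collection size at which the approximation reaches 50%.
    print(items_for_probability(0.5, 2**128))
```

With d = 365 the approximation crosses 50% at 23 people. For a 128-bit digest
the 50% point under the same approximation falls near 2^64 items, so the
estimate for any particular collection size can be checked directly by
changing the first argument.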

Given these problems with an "MD5" namespace, I would like to state that I too
feel that some sort of LIFN namespace will be very important and useful.
However, it will have to be hierarchical so that it can scale to include all
published digital works.

@alex