
Re: [subversion-dev] Subversion design document up

From: Bill Frantz <frantz_at_communities.com>
Date: 2000-06-08 18:38:47 CEST

At 08:31 PM 6/7/00 -0400, Jonathan S. Shapiro wrote:
>> I like the idea of the SHA-1 hashes for a binary repository very much. I'm
>> not sure that it applies as neatly to a "human readable" repository, where
>> easily identifiable file names/objects would make debugging a hell of a lot
>> easier.
>
>You are right. In building a replicatable repository, it is necessary to
>construct an object name space that does not have collisions.
>Human-generated names don't satisfy this requirement. For subversion, this
>is not an immediate objective, but I wonder if it wouldn't be useful to
>ponder briefly how replication would eventually be implemented in order to
>ensure that the repository name space isn't a problem.
>
>You may be emphasizing the debugging problem unduly. Given the structure of
>the DCMS repository, it is very easy to write a script that will tell you,
>for each reachable object in the repository, the branch, version, and file
>name of that object. That is, a repository browser just isn't that hard to
>build, and in light of this the advantages of universally unique names
>seemed compelling.
>
>Independent of the merits of SHA hashes, it's useful to have the object name
>space and the workspace name space be different. There is no inherent reason
>to believe that the most convenient storage of objects in the repository
>naturally follows from the history of the development. Also, a level of
>indirection at this place in the architecture appears necessary to do a good
>job on rename.

We are playing with a similar system. Two points:

* We are currently using symbolic links to link the human-readable name to
the hashed file. A one-to-many linking structure would be better.

* MacOS (before 9) only supports 31-character file names. We encode the
binary names at 5 bits per character, which results in 32 characters for
SHA-1 (160 bits). Truncating one character might not be a problem, but we
chose to use MD5 instead (128 bits, so 26 characters), leaving room for a
reasonable file extension. (We will need extensions for Windows in our
application.)
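
The arithmetic in the two points above can be checked with a short sketch.
It is only an illustration under stated assumptions, not the scheme
described in this mail: the RFC 4648 base32 alphabet stands in for the
unspecified 5-bit code, and the helper names (b32_name, hashed_name, store)
and the single-directory layout are hypothetical.

```python
import base64
import hashlib
import os
import tempfile

MAX_NAME = 31  # file-name limit cited above for MacOS before 9


def b32_name(digest: bytes) -> str:
    """Encode a digest at 5 bits per character, dropping '=' padding.

    Assumption: RFC 4648 base32; the mail does not say which 5-bit
    alphabet was actually used.
    """
    return base64.b32encode(digest).decode("ascii").rstrip("=")


def hashed_name(data: bytes, ext: str = "", algo: str = "md5") -> str:
    """Return a content-derived file name with an optional extension."""
    return b32_name(hashlib.new(algo, data).digest()) + ext


def store(repo: str, human_name: str, data: bytes, ext: str = "") -> str:
    """Write data under its hashed name and symlink human_name to it.

    Hypothetical layout: object and link share one directory. The call
    fails if human_name already exists.
    """
    target = hashed_name(data, ext)
    path = os.path.join(repo, target)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(data)
    # Relative link, so the directory can be moved as a whole.
    os.symlink(target, os.path.join(repo, human_name))
    return target


if __name__ == "__main__":
    data = b"hello, repository\n"
    # 160 bits / 5 = 32 characters: one over the limit.
    print("sha1:", len(hashed_name(data, algo="sha1")))
    # 128 bits / 5 rounds up to 26, leaving 5 characters for an extension.
    print("md5: ", len(hashed_name(data, algo="md5")))
    assert len(hashed_name(data, ".html")) <= MAX_NAME

    with tempfile.TemporaryDirectory() as repo:
        target = store(repo, "README", data, ".txt")
        print("README ->", os.readlink(os.path.join(repo, "README")))
```

A symlink maps each readable name to exactly one hashed file, so this sketch
does not address the one-to-many linking structure wished for above.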

Cheers - Bill
Received on Sat Oct 21 14:36:05 2006

This is an archived mail posted to the Subversion Dev mailing list.