"Brian W. Fitzpatrick" <fitz@collab.net> writes:
> Variation A: The Hash/ASCII Experiment
>
> Summary: Force each path component to some 7-bit ASCII representation
> and store locks in the filesystem under a lowercase "ASCII-ized" name.
> Use a hash for names that can't be efficiently "ASCII-ized".
>
> Details:
>
> The path to a lock is broken into its components, and each path
> component is translated into its lowercase ASCII equivalent (using a
> 4-character hex code for non-ASCII characters). If the resulting
> "ASCII-ized" string is both a) longer than 16 characters and b) more
> than 1.5 times the length of the original string, we fall back on
> the 16-byte MD5 hash as described in the main proposal.

I like Variation A, despite its slightly greater complexity.

Note that since we have a collision-resolution mechanism anyway, we
have some flexibility in our hash function. We needn't encode all
non-ASCII as 4 character hex codes; we could encode them in some more
friendly way. Specifically, for certain common non-ASCII characters
(e.g., most of those representable in ISO-8859-*), we could use an
encoding that reads more intuitively to a human:

   foo/bör/baz.c  ==>  foo/bor/baz.c

or

   foo/bör/baz.c  ==>  foo/boer/baz.c

This would make the lock tree more debuggable in common cases.
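
To make the idea concrete, here is a rough Python sketch of Variation A with an optional "friendly" table layered on top. Everything named here (FRIENDLY, asciiize, lock_name, lock_path) is hypothetical and not part of the proposal. The sketch also assumes that the "4 character hex code" is the Unicode code point in hex, and that the MD5 fallback is stored as a hex digest; the main proposal may specify either differently.

```python
import hashlib

# Hypothetical table of human-readable replacements (an assumption for
# illustration). Mapping "\u00f6" to "o" instead would give the "bor" variant.
FRIENDLY = {"\u00f6": "oe", "\u00e4": "ae", "\u00fc": "ue", "\u00df": "ss"}


def asciiize(component, friendly=None):
    """Lowercase a path component and replace each non-ASCII character."""
    out = []
    for ch in component.lower():
        if ord(ch) < 128:
            out.append(ch)                 # plain ASCII passes through
        elif friendly and ch in friendly:
            out.append(friendly[ch])       # readable replacement, if known
        else:
            out.append("%04x" % ord(ch))   # assumed form of the hex code
    return "".join(out)


def lock_name(component, friendly=None):
    """Return the on-disk name for one path component."""
    encoded = asciiize(component, friendly)
    # Fall back to MD5 only when the encoded name is both longer than
    # 16 characters and more than 1.5 times the original length.
    if len(encoded) > 16 and len(encoded) > 1.5 * len(component):
        return hashlib.md5(component.encode("utf-8")).hexdigest()
    return encoded


def lock_path(path, friendly=None):
    """Encode every component of a '/'-separated lock path."""
    return "/".join(lock_name(c, friendly) for c in path.split("/"))
```

With the table, lock_path("foo/b\u00f6r/baz.c", FRIENDLY) yields "foo/boer/baz.c"; without it, the same path yields "foo/b00f6r/baz.c". Neither mapping is reversible or collision-free, which is acceptable only because the collision-resolution mechanism is there anyway.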
-Karl
Received on Tue Dec 21 00:48:22 2004