On Mon, Jul 30, 2012 at 6:47 PM, Philip Martin
<philip.martin_at_wandisco.com> wrote:
> When writing an FSFS revision file some parts are written in hash order
> and so with a recent APR the order is not predictable. Thus loading the
> same dumpfile into two separate empty repositories produces different
> revision files. The things that vary are the order of the nodes in the
> file and the order of the change lines at the end of the file.
>
> As with the dumpfile format, we have never formally specified the order
> of a revision file, so the random order is not strictly a bug, but it
> might be useful if the order were at least repeatable.
>
> Things get more complicated in 1.8 as a side-effect of directory
> deltification. Directory deltification works best if the directory
> order is stable and so some hashes now use a non-APR hash function to
> produce a stable order. Whether or not the revision file is repeatable
> depends on which hashes are used. The change lines hash is still
> unstable but the hash returned by svn_fs_fs__rep_contents_dir can be
> stable or unstable depending on whether or not the hash was found in the
> cache or created on demand. It makes me even more uneasy that the amount
> of variation in the revision file depends on our caching
> strategy.
>
> I'm considering changing the commit code so that hashes are written in a
> stable order and the revision files are repeatable. Does anyone think
> this would be a bad idea?
>
+1 but I haven't done an in-depth review of the patch.
Reproducible revision file content is nice. The runtime overhead
should be dwarfed by the other computations (delta, checksum).
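
To illustrate the idea of writing hash contents in a stable order, here is a
minimal sketch in Python rather than the actual FSFS commit code. The
function name serialize_changes and the "path kind" line format are
hypothetical and are not the real revision file format. It only shows that
sorting the keys before writing makes the output independent of the order
in which the entries were stored.

```python
import hashlib


def serialize_changes(changes):
    """Serialize path -> change-kind entries in sorted path order.

    Hypothetical line format ("<path> <kind>\n"), not the FSFS format.
    """
    lines = []
    # Iterate over sorted keys instead of the container's own order,
    # so the output does not depend on how the entries were stored.
    for path in sorted(changes):
        lines.append(f"{path} {changes[path]}\n")
    return "".join(lines).encode("utf-8")


# The same entries, inserted in two different orders.
a = {"/trunk/b.c": "modify", "/trunk/a.c": "add", "/branches/x": "delete"}
b = {"/branches/x": "delete", "/trunk/a.c": "add", "/trunk/b.c": "modify"}

digest_a = hashlib.sha1(serialize_changes(a)).hexdigest()
digest_b = hashlib.sha1(serialize_changes(b)).hexdigest()
print(digest_a == digest_b)  # prints True
```

The extra cost is one sort of the keys per hash written, which is likely
small next to the delta and checksum work.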
-- Stefan^2.
Received on 2012-07-31 10:25:56 CEST