
Re: svn commit: r1813898 - in /subversion/trunk/subversion: libsvn_fs_fs/transaction.c libsvn_repos/reporter.c tests/cmdline/basic_tests.py tests/cmdline/svnadmin_tests.py

From: Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com>
Date: Thu, 23 Nov 2017 00:08:38 +0300

Stefan Sperling <stsp_at_apache.org> writes:

> However, if rep-sharing is enabled, svn_fs_props_changed() does not work
> as advertised because properties do not carry a SHA1 checksum with a
> "uniquifier" which identifies the transaction they were created in.
> The uniquifier is used by svn_fs_props_changed() to tell apart property
> representations which share content but were created in different revisions.
>
> To fix that problem, make FSFS write SHA1 checksums along with uniquifiers
> for file properties, just as it is already done for file content.
>
> A source code comment indicates that SHA1 checksums for property reps
> were not written due to concerns over disk space. In hindsight, this was
> a bad trade-off because it affected correctness of svn_fs_props_changed().

Thinking about this change, it could be that writing the additional 40-byte
SHA1 for the property representations is going to eliminate the benefit of
sharing them in the first place.

If I recall correctly, rep sharing for properties is mostly there to gain
from deduplicating small properties, such as svn:eol-style or svn:keywords.
But with the additional SHA1 written in every representation string, this
overhead is likely to take more space than the property itself.
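
To put rough numbers on that, here is a small sketch in Python. It assumes
the property list is serialized in the usual "K <len> / V <len> / END" hash
dump form and ignores any other per-representation framing, so treat the
sizes as approximate; serialize_props() is a hypothetical helper, not an
FSFS function.

```python
# Rough size comparison: a tiny property list vs. a 40-character SHA1.
# Assumption: properties are serialized as a "K/V ... END" hash dump;
# any additional on-disk framing is ignored here.

SHA1_HEX_LEN = 40  # hex-encoded SHA1 as written in a representation string


def serialize_props(props):
    """Hypothetical helper: serialize a dict of properties as a hash dump."""
    chunks = []
    for name, value in props.items():
        name_bytes = name.encode("utf-8")
        value_bytes = value.encode("utf-8")
        chunks.append(b"K %d\n%s\n" % (len(name_bytes), name_bytes))
        chunks.append(b"V %d\n%s\n" % (len(value_bytes), value_bytes))
    chunks.append(b"END\n")
    return b"".join(chunks)


if __name__ == "__main__":
    for props in ({"svn:eol-style": "native"}, {"svn:keywords": "Id"}):
        size = len(serialize_props(props))
        print("%r: %d bytes of property data, %d bytes of SHA1"
              % (props, size, SHA1_HEX_LEN))
```

Under these assumptions, both sample property lists serialize to fewer than
40 bytes, so the extra SHA1 alone is larger than the data being shared.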

An alternative approach that might be worth considering here would be:

 (1) Extend the on-disk format and allow representation strings without
     SHA1, but with the uniquifier, something like this (where "-" stands
     for "no SHA1"):

     15 0 563 7809 28ef320a82e7bd11eebdf3502d69e608 - 14-g/_5

     (The new format would be allowed starting from FSFS 8.)

 (2) Use the new format for shared property representations: write the
     uniquifier so that svn_fs_props_changed() works correctly, but omit
     the SHA1 so that the representation string for every property does
     not carry that overhead.

 (3) Disable rep sharing for properties in FSFS formats < 8, which cannot
     read and write representation strings that have a uniquifier but no
     SHA1.
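
To make the proposed string format concrete, here is a minimal parsing
sketch in Python. The field names (revision, item_index, size,
expanded_size) are my reading of the example in (1) and are assumptions,
as are the RepString type and parse_rep_string() itself; this is not the
FSFS parser.

```python
# Sketch: parse a representation string where "-" stands for "no SHA1".
# Field names and the accepted shapes (5 or 7 fields) are assumptions
# made for illustration only.
from typing import NamedTuple, Optional


class RepString(NamedTuple):
    revision: int
    item_index: int
    size: int
    expanded_size: int
    md5: str
    sha1: Optional[str]        # None when absent or written as "-"
    uniquifier: Optional[str]  # None for the short, 5-field form


def parse_rep_string(line: str) -> RepString:
    fields = line.split()
    if len(fields) not in (5, 7):
        raise ValueError("expected 5 or 7 fields, got %d" % len(fields))

    revision, item_index, size, expanded_size = (int(f) for f in fields[:4])
    md5 = fields[4]
    if len(md5) != 32:
        raise ValueError("malformed MD5 digest")

    sha1 = None
    uniquifier = None
    if len(fields) == 7:
        # "-" is the proposed placeholder meaning "no SHA1 was written".
        if fields[5] != "-":
            sha1 = fields[5]
            if len(sha1) != 40:
                raise ValueError("malformed SHA1 digest")
        uniquifier = fields[6]

    return RepString(revision, item_index, size, expanded_size,
                     md5, sha1, uniquifier)


if __name__ == "__main__":
    rep = parse_rep_string(
        "15 0 563 7809 28ef320a82e7bd11eebdf3502d69e608 - 14-g/_5")
    print(rep)
```

With the example string from (1), the sketch yields no SHA1 but keeps the
uniquifier "14-g/_5", which is the piece svn_fs_props_changed() needs.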

Barring objections and alternative suggestions, I could take a shot at
implementing this.

Regards,
Evgeny Kotkov
Received on 2017-11-22 22:09:10 CET

This is an archived mail posted to the Subversion Dev mailing list.
