[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r1224836 - in /subversion/trunk: notes/knobs subversion/libsvn_fs_fs/fs_fs.c

From: Daniel Shahaf <danielsh_at_elego.de>
Date: Thu, 29 Mar 2012 21:13:52 +0200

stefan2_at_apache.org wrote on Mon, Dec 26, 2011 at 23:37:25 -0000:
> Author: stefan2
> Date: Mon Dec 26 23:37:25 2011
> New Revision: 1224836
>
> URL: http://svn.apache.org/viewvc?rev=1224836&view=rev
> Log:
> Tune FSFS deltification stratey: Use linear deltification on the very top of the
> deltification history and skip-delta only for larger distances. Most of the runtime
> overhead is being counterbalanced by reduced I/O and masked by our membuffer
> caching.
> OTOH, we examine skip delta ranges linearly during commit. Therefore, runtime
> becomes an issue when committing nodes with very deep change histories.
> This patch simply limits the deltification history to some reasonable length.
>
> * subversion/libsvn_fs_fs/fs_fs.c
> (SVN_FS_FS_MAX_LINEAR_DELTIFICATION,
> SVN_FS_FS_MAX_DELTIFICATION_WALK): new tuning parameters
> (choose_delta_base): implement the new deltification strategy
> * notes/knobs
> document the new defines
>
> Modified:
> subversion/trunk/notes/knobs
> subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c
>
> Modified: subversion/trunk/notes/knobs
> URL: http://svn.apache.org/viewvc/subversion/trunk/notes/knobs?rev=1224836&r1=1224835&r2=1224836&view=diff
> ==============================================================================
> --- subversion/trunk/notes/knobs (original)
> +++ subversion/trunk/notes/knobs Mon Dec 26 23:37:25 2011
> @@ -33,6 +33,8 @@ DEFAULT_HTTP_LIBRARY
> MAX_SECS_TO_LINGER
> SUFFIX_LINES_TO_KEEP
> SVN_FS_FS_DEFAULT_MAX_FILES_PER_DIR
> +SVN_FS_FS_MAX_LINEAR_DELTIFICATION
> +SVN_FS_FS_MAX_DELTIFICATION_WALK
> SVN_UNALIGNED_ACCESS_IS_OK
>
> 2.2 Features
> @@ -125,7 +127,25 @@ SVN_I_LIKE_LATENCY_SO_IGNORE_HTTPV2
> Range: natural integers
> Suggested: 1, 2, 3, 4, 5, 7, 11
>
> -3.6 SVN_UNALIGNED_ACCESS_IS_OK
> +3.6 SVN_FS_FS_MAX_LINEAR_DELTIFICATION
> +
> + Scope: libsvn_fs_fs
> + Purpose: max length + 1 of the linear deltification history
> + before skip-deltification kicks in
> + Default: 16
> + Range: natural integers
> + Suggested: 2, 4, 8, 16, 32, 64
> +
> +3.7 SVN_FS_FS_MAX_DELTIFICATION_WALK
> +
> + Scope: libsvn_fs_fs
> + Purpose: max skip deltification range. Change histories
> + longer than that will be restarted with a fulltext.
> + Default: 1023
> + Range: natural integers
> + Suggested: 1, 2, 3, 4, 5, 7, 11
> +

The suggested values (and Range) for both of these are inconsistent with
the comments at the definitions (which want a power of two, or a power
of two minus one).

> +3.8 SVN_UNALIGNED_ACCESS_IS_OK
>
> Scope: (global)
> Purpose: enable data accesss optimizations.
>
> Modified: subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c
> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c?rev=1224836&r1=1224835&r2=1224836&view=diff
> ==============================================================================
> --- subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c (original)
> +++ subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c Mon Dec 26 23:37:25 2011
> @@ -78,6 +78,20 @@
> #define SVN_FS_FS_DEFAULT_MAX_FILES_PER_DIR 1000
> #endif
>
> +/* Begin deltification after a node history exceeded this this limit.
> + Useful values are 4 to 64 with 16 being a good compromise between
> + computational overhead and pository size savings.
> + Should be a power of 2.
> + Values < 2 will result in standard skip-delta behavior. */
> +#define SVN_FS_FS_MAX_LINEAR_DELTIFICATION 16
> +
> +/* Finding a deltification base takes operations proportional to the
> + number of changes being skipped. To prevent exploding runtime
> + during commits, limit the deltification range to this value.
> + Should be a power of 2 minus one.
> + Values < 1 disable deltification. */
> +#define SVN_FS_FS_MAX_DELTIFICATION_WALK 1023
> +
> /* Following are defines that specify the textual elements of the
> native filesystem directories and revision files. */
>
> @@ -5186,6 +5200,7 @@ choose_delta_base(representation_t **rep
> apr_pool_t *pool)
> {
> int count;
> + int walk;
> node_revision_t *base;
>
> /* If we have no predecessors, then use the empty stream as a
> @@ -5203,6 +5218,23 @@ choose_delta_base(representation_t **rep
> count = noderev->predecessor_count;
> count = count & (count - 1);
>
> + /* We use skip delta for limiting the number of delta operations
> + along very long node histories. Close to HEAD however, we create
> + a linear history to minimize delta size. */
> + walk = noderev->predecessor_count - count;
> + if (walk < SVN_FS_FS_MAX_LINEAR_DELTIFICATION)
> + count = noderev->predecessor_count - 1;
> +

So, before this change the number of delta steps of the Nth revision was
equal the number of 1-bits in the binary representation of
predecessor_count, with this change it's equal to (predecessor_count
& 15) + (the number of 1-bits in the binary representation of
(predecessor_count >> 4)), right?

In other words, this change doesn't affect the high level structure of
the deltas tree (except it makes each leaf a 15-deltas chain), but it
trades disk space savings for more delta combinations in historical cats
(eg, revision 16n+8 will need X+8 deltas, instead of X+1.)

Are there scenarios in which this change is significant? What about
people who store large binaries that change frequently in svn, should we
be recommending them to set SVN_FS_FS_MAX_LINEAR_DELTIFICATION to
a small number? Should we be making these fsfs.conf settings, instead
of compile-time macros?

> + /* Finding the delta base over a very long distance can become extremely
> + expensive for very deep histories, possibly causing client timeouts etc.
> + OTOH, this is a rare operation and its gains are minimal. Lets simply
> + start deltification anew close every other 1000 changes or so. */
> + if (walk > SVN_FS_FS_MAX_DELTIFICATION_WALK)
> + {
> + *rep = NULL;
> + return SVN_NO_ERROR;
> + }
> +
> /* Walk back a number of predecessors equal to the difference
> between count and the original predecessor count. (For example,
> if noderev has ten predecessors and we want the eighth file rev,
>
>
Received on 2012-03-29 21:14:48 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.