Re: What makes merges slow in /trunk

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Thu, 30 May 2013 23:57:10 +0100 (BST)

Stefan Fuhrmann writes:

> Hi all,
>
> Since merge will be an important topic for 1.9, I ran a quick
> test to see how we are doing for small projects like SVN.
> Please note that my local mirror of the ASF repo is about 2
> months behind (r1457326) - in case you want to verify my data.
>
>
> Summary:
>
> Merges can be very slow and might take hours or even days
> to complete for large projects. The client-side merge strategy
> inflates the load on both sides by at least a factor of 10 in the
> chosen scenario. Without addressing this issue, it will be hard
> to significantly improve merge performance in said case but
> addressing it requires a different strategy.

Thanks for doing a real experiment and measuring it.

> Test Scenario:
>
> A catch-up merge after various cherry picks.
> $svn co svn://server/apache/subversion/branches/1.7.x .
> $svn merge svn://server/apache/subversion/trunk . --accept working
>
>
> Findings:
>
> Server vs. client performance
>
> * With all caches being cold, both operations are limited by
> server-side I/O (duh!). FSFS-f7 is 3 .. 4 times faster and
> should be able to match the client speed within the next
> two months or so.
>
> * With OS file caches hot, merge is client-bound with the
> client using about 2x as much CPU as the server.
>
> * With server caches hot, the ratio becomes 10:1.
>
> * The client is not I/O bound (working copy on RAM disk).
>
>
> How slow is it?
> Here is the fastest run for merge with hot server caches. Please
> note that SVN is a mere 2k files project. Add two zeros for
> large projects:
> real    1m16.334s
> user    0m45.212s
> sys    0m17.956s
>
>
> Difference between "real" and "user"
>
> * Split roughly evenly between client and server / network
>
> * The individual client-side functions are relatively efficient
> or else the time spent in the client code would dwarf the
> OS and server / network contributions.
>
>
> Why is it slow?
>
> * We obviously request, transfer and process too much data,
> 800MB for a 45MB user data working copy:
[...]

That's a really embarrassing statistic!

> * A profile shows that most CPU time is either directly spent
> to process those 800MB (e.g. MD5) or well distributed over
> many "reasonable" functions like running status etc with
> each contributing 10% or less.
>
> * Root cause: we run merge 169 times, i.e. merge that many
> revision ranges and request ~7800 files from the server. That
> is not necessary for most files most of the time.
>
> Necessary strategic change:
>
> * We need to do the actual merging "space major" instead of
> "revision major".
>
> * Determine tree conflicts in advance. Depending on the
> conflict resolution scheme, set the upper revision for the whole
> merge to the conflicting revision.
>
> * Determine per-node revision ranges to merge.

I agree, and I want to change our merge strategy to something roughly like this. It would give us not only faster raw performance, but also fewer conflicts and so less manual intervention, because this would not break the merge of any given file into more revision ranges than needed for that file [1].
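To make the difference concrete, here is a minimal sketch in Python, not Subversion code. It assumes a hypothetical `changed_paths` map (revision number to the set of paths changed in it) and a set `already_merged` of cherry-picked revisions; both names and the toy data are invented for illustration. Ranges are inclusive pairs of revision numbers. The first function splits the whole tree's range at every cherry pick ("revision major"); the second only splits a node's range at cherry picks that touched that node ("space major").

```python
from collections import defaultdict


def revision_major_ranges(first, last, already_merged):
    """Split first..last into maximal ranges that skip already-merged revisions."""
    ranges, start = [], None
    for rev in range(first, last + 1):
        if rev in already_merged:
            if start is not None:
                ranges.append((start, rev - 1))
                start = None
        elif start is None:
            start = rev
    if start is not None:
        ranges.append((start, last))
    return ranges


def per_node_ranges(first, last, already_merged, changed_paths):
    """For each path, build ranges broken only by merged revisions touching that path."""
    touching = defaultdict(list)
    for rev in range(first, last + 1):
        for path in changed_paths.get(rev, ()):
            touching[path].append(rev)

    result = {}
    for path, revs in touching.items():
        ranges, start, end = [], None, None
        for rev in revs:
            if rev in already_merged:
                # A cherry pick that touched this path closes the open range.
                if start is not None:
                    ranges.append((start, end))
                    start = None
            else:
                if start is None:
                    start = rev
                end = rev
        if start is not None:
            ranges.append((start, end))
        if ranges:
            result[path] = ranges
    return result


# Hypothetical history: r3 and r6 were cherry-picked earlier.
changed_paths = {
    1: {"a"}, 2: {"b"}, 3: {"a"}, 4: {"b"}, 5: {"c"},
    6: {"c"}, 7: {"a", "b"}, 8: {"c"}, 9: {"a"}, 10: {"b"},
}
already_merged = {3, 6}

tree_ranges = revision_major_ranges(1, 10, already_merged)
node_ranges = per_node_ranges(1, 10, already_merged, changed_paths)
```

In this toy history the tree-wide merge is split into three ranges, while file "b", which no cherry pick touched, gets a single range covering all of its changes.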

> * Apply ranges ordered by their upper number, lowest one first.
>
> * In case of unresolvable conflicts, merge all other nodes up to
> the revision that caused the conflict.

I am not sure about this part. I think here you're trying to approximate the current "failure" mode when conflicts are encountered: that is, leave the WC in a state where all needed changes up to some uniform revision number have been merged across the whole tree. (At least the current mode is *something* like that.)

While that is helpful for examining and understanding the state of project files, in order to resolve the conflict(s), it does break the nice idea of having each node merged in as few revision ranges as possible, and so it can introduce more conflicts. That seems to me to be a sufficiently strong indication that it's the wrong thing to do. We should try to find a better way here.

> If there have been no previous cherry pick merges, the above
> strategy should be roughly equivalent to what we do right now.

> In my test scenario, it should reduce the number of files requested
> from the server by a factor of 5 .. 10. Operations like log and
> status would need to be run only once, maybe twice. So, it
> seems reasonable to expect a 10-fold speedup on the client
> side and also a much lower server load.
>
> Fixing this scenario will drastically change the relative time
> spent for virtually all operations involved. So, before fixing it,
> it is not clear where local tuning should start.

Yup.

[1] I hope we're all familiar with the idea that, if you need to merge two successive changes (C1, C2) into a given file, then merging C1 and then C2 separately creates more likelihood of a conflict than if you merge just the single combined change (C1+C2) in one go. (I don't have a proof handy.)
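One example of the effect (an illustration, not a proof) is the case where C2 reverts C1. The sketch below uses a toy line-by-line three-way merge written for this example; `merge3` and the sample file contents are assumptions and are much simpler than Subversion's real merge algorithm.

```python
def merge3(base, theirs, mine):
    """Toy 3-way merge over equal-length line lists; returns (lines, conflicts)."""
    result, conflicts = [], 0
    for b, t, m in zip(base, theirs, mine):
        if t == b:
            # Incoming side did not change this line: keep the local version.
            result.append(m)
        elif m == b or m == t:
            # Local side is unchanged, or already has the same change.
            result.append(t)
        else:
            # Both sides changed the line differently.
            result.append(f"<<< {m} ||| {t} >>>")
            conflicts += 1
    return result, conflicts


# Source branch history: C1 changes the line, C2 changes it back.
r0 = ["x = 1"]
r1 = ["x = 2"]   # after C1
r2 = ["x = 1"]   # after C2
mine = ["x = 10"]  # the target has its own local change

# Merging C1 on its own collides with the local change.
_, conflicts_separate = merge3(r0, r1, mine)

# Merging C1+C2 as one change is a no-op for this line, so nothing collides.
merged_combined, conflicts_combined = merge3(r0, r2, mine)
```

Merging C1 by itself reports a conflict that the combined merge never sees, because the net change C1+C2 leaves the line untouched.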

- Julian
Received on 2013-05-31 00:59:10 CEST