
Re: How to improve search performance for moved directories and files?

From: Stefan Sperling <stsp_at_elego.de>
Date: Tue, 25 Feb 2020 13:24:00 +0100

On Tue, Feb 25, 2020 at 09:09:14AM +0100, Thorsten Schöning wrote:
> Hello Daniel Shahaf,
> on Monday, 24 February 2020 at 18:27 you wrote:
>
> > If the remote repository uses https://, you could set up mod_dav_svn on
> > localhost in a proxy configuration. For svn:// the equivalent would be
> > to set up an svnsync mirror and do «svn relocate»'s to and from it by
> > hand.[...]
>
> Thanks for the suggestions, but I can't expect my coworkers to do
> that. Some of them would rather discuss whether to keep using SVN at
> all, in favour of ... we all know what. ;-)
>
> I regularly fetch the SVN repos from the remote host with rsync
> anyway. So while in theory not as correct as using svnsync, I can
> simply do a 2-URL merge using unrelated file:// URLs of my local
> backups of the repos. That at least saves me from the relocate. The
> only thing I'm missing this way is merge tracking and merge recording.
> At least the latter can be done after merging, by using the remote
> target again and telling the SVN client to record the merge only. That
> is fast enough, as no conflicts are triggered at all.
>
> Two additional questions:
>
> 1. Why does the number of revisions seem to matter that much?
>
> Resolving this kind of merge conflict seems to become slower and slower
> as the number of revisions increases, even if all of those commits
> belong to totally unrelated branches. Additionally, the commits that
> moved the directories and trigger the conflicts are not that far in
> the past, only a few hundred commits back.
>
> Something like the following: 100 auto-commits in branchA, a very few
> commits moving directories in branchB, then 100 auto-commits in branchA
> again. I would have expected the SVN client to focus on branchB and
> find the possible move targets in that branch pretty early.
>
> 2. Is there really no other handbrake on somewhere?
>
> When doing the merge locally, I see very high CPU usage but very
> little I/O, constantly something around 40 Kbit/s. That doesn't
> matter locally, especially on an SSD of course, but I guess it does
> remotely because of the additional latency.
>
> So, is that simply how things work? Do lots of small reads in those
> cases introduce lots of latency, slowing things down heavily? And
> can't that be easily optimized further, e.g. by some setting of the
> SVN client?

The primary goal of the conflict resolver is not to be fast.

Consider the situation we had before the conflict resolver existed:
Each and every conflict had to be analyzed and resolved by a human, and
it was very easy to make mistakes. This cost literally hours and hours
of human time everywhere SVN was deployed.

The time a human needs to resolve conflicts by hand is what the design of
the conflict resolver was up against.
The goal was to reduce these hours spent on resolving tree conflicts over
and over to a couple of minutes. The resolver tries to be accurate in its
detection of conflicts, to provide sufficient flexibility when resolving
conflicts, and is also designed to be extensible (if there is a conflict
case that is not covered yet but should be, all that needs to be done is
to add about three functions written in C).

Another constraint is that the resolver should be able to work against old
SVN servers, since clients are updated to new releases more frequently than
already deployed servers. This means the resolver needs to do round-trips.
As it discovers information it keeps going back to the server until it has a
complete picture of the conflict situation. The server has no idea what the
client is really asking it for.
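
To illustrate why such requests cannot simply be batched, here is a toy
sketch in Python. It is not Subversion's actual protocol, code, or data
model: the history table, the paths, and the function names are all
hypothetical. It only shows that when each answer determines the next
question, the client has to pay one round-trip per step.

```python
# Toy model only: NOT Subversion's real protocol or data structures.
# Hypothetical move history: each path maps to where it was moved next.
HYPOTHETICAL_HISTORY = {
    "trunk/old_dir": "trunk/renamed_dir",
    "trunk/renamed_dir": "trunk/final_dir",
    "trunk/final_dir": None,  # no further move recorded
}


def ask_server(path):
    """Stand-in for one client->server request: where did `path` move to?"""
    return HYPOTHETICAL_HISTORY.get(path)


def trace_moves(start):
    """Follow a chain of moves, counting one round-trip per question asked."""
    chain = [start]
    round_trips = 0
    current = start
    while True:
        round_trips += 1  # every question costs a full round-trip
        next_path = ask_server(current)
        if next_path is None:
            return chain, round_trips
        chain.append(next_path)
        current = next_path  # the next question depends on this answer


chain, round_trips = trace_moves("trunk/old_dir")
print(" -> ".join(chain), f"({round_trips} round-trips)")
```

The server in this sketch only answers narrow questions, so the client
cannot know the second question before it has the first answer.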

If you're unhappy with the result, I would suggest you become involved in
improving the implementation yourself. There should be room for improvement,
especially if the server were made smarter.

A setup that tunnels over a high-latency link is naturally very hard to
speed up with a design that is heavy on client<->server round-trips.
For best performance you really want your SVN server on the LAN.
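
A back-of-the-envelope sketch in Python of why latency dominates here. The
number of round-trips and both latency figures are made-up assumptions for
illustration, not measurements of any SVN client or server; the model
simply assumes the requests are strictly sequential.

```python
def estimated_wall_time(round_trips, rtt_seconds, per_request_seconds=0.0):
    """Rough wall-clock estimate for strictly sequential request/response pairs."""
    return round_trips * (rtt_seconds + per_request_seconds)


# All figures below are hypothetical, chosen only to show the scaling.
ROUND_TRIPS = 5_000   # assumed number of sequential requests
LAN_RTT = 0.0005      # assumed 0.5 ms round-trip time on a LAN
TUNNEL_RTT = 0.05     # assumed 50 ms round-trip time through a tunnel

lan_seconds = estimated_wall_time(ROUND_TRIPS, LAN_RTT)
tunnel_seconds = estimated_wall_time(ROUND_TRIPS, TUNNEL_RTT)

print(f"LAN:    {lan_seconds:.1f} s")
print(f"Tunnel: {tunnel_seconds:.1f} s")
print(f"Slowdown factor: {tunnel_seconds / lan_seconds:.0f}x")
```

In this model the amount of data transferred does not appear at all, which
fits the observation of low I/O throughput: the waiting time per request
is what adds up, not the bandwidth.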
Received on 2020-02-25 13:24:14 CET

This is an archived mail posted to the Subversion Users mailing list.