Re: inconsistency between mergeinfo records

From: Stefan Hett <stefan_at_egosoft.com>
Date: Thu, 25 Jun 2015 15:29:56 +0200

Hi,

as promised, answering the remaining questions now:
> [...]
>
> If you have any time requirements/considerations on your side
> which would require/benefit from earlier feedback, pls let me know.
>
>
> Right now, we are all working towards the 1.9 RC. Feedback
> in May or June would be nice.
>
> The key question that I like to see answered is "Does the
> tool do something useful?" For instance, it might become
> ineffective in complex setups, we might need to add detection
> of "mismatched" branches etc. We might also end up with
> mergeinfo that is technically smaller but neither faster to
> process nor easier to understand.
Overall I think this is a really great tool and is really valuable to
administrators who have been running larger instances over a longer
period of time.

Initially the analysis log is rather bloated: my first run produced a
2MB log file. After reducing the number of mergeinfo records (using
normalization and dropping mergeinfos from obsolete branches) the output
is quite reasonable. Some kind of documentation explaining what the
different output statements mean and what the admin/user could do about
them would be helpful, though, I think.

Also it'd be good to add a more automated "one-step" command to simplify
the usage even further. A user/admin could then simply start the tool
(for instance svn-mergeinfo-normalizer clean-up-mergeinfo [path]
-drop-obsolete-branches), which would be more or less equivalent to
running the tool several times in the following sequence:
svn-mergeinfo-normalizer.exe clear-obsoletes [path]
svn-mergeinfo-normalizer.exe normalize [path]
svn-mergeinfo-normalizer.exe combine-ranges [path]
svn-mergeinfo-normalizer.exe analyse [path] -stats

(where I'd envision that the -stats param for the analyse command would
print a summary of how many remaining mergeinfos could not be normalized
(if any) and point the user to the full analysis step for more detailed
output).
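
Until such a one-step command exists, the sequence above could be wrapped in a small script. The following is only a sketch under some assumptions: the executable name and the subcommand names are copied from this mail (without the .exe suffix) and should be checked against the tool's own help output, the function names are made up for illustration, and the proposed -stats option is left out because it is only a suggestion here.

```python
import subprocess
from typing import Callable, List, Optional

# Subcommand names exactly as written in this mail (assumption: they match
# the installed tool). The proposed "-stats" option is intentionally omitted.
STEPS = ["clear-obsoletes", "normalize", "combine-ranges", "analyse"]


def build_commands(path: str,
                   tool: str = "svn-mergeinfo-normalizer") -> List[List[str]]:
    """Return one argument list per step, in the order given in the mail."""
    return [[tool, step, path] for step in STEPS]


def clean_up_mergeinfo(path: str,
                       run: Optional[Callable[[List[str]], object]] = None,
                       tool: str = "svn-mergeinfo-normalizer") -> None:
    """Run all steps in sequence on the given path.

    `run` is a hypothetical hook so the sequence can be tried without the
    tool installed. The default runner raises CalledProcessError on a
    non-zero exit status, which stops the remaining steps.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd, check=True)
    for cmd in build_commands(path, tool):
        run(cmd)
```

Calling clean_up_mergeinfo("path/to/wc") would then execute the four steps one after another; passing run=print instead only shows the commands that would be executed.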

In the long term I hope that the functionality provided by this tool
becomes obsolete and that the issues for which you have to use it are
dealt with directly in the SVN core, so they would not surface at all
anymore (aka: no need to normalize mergeinfos manually).

> So, there are the things that I'd love to get some feedback on:
>
> * Does the tool work at all (no crashes, nothing obviously stupid)?
I experienced no crashes and the output was quite clear to me (after
getting past the initially quite bloated analysis output).
> * Is the result of each reduction stage correct (as far as one can tell)?
I already pointed out a few cases in my other replies. I will start a
new thread for the further cases I think I found.
> * Is the tool feedback intelligible? How could that be improved?
As suggested above, some means of getting more statistical output,
especially for the initial run, might be helpful. The header information
is already a good start atm, but cleaning up the output a bit further
so that it produces some kind of statistics log would be more useful for
the first run.

For instance, atm the analysis output reports the actual nonexistent
branches for each path the tool checks. In my case that's around 100
branches for each of the 400 paths... -> over 40,000 lines of branch
info. More useful would be a list at the top with the branches that are
obsolete (it's implicit that all subdirectories of a branch are
obsolete if the parent path is nonexistent).
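
To illustrate the idea of such a condensed list: the reported paths could be reduced to the topmost missing paths, dropping everything below them. This is a rough sketch only and not how the tool works internally; the function name and the example paths are invented for illustration.

```python
from typing import Iterable, List


def collapse_to_top_level(paths: Iterable[str]) -> List[str]:
    """Drop every path that lies below another path in the input.

    Sorting puts a parent directly before its descendants, so one pass
    that compares each path against the already kept parents is enough.
    """
    kept: List[str] = []
    for path in sorted(set(p.rstrip("/") for p in paths)):
        # The trailing "/" makes sure "/branches/old-2" is not treated
        # as a child of "/branches/old".
        if not any(path.startswith(parent + "/") for parent in kept):
            kept.append(path)
    return kept


# Hypothetical paths, not taken from any real repository.
obsolete = collapse_to_top_level([
    "/branches/old/src",
    "/branches/old",
    "/branches/old-2/doc",
    "/branches/old/src/lib",
])
```

With these example paths only /branches/old and /branches/old-2/doc remain, since the two entries below /branches/old are implied by their parent.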

With the added reporting of obsolete branches this is even worse now.

The other thing might be to add some stat output to normalize /
combine-ranges / clear-obsoletes to report how many mergeinfo entries
could be normalized, or how many obsolete paths were removed.
Since the commands can take a few minutes to run, some kind of "progress
output" might also be useful, so the user knows the process did not
deadlock or run into an endless loop.
> * How effective is each stage / mergeinfo reduction command?
> * How often does it completely elide sub-tree mergeinfo?
> * What typical scenarios prevented sub-tree mergeinfo elision?
I guess this was already answered by sending you the log files.
> Up to here, you don't need to commit anything. If you are
> convinced that the tool works correctly, you may commit
> the results into some toy copy of your repository. Then the
> following would be interesting:
>
> * Are merges based on the reduced mergeinfo faster?
> * Do merges based on the reduced mergeinfo use less memory?
> * Any anomalies?
>
I didn't spot any anomalies so far. With regard to performance and
memory consumption I can't provide any numbers. One common use case
which is now significantly faster, though, is merging changes from one
branch to the other, since a merge now only touches a few nodes with
mergeinfos, while before it had to commit changes to up to 400 nodes...
So for us this is a really significant improvement.

Regards,
Stefan
Received on 2015-06-25 15:30:08 CEST

This is an archived mail posted to the Subversion Dev mailing list.