
Re: inconsistency between mergeinfo records

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Wed, 24 Jun 2015 12:38:34 +0200

On Wed, Jun 24, 2015 at 10:21 AM, Stefan Hett <stefan_at_egosoft.com> wrote:

> Hi Stefan^2,
>> Hi Stefan,
>> If you have a working build environment for Subversion,
>> you might have a look at this branch:
>> https://svn.apache.org/repos/asf/subversion/branches/svn-mergeinfo-normalizer
>> It provides a new tool that you might find useful:
>> ./tools/client-side/svn-mergeinfo-normalizer/svn-mergeinfo-normalizer
>> which allows you to analyse and reduce the mergeinfo in a working copy.
>> It also tells you which mergeinfo cannot be elided and _why_.
>> svn-mergeinfo-normalizer analyse /path/to/working/copy
>> svn-mergeinfo-normalizer normalize /path/to/working/copy
>> svn-mergeinfo-normalizer analyse /path/to/working/copy
>> svn-mergeinfo-normalizer clear-obsoletes /path/to/working/copy
>> svn-mergeinfo-normalizer analyse /path/to/working/copy
>> svn-mergeinfo-normalizer combine-ranges /path/to/working/copy
>> svn-mergeinfo-normalizer analyse /path/to/working/copy
>> CAVEAT: This tool has not been reviewed and thoroughly tested.
>> You should only commit changes that you have verified to be correct.
>> Please let us know what your results were.
> [...]
> I gave the normalizer a first run and using it I was able to reduce our
> record of mergeinfos quite significantly without too much manual work. So
> altogether I'd consider this a really useful tool.
> I sent you the logs per PM so you can take a look at all the details
> yourself (some of the information contained in the logs I do not want to
> make publicly available on the mailing list, and the logs are quite large
> (several MB)).

Those logs are really helpful. I've already found a number of
things that should be improved:

* Branches should always be sorted by name.
* 'analyse' should also check whether obsolete branches can be elided.
  The 'normalize' step already does this.
* 'analyse' should have a summary of all obsolete branches,
  including their last change rev/timestamp and deletion rev/timestamp.
* There should be a new sub-command to remove a list of branches
  from the mergeinfo. That gives users better control over this quite
  destructive operation (throwing away mergeinfo as opposed to just
  normalizing it).

> Note that running normalize at the beginning could not remove any
> redundant mergeinfos at all, since there were always some revisions
> missing. However, running the commands in the following order:
> 1. clear-obsoletes
> 2. combine-ranges
> 3. normalize
> was then able to remove a significant amount of redundant mergeinfos
> (eliminating mergeinfos on over 100 files).

I guess that was due to 'clear-obsoletes' removing lots of old merges
from old branches where sub-tree mergeinfo could indeed not be elided.
Once the extended 'analyse' output is available as described above,
you should be able to verify that hypothesis. That said, removing
obsolete branches first is certainly a very efficient workflow.
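
For reference, the order reported above can be scripted. The following is
only a minimal sketch: the tool path is the one from the branch build and
may differ on your machine, and the helper names (build_commands,
run_workflow) are made up for illustration. It defaults to a dry run that
just prints the commands, since the tool is still unreviewed.

```python
import subprocess

# Assumption: location of the tool inside a build of the branch; adjust as needed.
TOOL = "./tools/client-side/svn-mergeinfo-normalizer/svn-mergeinfo-normalizer"

# Order reported in this thread: remove obsolete branches first,
# then combine ranges, then normalize.
STEPS = ["clear-obsoletes", "combine-ranges", "normalize"]


def build_commands(wc_path, tool=TOOL):
    """Return the command lines to run, with an 'analyse' pass around each step."""
    commands = [[tool, "analyse", wc_path]]
    for step in STEPS:
        commands.append([tool, step, wc_path])
        commands.append([tool, "analyse", wc_path])
    return commands


def run_workflow(wc_path, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(wc_path):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence as soon as one step fails.
            subprocess.run(cmd, check=True)
```

Calling run_workflow("/path/to/working/copy") prints the seven commands
without touching the working copy; pass dry_run=False to actually run
them, and review the resulting property changes before committing.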

In your case, 'combine-ranges' has been surprisingly effective in reducing
the size of the mergeinfo representation (number of revision ranges is
down by >60%). As expected, it did not change the semantic contents.
IOW, it has no effect on which sub-tree mergeinfo can be elided.
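
To illustrate how a range list can shrink without changing its meaning,
here is a small sketch. It assumes that two ranges may be joined when no
revision in the gap between them changed the merge source; that rule and
the function name combine_ranges are assumptions for illustration, not a
description of the tool's actual code.

```python
def combine_ranges(ranges, source_changes):
    """Merge revision ranges whose gaps contain no change to the merge source.

    ranges: sorted, non-overlapping list of inclusive (start, end) tuples.
    source_changes: set of revisions in which the merge source changed
                    (hypothetical input, e.g. gathered from 'svn log').
    """
    combined = []
    for start, end in ranges:
        if combined:
            prev_start, prev_end = combined[-1]
            gap = range(prev_end + 1, start)
            # The gap can be absorbed if nothing in it touched the source,
            # because merging those revisions would have been a no-op anyway.
            if not any(rev in source_changes for rev in gap):
                combined[-1] = (prev_start, end)
                continue
        combined.append((start, end))
    return combined


# r13-14 never touched the source, so r10-12 and r15-20 collapse into r10-20;
# r22 did touch it, so r25-30 stays separate.
print(combine_ranges([(10, 12), (15, 20), (25, 30)], {11, 16, 22, 27}))
# → [(10, 20), (25, 30)]
```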

> There were still some remaining records. These I managed to get rid of by
> manually merging the missing ranges.

There are also a few genuine sub-tree merges that look like a sync
with some vendor branch. Some might be replaced by svn:externals -
which comes with a few restrictions as well as benefits.

> One remaining case which couldn't be normalized automatically was on
> "/src/version_generator".
> Revision 190854 was recorded on root and on src but not on
> src/version_generator.
> 190854 was actually the creation of a branch (XRebirth/branches/XR_ogl).

Could you send me the output of 'svn log -v --xml -r190854' ?
That will tell me in detail what got changed.

The tool basically has the same information as that log output
and could certainly try hard to detect this "edge case".

> So I guess (in theory) the normalizer could have handled that missing range
> as an irrelevant revision for eliding the remaining branches. Wouldn't it
> be possible/useful to handle that by the tool automatically?

My speculation is that r190854 also touched node properties or
something similar, and SVN then decided not to include that creation
in the sub-tree mergeinfo. It could also be, or have been, a quirk in
the client code.
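
One way to check the property speculation against the 'svn log -v --xml'
output is sketched below. It assumes the client writes a prop-mods
attribute on each <path> element (newer clients do, as far as I know);
if the attribute is absent, the function simply returns nothing. The
function name paths_with_prop_mods is made up for illustration.

```python
import xml.etree.ElementTree as ET


def paths_with_prop_mods(log_xml):
    """List (revision, action, path) for changed paths flagged prop-mods="true".

    log_xml: the output of 'svn log -v --xml -rN', as str or bytes.
    """
    root = ET.fromstring(log_xml)
    result = []
    for entry in root.iter("logentry"):
        for path in entry.iter("path"):
            # Assumption: the attribute is present; older output may omit it.
            if path.get("prop-mods") == "true":
                result.append((entry.get("revision"), path.get("action"), path.text))
    return result
```

Feeding it the saved log output for the revision in question would show
whether any of the changed paths carried a property change.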

> Regarding your other questions I'll get back to you later.

Thank you very much!

-- Stefan^2.
Received on 2015-06-24 12:38:47 CEST

This is an archived mail posted to the Subversion Dev mailing list.