On Fri, Jan 23, 2009 at 4:17 PM, C. Michael Pilato <cmpilato_at_collab.net> wrote:
> Gunter Woytowitz wrote:
>> Your suspicion is correct, there was a "broken" history, here's a snippet
>> from the logs where the copy occurred.
>> ---------------------------------------------------- r49990 | asdf |
>> 2006-10-12 08:38:59 -0400 (Thu, 12 Oct 2006) | 1 line Changed paths: A
>> /branches/branch1 (from /branches/branch1:49987)
>> The branch was accidentally deleted, and then restored a few revs later.
>> The svnmerge-integrated property before running the script was
> Yes, svnmerge.py is pretty much history-ignorant. (An interesting feature
> for a merge-tracking tool, when you really think about it.) That ignorance
> is exactly the kind of thing that the migration script is trying to undo.
> This comment in the script describes that:
> # Unfortunately, svnmerge.py tends to initialize using oft-bogus
> # revision ranges like 1-SOMETHING when the merge source didn't
> # even exist in r1. So if the natural history of a branch
> # begins in some revision other than r1, there's still going to
> # be cruft revisions left in NEW_MERGEINFO after subtracting the
> # natural history. So, we also examine the natural history of
> # the merge sources, and use that as a filter for the explicit
> # mergeinfo we've calculated so far.
> Unfortunately, there are other comments in the script about how situations
> like the one you have are likely to fool the script.
> ### If by some chance it is the case that /path:RANGE1 and
> ### /path:RANGE2 a) represent different lines of history, and
> ### b) were combined into /path:RANGE1+RANGE2 (due to the
> ### ranges being contiguous), we'll foul this up. But the
> ### chances are preeeeeeeetty slim.
> We might be able to teach the script to handle this better by taking a
> different approach. Rather than checking each individual range in the
> proposed mergeinfo against the natural history of the branch individually,
> we could fetch the entire natural history of the branch at once (bounded by
> the max and min merges, maybe) and take the union of that output with the
> proposed mergeinfo. Not sure why I didn't take that approach before ...
> possible concerns about performance.
These are all valid points. However, I still can't see how the
migration scripts can produce *overlapping* ranges in the mergeinfo;
unordered ranges, sure, I can replicate that pre-r35358, but
overlapping? I'm stumped how this could occur today or with any
version of the scripts going back to r35357. I originally raised the
question about gaps in the history of '/branches/branch1' because this
could lead to reverse-ordered mergeinfo (pre-r35358) and I thought
that that might have somehow led to the overlapping mergeinfo, but I
got nowhere with this.
As I mentioned elsewhere in this thread, in r35466 I made a change to
svn_mergeinfo_parse() to solve Gunter's immediate problem. But I
would still like to figure out how the script produced the overlapping
mergeinfo in the first place...So, Gunter, would it be at all possible
to get a dump of your repository so I can run the scripts myself and
see what is wrong? I completely understand if this is out of the
question, but I'm running low on other ideas at the moment :-(
>> My general thought is that there should be no way for the migrate script
>> to commit an invalid svn:mergeinfo property to the repository. I expect
>> there to be bugs in 3rd party scripts, especially with a new complex
>> feature, but I also expect the subversion code should be a bit more
>> robust and protect itself from this by blocking any commits that contain
>> invalid svn:mergeinfo properties. It seems like there is a lot of error
>> checking of the svn:mergeinfo property when "reading" the data (ie. the
>> errors I'm seeing), but not when the property was "written" to the
>> repository. Are there svn:mergeinfo property checks in the subversion
>> code on commits? If yes, how did the migrate script bypass them? If no,
>> then the checks should be added.
> The migration script you ran makes directory property changes to the
> repository's filesystem layer. It is *absolutely incorrect* for that layer
> to be validating property value formats. The migration script needs,
> therefore, to perform that validation itself, perhaps by simply
> parsing-and-unparsing the values it proposes to write to the filesystem
> (which would cause the mergeinfo to pass through the utility functions that
> perform format-checking).
If there are no objections I'll commit the attached patch that does just that:
Prevent svnmerge-migrate-history.py from committing non-canonical mergeinfo.
See issue #3302.
(Migrator.convert_path_history): Run any new svn:mergeinfo through
svn_mergeinfo_parse() before committing it, ensuring that it is in
> You would not run into this problem if you used, instead, the remote version
> of this script, which has to pass its mergeinfo through the Subversion
> client layer (and all that sanity-checking logic).
> C. Michael Pilato <cmpilato_at_collab.net>
> CollabNet <> www.collab.net <> Distributed Development On Demand
Received on 2009-01-26 20:20:49 CET