Re: Issue with order of svn:mergeinfo revision ranges vs. svnmerge-migrate-history.py..

From: Paul Burba <ptburba_at_gmail.com>
Date: Tue, 27 Jan 2009 17:48:48 -0500

On Mon, Jan 26, 2009 at 2:19 PM, Paul Burba <ptburba_at_gmail.com> wrote:
> On Fri, Jan 23, 2009 at 4:17 PM, C. Michael Pilato <cmpilato_at_collab.net> wrote:
>> Gunter Woytowitz wrote:
>>> Paul,
>>>
>>> Your suspicion is correct, there was a "broken" history, here's a snippet
>>> from the logs where the copy occurred.
>>>
>>> ---------------------------------------------------- r49990 | asdf |
>>> 2006-10-12 08:38:59 -0400 (Thu, 12 Oct 2006) | 1 line Changed paths: A
>>> /branches/branch1 (from /branches/branch1:49987)
>>> ----------------------------------------------------
>>>
>>> The branch was accidentally deleted, and then restored a few revs later.
>>>
>>> The svnmerge-integrated property before running the script was
>>> /branches/branch1:1-53669
>>
>> Yes, svnmerge.py is pretty much history-ignorant. (An interesting feature
>> for a merge-tracking tool, when you really think about it.) That ignorance
>> is exactly the kind of thing that the migration script is trying to undo.
>> This comment in the script describes that:
>>
>> # Unfortunately, svnmerge.py tends to initialize using oft-bogus
>> # revision ranges like 1-SOMETHING when the merge source didn't
>> # even exist in r1. So if the natural history of a branch
>> # begins in some revision other than r1, there's still going to
>> # be cruft revisions left in NEW_MERGEINFO after subtracting the
>> # natural history. So, we also examine the natural history of
>> # the merge sources, and use that as a filter for the explicit
>> # mergeinfo we've calculated so far.
>>
>>
>> Unfortunately, there are other comments in the script about how situations
>> like the one you have are likely to fool the script.
>>
>> ### If by some chance it is the case that /path:RANGE1 and
>> ### /path:RANGE2 a) represent different lines of history, and
>> ### b) were combined into /path:RANGE1+RANGE2 (due to the
>> ### ranges being contiguous), we'll foul this up. But the
>> ### chances are preeeeeeeetty slim.
>>
>> We might be able to teach the script to handle this better by taking a
>> different approach. Rather than checking each individual range in the
>> proposed mergeinfo against the natural history of the branch individually,
>> we could fetch the entire natural history of the branch at once (bounded by
>> the max and min merges, maybe) and take the union of that output with the
>> proposed mergeinfo. Not sure why I didn't take that approach before ...
>> possible concerns about performance.
>
> These are all valid points. However, I still can't see how the
> migration scripts can produce *overlapping* ranges in the mergeinfo;
> unordered ranges, sure, I can replicate that pre-r35358, but
> overlapping? I'm stumped how this could occur today or with any
> version of the scripts going back to r35357. I originally raised the
> question about gaps in the history of '/branches/branch1' because this
> could lead to reverse-ordered mergeinfo (pre-r35358) and I thought
> that that might have somehow led to the overlapping mergeinfo, but I
> got nowhere with this.
>
> As I mentioned elsewhere in this thread, in r35466 I made a change to
> svn_mergeinfo_parse() to solve Gunter's immediate problem. But I
> would still like to figure out how the script produced the overlapping
> mergeinfo in the first place...So, Gunter, would it be at all possible
> to get a dump of your repository so I can run the scripts myself and
> see what is wrong? I completely understand if this is out of the
> question, but I'm running low on other ideas at the moment :-(
>
>>> My general thought is that there should be no way for the migrate script
>>> to commit an invalid svn:mergeinfo property to the repository. I expect
>>> there to be bugs in 3rd party scripts, especially with a new complex
>>> feature, but I also expect the subversion code should be a bit more
>>> robust and protect itself from this by blocking any commits that contain
>>> invalid svn:mergeinfo properties. It seems like there is a lot of error
>>> checking of the svn:mergeinfo property when "reading" the data (ie. the
>>> errors I'm seeing), but not when the property was "written" to the
>>> repository. Are there svn:mergeinfo property checks in the subversion
>>> code on commits? If yes, how did the migrate script bypass them? If no,
>>> then the checks should be added.
>>
>> The migration script you ran makes directory property changes to the
>> repository's filesystem layer. It is *absolutely incorrect* for that layer
>> to be validating property value formats. The migration script needs,
>> therefore, to perform that validation itself, perhaps by simply
>> parsing-and-unparsing the values it proposes to write to the filesystem
>> (which would cause the mergeinfo to pass through the utility functions that
>> perform format-checking).
>
> If there are no objections I'll commit the attached patch that does just that:
>
> [[[
> Prevent svnmerge-migrate-history.py from committing non-canonical mergeinfo.
>
> See issue #3302.
>
> * contrib/client-side/svnmerge/svnmerge-migrate-history.py
> (Migrator.convert_path_history): Run any new svn:mergeinfo through
> svn_mergeinfo_parse() before committing it, ensuring that it is in
> canonical form.
> ]]]

Done r35516.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1060625
Received on 2009-01-27 23:52:38 CET

This message: [ Message body ]
Next message: Hyrum K. Wright: "[WC-NG] Writing svn_wc_entry_t to sqlite"
Previous message: Warren Jones: "Re: Can't commit multiple externals with depth infinity"
In reply to: Paul Burba: "Re: Issue with order of svn:mergeinfo revision ranges vs. svnmerge-migrate-history.py.."
Next in thread: Paul Burba: "Re: Issue with order of svn:mergeinfo revision ranges vs. svnmerge-migrate-history.py.."

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]