Re: Input needed in solving issue 2897.

From: Karl Fogel <kfogel_at_red-bean.com>
Date: 2007-11-01 23:33:52 CET

"Kamesh Jayachandran" <kamesh@collab.net> writes:
>>So my question is, immediately after rZ, does the svn:mergeinfo on /fb
>>accurately reflect exactly what has been merged in from elsewhere? If
>>not, is there a good reason why not?
>
> Good question! Mergeinfo on '/fb' is accurate only with respect to
> '/fb'. When someone wants to know what has been merged from '/trunk'
> to '/fb', he would get the correct answer say rX:Y. One should use
> this mergerange(rX:Y) to avoid repeated merge only if the merge is
> from '/trunk' not '/fb'. To avoid repeat merge from '/fb' one should
> use rZ.
>
> Mergeinfo === 'merge source' + 'merge range'. Doing math only on
> the 'merge range' (ignoring the merge source) is like comparing
> apples and oranges.

Yes, agreed -- I completely see what you mean.

(Slowly, slowly I'm beginning to understand our merge tracking
implementation.)

>>> In this particular case the r12 merge would appear 3 times; which record
>>> should we consider to decide when the merge has occurred?
>
>>Why does r12 appear 3 times? It was only merged once.
>
> That is the way mergeinfo is stored in sqlite database. Whenever any
> node in a commit has mergeinfo it is stored as multiple merge
> range+path records. That is full text of 'svn:mergeinfo' is divided
> in to multiple merges and stored as individual record.
>
> Let us say you have '/trunk:12' on '/fb', It puts one record
> ('/trunk', '/fb', 11, 12, 1)
>
> Now next commit has '/trunk:12,18' on '/fb', It puts two records
> ('/trunk', '/fb', 11, 12, 1) ('/trunk', '/fb', 17, 18, 1) one for
> each merge range+path record.
>
> Now next commit has '/trunk:12-13,18' on '/fb', It puts two records
> ('/trunk', '/fb', 11, 13, 1) ('/trunk', '/fb', 17, 18, 1) one for
> each merge range+path record.

Thanks.
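
For anyone following along, here is a minimal Python sketch of the splitting described above: one svn:mergeinfo line becomes one record per merge range. The function name is made up for illustration, and it is not the code in mergeinfo-sqlite-index.c. It assumes the start column is exclusive (so a single revision 12 is stored as 11, 12), which is what the example records suggest.

```python
def mergeinfo_to_records(mergeinfo, mergedto):
    """Split one 'source:rangelist' mergeinfo line into
    (mergedfrom, mergedto, start, end, inheritable) tuples.

    Hypothetical helper for illustration only.
    """
    # rpartition keeps any ':' inside the source path intact.
    mergedfrom, _, rangelist = mergeinfo.rpartition(":")
    records = []
    for item in rangelist.split(","):
        item = item.strip()
        # A trailing '*' marks a non-inheritable range.
        inheritable = 0 if item.endswith("*") else 1
        item = item.rstrip("*")
        first, _, last = item.partition("-")
        last = last or first  # single revision: '12' means 12-12
        # Assumption: the start column is exclusive, so r12 becomes (11, 12).
        records.append((mergedfrom, mergedto, int(first) - 1, int(last), inheritable))
    return records


print(mergeinfo_to_records("/trunk:12-13,18", "/fb"))
```

Each commit that carries mergeinfo on '/fb' would store the full list returned here, which is why r12 shows up once per commit.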

I'm looking at the table definitions in mergeinfo-sqlite-index.c and
trying to map the above to what I see there. Let's take your first commit:

> Let us say you have '/trunk:12' on '/fb', It puts one record
> ('/trunk', '/fb', 11, 12, 1)

That produces a row like this in mergeinfo_changed:

   revision == 50   /* revision in which r12 was merged from trunk to fb */
   path == '/fb'

And a row like this in 'mergeinfo':

   revision == 50
   mergedfrom == '/trunk'
   mergedto == '/fb'
   mergedrevstart == 11
   mergedrevend == 12
   inheritable == 1

Is that right so far?

Now when we go to the next commit...

> Now next commit has '/trunk:12,18' on '/fb', It puts two records
> ('/trunk', '/fb', 11, 12, 1) ('/trunk', '/fb', 17, 18, 1) one for
> each merge range+path record.

That produces a row like this in mergeinfo_changed:

   revision == 56   /* revision in which r18 was merged from trunk to fb */
   path == '/fb'

And a row like this in 'mergeinfo':

   revision == 56
   mergedfrom == '/trunk'
   mergedto == '/fb'
   mergedrevstart == 17
   mergedrevend == 18
   inheritable == 1

Now, my question is, do we also put in a new row for the r12
mergeinfo? That is, do we add a new row saying

(56, '/trunk', '/fb', 11, 12, 1)

even though we already have a row from before that says

(50, '/trunk', '/fb', 11, 12, 1)

Okay, moving on to the third commit:

> Now next commit has '/trunk:12-13,18' on '/fb', It puts two records
> ('/trunk', '/fb', 11, 13, 1) ('/trunk', '/fb', 17, 18, 1) one for
> each merge range+path record.

(Let's say that happens in r60.) A similar question applies: when we
add the record...

(60, '/trunk', '/fb', 11, 13, 1)

...do we also *remove* any records like this:

(..., '/trunk', '/fb', 11, 12, 1)

since they are sort of subsets of the new record? Or do we let the
old records remain? I realize that doing such removals might be
computationally intensive, so I'm really asking two questions: do we
remove such subset records, and if not, do we at least *wish* we could
remove them?

>>> Take the least revision(commit revision)?, No What if we reverted r12 from
>>> '/trunk' to '/fb' on r51, Remember we don't record reverse merges and
>>> merge back the same on r52.
>
>>I don't understand what the phrase "reverted r12 from '/trunk' to
>>'/fb' on r51" means. Do you mean remove r12 from /trunk and put it
>>into /fb, all in one commit? (But it is already in /fb...)
>
> Sorry for being unclear, I meant 'reverse merge r12 on '/fb' from
> '/trunk' and commit in r51, So this would remove 'r12' from /fb's
> mergeinfo. So sqlite record won't have 'r12' anymore. Now again you
> merge r12 to '/fb' from '/trunk', Now sqlite will have r12.

Thanks, now I understand.

>>Could you show the precise structure of the proposed new table?
>
> Current schema of 'mergeinfo_changed'
> -------------------------------------
> CREATE TABLE mergeinfo_changed (revision INTEGER NOT NULL,
> path TEXT NOT NULL);
>
> revision = subsequent commit to merge.
> path = merge target.
>
> My proposed schema
> ------------------------------
> CREATE TABLE mergeinfo_changed (revision INTEGER NOT NULL,
> mergedfrom TEXT NOT NULL, mergedto TEXT NOT NULL, mergedrevstart
> INTEGER NOT NULL, mergedrevend INTEGER NOT NULL, inheritable INTEGER
> NOT NULL);
>
> *exactly same* as that of the only other table in our sqlite db 'mergeinfo'.
>
> I think column names are more obvious here.

*nod* Okay, understood.

> The difference is in the way we are going to store the records in
> 'mergeinfo_changed'.
>
> The 'mergeinfo' table will, as usual, have the records for all the
> merges on a node for every single merge+commit, irrespective of what
> exactly has been merged in this commit. (Remember the 3 records for
> r12 in my original mail.)

Thanks, that helps clarify things for me.

Let me test my understanding. The way to find out (using the current
code) what exactly got merged in, say, r60 is to look in
mergeinfo_changed and select on rev==60. Suppose we get these
results:

   (60, '/some/path')
   (60, '/some/other/path')
   (60, '/yet/another/path')

Okay, let's start with '/some/path'. We search backwards in that
table for the previous change to '/some/path'. Say we find this:

(56, '/some/path')

Great. Now go to the mergeinfo table, and select every row where

(rev == 60) && (mergedto == '/some/path')

Then get all the mergeinfo for

(rev == 56) && (mergedto == '/some/path')

Compute the "difference" between the two mergeinfo sets, and that is
what was merged to '/some/path' in r60. Repeat for '/some/other/path'
and '/yet/another/path', of course.

Is my understanding even close? This is all guessing, based on the
table schemas and on thinking about the problem. I might be
completely wrong, please don't hesitate to correct me :-). If I am
wrong, then I don't understand the way those two tables are used
together in today's code.
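
To make the procedure above concrete, here is a minimal sketch using Python's sqlite3 module with an in-memory database. It is only an illustration of my guess, not the actual lookup code. The mergeinfo_changed schema is the one quoted earlier, the column types of 'mergeinfo' are assumed from the column names, the function names are hypothetical, and the rows assume that every commit re-records the full mergeinfo of '/fb' (r50, r56, r60 as in the examples).

```python
import sqlite3


def build_index():
    """Create an in-memory index populated with the r50/r56/r60 example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mergeinfo_changed (revision INTEGER NOT NULL,
                                        path TEXT NOT NULL);
        -- Column types below are an assumption based on the column names.
        CREATE TABLE mergeinfo (revision INTEGER NOT NULL,
                                mergedfrom TEXT NOT NULL,
                                mergedto TEXT NOT NULL,
                                mergedrevstart INTEGER NOT NULL,
                                mergedrevend INTEGER NOT NULL,
                                inheritable INTEGER NOT NULL);
    """)
    conn.executemany("INSERT INTO mergeinfo_changed VALUES (?, ?)",
                     [(50, "/fb"), (56, "/fb"), (60, "/fb")])
    # Assumption: each commit stores the full mergeinfo, not just the delta.
    conn.executemany("INSERT INTO mergeinfo VALUES (?, ?, ?, ?, ?, ?)", [
        (50, "/trunk", "/fb", 11, 12, 1),
        (56, "/trunk", "/fb", 11, 12, 1),
        (56, "/trunk", "/fb", 17, 18, 1),
        (60, "/trunk", "/fb", 11, 13, 1),
        (60, "/trunk", "/fb", 17, 18, 1),
    ])
    return conn


def merged_revs(conn, rev, path):
    """Return {mergedfrom: set of revisions} recorded for path at rev."""
    result = {}
    rows = conn.execute(
        "SELECT mergedfrom, mergedrevstart, mergedrevend FROM mergeinfo "
        "WHERE revision = ? AND mergedto = ?", (rev, path))
    for source, start, end in rows:
        # Start is treated as exclusive, end as inclusive.
        result.setdefault(source, set()).update(range(start + 1, end + 1))
    return result


def merged_in(conn, rev, path):
    """Revisions newly present in path's mergeinfo at rev, per source."""
    # Search backwards for the previous mergeinfo change on this path.
    prev = conn.execute(
        "SELECT MAX(revision) FROM mergeinfo_changed "
        "WHERE path = ? AND revision < ?", (path, rev)).fetchone()[0]
    before = merged_revs(conn, prev, path) if prev is not None else {}
    diff = {}
    for source, revs in merged_revs(conn, rev, path).items():
        added = revs - before.get(source, set())
        if added:
            diff[source] = added
    return diff


conn = build_index()
print(merged_in(conn, 60, "/fb"))
```

Note that this only reports additions. A reverse merge that removes revisions from the mergeinfo would need the opposite difference as well.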

> The 'mergeinfo_changed' table will only record the merge revision
> ranges pertaining to a given commit; that way we can precisely identify
> the 'commit_revs' (a.k.a. reflective merge revision rZ in my original
> mail) for a merge.

This sounds sane to me, but I think I need to hear your response to
the above before I can be sure.
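
As a rough illustration of how I read the proposal, here is a small Python/sqlite3 sketch. The CREATE TABLE statement is the proposed schema quoted above. Everything else is an assumption on my part: the rows hold only the ranges each commit added (so r60 records just r13), and commit_revs is a hypothetical helper name borrowed from the term used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mergeinfo_changed (revision INTEGER NOT NULL,
        mergedfrom TEXT NOT NULL, mergedto TEXT NOT NULL,
        mergedrevstart INTEGER NOT NULL, mergedrevend INTEGER NOT NULL,
        inheritable INTEGER NOT NULL)
""")

# Assumption: one row per range that the commit itself merged, nothing else.
conn.executemany("INSERT INTO mergeinfo_changed VALUES (?, ?, ?, ?, ?, ?)", [
    (50, "/trunk", "/fb", 11, 12, 1),  # r50 merged r12
    (56, "/trunk", "/fb", 17, 18, 1),  # r56 merged r18
    (60, "/trunk", "/fb", 12, 13, 1),  # r60 merged r13 only
])


def commit_revs(conn, mergedfrom, mergedto, merged_rev):
    """Commits whose recorded delta covers merged_rev (start exclusive)."""
    rows = conn.execute(
        "SELECT revision FROM mergeinfo_changed "
        "WHERE mergedfrom = ? AND mergedto = ? "
        "AND mergedrevstart < ? AND ? <= mergedrevend "
        "ORDER BY revision",
        (mergedfrom, mergedto, merged_rev, merged_rev))
    return [row[0] for row in rows]


print(commit_revs(conn, "/trunk", "/fb", 12))
```

With delta-only rows, a single query answers "which commit merged r12 from /trunk to /fb", with no need to diff two full mergeinfo sets.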

Thanks for your patience while I learn this code, Kamesh.

-Karl

Received on Thu Nov 1 23:34:03 2007
