On Fri 2003-03-07 at 15:52:59 -0800, Greg Stein wrote:
[...]
> Let's say that you isolated the branch creation down to two source
> revisions:
>
> (A) rev 103, picking up 1000 items
> (B) rev 107, picking up 10 items
>
> Let's also say that /trunk/some/dir is a directory with 11 items in it --
> one item from bucket (A) (named "fname0") and the other ten ("fname1"
> through "fname10") are all (B). The algorithm would perform the following
> operations:
>
> svn cp /trunk@103 /branches/BRANCH
> svn cp /trunk/some/dir/fname1@107 /branches/BRANCH/some/dir
> ...
> svn cp /trunk/some/dir/fname10@107
>
> But the ideal behavior is:
>
> svn cp /trunk@103 /branches/BRANCH
> svn cp /trunk/some/dir@107 /branches/BRANCH/some/dir
> svn cp /trunk/some/dir/fname0@103 /branches/BRANCH/some/dir
>
> Beats the crap out of me how to do *that* algorithm :-), but I wanted to
> describe the scenario so that we can document the potential occurrence.
Probably I am missing the point, or more precisely, the bigger
picture (wouldn't be the first time ;), but what is wrong with walking
the path bottom-up and building majorities along the way? Example:
  Look at the longest path, here .../some/dir, and group by revisions:
    10 from /trunk/some/dir@107
     1 from /trunk/some/dir@103
    -> choose /trunk/some/dir@107 as main source
  Drop the trailing dir, look at .../some, and group by revisions:
  (let's pretend that it only has one item)
     1 from /trunk/some@107 (the /trunk/some/dir@107 we just chose)
    -> choose /trunk/some@107
  Drop the trailing dir, look at .../, and group by revisions:
   990 from /trunk@103 (or whatever is left of the 1000)
     1 from /trunk@107 (the /trunk/some@107 we just chose)
    -> choose /trunk@103 as main source
and so on.
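
A minimal Python sketch of the bottom-up majority vote walked through
above. It is only an illustration: the function name choose_revisions,
the "path -> revision" input mapping, and the root parameter are
assumptions of mine, and ties are broken arbitrarily.

```python
from collections import Counter, defaultdict
import posixpath


def choose_revisions(file_revs, root):
    """Map each directory under root to the majority revision of its children.

    file_revs: {file path: revision the branch wants that file from}
    root:      top-level source directory, e.g. "/trunk" (hypothetical parameter)
    """
    children = defaultdict(dict)  # directory -> {child path: revision or None}
    for path, rev in file_revs.items():
        if not path.startswith(root + "/"):
            raise ValueError(f"{path} is not below {root}")
        d = posixpath.dirname(path)
        children[d][path] = rev
        # Register every ancestor directory up to root as a child of its parent.
        while d != root:
            parent = posixpath.dirname(d)
            children[parent].setdefault(d, None)
            d = parent

    chosen = {}
    # Deepest directories first, so a sub-directory's choice is known
    # before its parent counts it as one vote.
    for d in sorted(children, key=lambda p: p.count("/"), reverse=True):
        votes = Counter()
        for child, rev in children[d].items():
            votes[chosen.get(child, rev)] += 1
        chosen[d] = votes.most_common(1)[0][0]
    return chosen
```

Fed the eleven files of /trunk/some/dir from the example plus 990
further files at revision 103 directly under /trunk, this picks 107 for
/trunk/some/dir and /trunk/some, and 103 for /trunk.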
Some further comments:
- As you can see, having a lot of sub-directories from a certain
  revision makes the parent come from the same revision.
- Of course, you would only generate an "svn cp" for a sub-directory
  if a different revision had been chosen for it than for its parent,
  i.e. you don't want to do step #2 of
    1 svn cp /trunk@103 /branches/BRANCH
    2 svn cp /trunk/some@103 /branches/BRANCH/some
    3 svn cp /trunk/some/dir@107 /branches/BRANCH/some/dir
- Items that are in the trunk (for the given revision) but not in the
  branch could/should be counted negatively for the group they belong
  to. If the sums for all groups are negative, choose an empty dir as
  "main source" (by using "svn mkdir") and proceed for all sub-items
  as usual. Example:
  If there are 20 other items in each of /trunk/some/dir@107 and
  /trunk/some/dir@103, you would get:
  Look at the longest path, here .../some/dir, and group by revisions:
    -10 from /trunk/some/dir@107 (result of 10-20)
    -19 from /trunk/some/dir@103 (result of 1-20)
    -> choose empty dir as main source (via "svn mkdir")
- In other words, the algorithm effectively counts "number of
commands to be saved" (off by one).
- The algorithm presumes that the "cost" is linear in the "number of
  commands" and that the "cost" of every command (cp, mkdir, rm) is
  the same. Tweaking for different costs is easy. Tweaking for
  non-linearity isn't.
- One should consider using some threshold. If there are 20 different
  revisions from each of which we can copy 1 file, and 1 revision
  (say, C) from which we can copy 2 files, we probably don't want to
  start with C (which has the max. count) as main source, but with an
  empty dir. In other words, the main source we choose should probably
  contain some (relative?) majority of the items.
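
A rough Python sketch of three of the comments above: emitting a copy
only where a directory's revision differs from its parent's, counting
unwanted trunk items negatively, and applying a threshold. All names
(copy_commands, pick_main_source, min_share) are hypothetical, the
threshold is modelled as a share of the wanted items only as one
possible reading, and the generated lines use the notation of this
thread rather than a checked svn invocation.

```python
import posixpath


def copy_commands(chosen, src_root, dst_root):
    """List 'svn cp' lines, skipping directories already covered by a parent.

    chosen: {directory: revision}, assumed to hold every directory from
            src_root downwards (no 'svn mkdir' case handled here)
    """
    cmds = []
    for d in sorted(chosen, key=lambda p: p.count("/")):  # parents first
        parent = posixpath.dirname(d)
        if d != src_root and chosen.get(parent) == chosen[d]:
            continue  # same revision as the parent: nothing to copy again
        cmds.append(f"svn cp {d}@{chosen[d]} {dst_root}{d[len(src_root):]}")
    return cmds


def pick_main_source(wanted, extra, min_share=0.0):
    """Return the revision to copy a directory from, or None for 'svn mkdir'.

    wanted[rev]: number of items the branch takes from this dir at rev
    extra[rev]:  number of items in the trunk dir at rev that are NOT in
                 the branch (each would cost one removal)
    min_share:   assumed threshold, the fraction of all wanted items the
                 winning revision has to supply
    """
    if not wanted:
        return None
    total = sum(wanted.values())
    scores = {rev: n - extra.get(rev, 0) for rev, n in wanted.items()}
    best = max(scores, key=scores.get)
    if scores[best] < 0 or wanted[best] < min_share * total:
        return None  # start from an empty directory instead
    return best
```

With 10 items wanted from 107, 1 from 103, and 20 unwanted items in
each, the scores are -10 and -19, so pick_main_source returns None,
matching the "svn mkdir" case described above.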
Well, I hope I did not bore anyone to death. I chose to be more
explicit as I often tend to explain things in a way which others find
non-intuitive (and my technical writing skills in English don't help
with that).
Bye,
Benjamin.
Received on Sat Mar 8 02:23:04 2003