
Re: 'svnadmin load' doesn't deltify enough.

From: Max Bowsher <maxb_at_ukf.net>
Date: 2004-04-16 22:37:02 CEST

John Aldridge wrote:
> In message <85zn9c9gkh.fsf@newton.ch.collab.net>, kfogel@collab.net
> writes
>> Let's start from the beginning:
>
> (snip very helpful explanation of subversion deltification)
>
>> If someone now commits to /trunk/bar/blah.txt to create r4, *then* the
>> tip of trunk and the tip of branch will both have fulltexts, because
>> starting in r3, the two blah.txt files were no longer sharing storage.
>
> :
>
>> It doesn't seem likely to me that the extra fulltexts on branch tips
>> could account for the kinds of storage size differences we're seeing
>> here, anyway. I mean, yeah, if you create a lot of branches, and make
>> commits to many different files on each branch (as opposed to many
>> commits on a few files), then yes, it could affect total storage by
>> these amounts.
>
> :
>
> And, in message <85n05c6ig0.fsf@newton.ch.collab.net>
>> How does being a former RCS repository imply that every file has a
>> commit on every branch? Shouldn't it only have a commit if the file
>> was modified since being branched?
>
>
> Let me explain the hole we've dug ourselves into here, in the hope that
> someone can suggest something...
>
> Development occurs on the RCS trunk. When we come to release time (say
> version 6.0) then, for every file in the repository, we drop a label on
> the tip of the trunk...
>
> rcs -nV60: *
>
> We set up a branch label starting at that point in case we need to issue
> any patches...
>
> rcs -nV60X:V60.60 *
>
> And we force a revision onto that branch...
>
> co -rV60 *
> ci -rV60X -m"V6.0.* development branch" -f *
>
> Before continuing normal development on the mainline.
>
> To be specific, supposing (for a particular file) version 6.0 used
> revision 1.17 of a file, we now have
>
> The revision label V60 = 1.17
> The branch label V60X = 1.17.60
> And an actual revision 1.17.60.1 essentially identical to 1.17
>
> Why do we force a revision onto the branch? Because a checkout of the
> V60X branch label will not succeed unless there's at least one revision
> there (specifically, it does not fall back to check out the branch point
> on the trunk).
>
> I believe that, although CVS uses RCS format files to store data, it has
> some smarts to avoid creating the branch for a file until it is actually
> needed. Using RCS "raw" makes this a difficult strategy to manage.

Entirely correct about CVS. IIRC, the CVS term for this is "magic
branches": the branch symbol is recorded using the notation x.y.0.z
instead of x.y.z, so no revision needs to exist on the branch until one
is actually committed there.
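
To illustrate the notation, here is a minimal Python sketch of how a magic
branch number maps to the branch number that real revisions would use. The
function names are made up for this example and are not taken from CVS or
cvs2svn; it assumes the usual convention that the 0 sits in the
second-to-last position.

```python
def is_magic_branch(symbol_rev: str) -> bool:
    """Return True if a symbol's revision string looks like x.y.0.z."""
    parts = symbol_rev.split(".")
    # A magic branch number has an even number of components (at least 4)
    # with a literal 0 in the second-to-last position.
    return len(parts) >= 4 and len(parts) % 2 == 0 and parts[-2] == "0"


def real_branch_number(symbol_rev: str) -> str:
    """Map a magic branch number (x.y.0.z) to the real one (x.y.z).

    Anything that is not a magic branch number is returned unchanged.
    """
    parts = symbol_rev.split(".")
    if is_magic_branch(symbol_rev):
        del parts[-2]  # drop the placeholder 0
    return ".".join(parts)
```

With John's numbers, a CVS-style symbol 1.17.0.60 would correspond to the
branch 1.17.60, whose first revision would be 1.17.60.1.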

> The net result is that pretty much every file in our repository has
> about 5 branches (one for each release), and that these branches /all/
> contain at least one actual revision which is identical to the trunk
> revision at which the branch is rooted. The vast majority of files
> contain just this one revision on each branch.
>
>
> The RCS strategy of storing backwards differences down the trunk, but
> then forwards differences up branches makes this a reasonably efficient
> strategy. Unfortunately, it seems to be a use-case which is not well
> supported by subversion.
>
> As I understand Karl's explanation, though, there seems to be nothing in
> the subversion data structure which "knows" that deltas go backwards
> from the tip. Is there anything I (or the cvs2svn authors, for that
> matter) can do to cause branch deltas to be built forwards from the
> branch point?

Not without editing and recompiling libsvn_fs or libsvn_repos.

> I also still don't understand the purpose of the "svnadmin deltify"
> command. When would I want/need to use this?

AFAIK, it is a useless leftover from a time when deltification was not
automatic.

> I think our fallback strategy is to remove the branches from the RCS
> files before we import them into subversion, and settle for keeping the
> original RCS data around in case we need to do any detailed research
> about anything outside the trunk. I'd rather not do this if it can be
> avoided, though.

From what I can see from your repository sample, the simplest thing to do
would be to modify cvs2svn to drop revisions whose log message matches
"V.* development branch". For added safety, also verify that the deltatext
is exactly 3 lines:

d1 1
a1 1
<something involving a keyword change>
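
As a rough sketch of that check in Python (the language cvs2svn is written
in): the function and constant names below are hypothetical and are not part
of cvs2svn, and where such a test would hook into the conversion is left
open. It assumes the log message and the raw RCS deltatext of a revision are
available as strings, and that the pattern should match from the start of
the log message.

```python
import re

# Pattern from the suggestion above; anchoring it at the start of the
# message is an assumption.
FORCED_BRANCH_LOG = re.compile(r"V.* development branch")


def is_forced_branch_revision(log_message: str, deltatext: str) -> bool:
    """Guess whether a revision is one of the forced 'ci -f' branch commits.

    Both conditions must hold:
      * the log message matches "V.* development branch", and
      * the deltatext is exactly three lines: "d1 1", "a1 1", and one
        replacement line (expected to be the keyword change).
    """
    if not FORCED_BRANCH_LOG.match(log_message.strip()):
        return False
    lines = deltatext.splitlines()
    return len(lines) == 3 and lines[0] == "d1 1" and lines[1] == "a1 1"
```

The third line is deliberately not inspected here; a stricter version could
also require it to contain an RCS keyword such as $Id$ or $Revision$.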

Max.

Received on Fri Apr 16 22:37:35 2004
