[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: Re: svnadmin deltify that evil thing

From: Bill Tutt <rassilon_at_lyra.org>
Date: 2002-05-30 01:34:41 CEST

> From: cmpilato@collab.net [mailto:cmpilato@collab.net]
>
> <<< Bringing this to the list so we can get more heads on it >>>
>
> "Bill Tutt" <rassilon@lyra.org> writes:
>
> > As I said in AIM TxnIDs don't have any required ordering. They could
be
> > GUIDs, or reused integers, or whatever. The TxnIDs could also be
> > pending, so we can't use that as a guide.
>
> There's no reason why we can't state that TxnIDs and CopyIDs have
> ordering, though. Because they do have ordering right now -- they are
> implemented as base-36 numbers ordered by the order in which they were
> created.
>
> I'll come back to this.
>
> > I think three straightforward and clean options are:
> > * Add "Deltified against NodeRevisionPK" to the DELTA skel
definition.
>
> I don't see the usefulness of this. We won't have DELTA skels until
> *after* we deltify, and we won't be deltifying until *after* we find
> out what to deltify against. And finding out what to deltify against
> is the whole issue here. :-)
>
> > * Or, make svnadmin deltify's syntax look like this:
> > svnadmin deltify deltasrc deltarev srcpath srcrev
>
> This won't work, either. The point of `svnadmin deltify' is force a
> path (perhaps recursively)
> to be stored as a delta (provided it is
> space-wise more efficient to do so), instead of as fulltext. Your new
> calling semantics break down when:
>
> - some path A in DELTAREV didn't change in SRCPATH (the most common
> case, by far).
>

Nothing wrong with that. The delta is empty, and we don't have any work
to do, oh darn.
 
> - some path A in DELTAREV doesn't exist in SRCPATH, but was copied
> elsewhere, or revived in a younger revision (this is an
> admittedly weird case)
>

This indeed is annoying, let's not bother letting them recurse. Let's
just deltify the item that they told us to deltify. You could also
require that the NodeIDs of that the (DELTAPATH, DELTAREV) and (SRCPATH,
SRCREV) are equivalent.

> > * Realize that people won't care about this very often, let them
suffer
> > and code as above.
>
> Perhaps this should read: Realize that most people won't care about
> having this functionality at all, and do away with `svnadmin
> deltify/undeltify' altogether. :-) I almost like the sound of that!
>

Works for me. Datastores should be as self-maintaining as possible.
External tuning knobs are bad.

> > Given those above options and facts, I'd pick #2 atm. I think we
have
> > much more important thigns for you to work on than svnadmin deltify.
> >
> > If we ever decided to do skip lists based on perf data for our
combiner
> > we'd need to add #1 in any event, but let's wait until that time
comes
> > for that.
>
> Now, back to granting an order to TxnIDs and CopyIds. If we can
> loosen their *definition* to match *reality*, we will gain a really
> straightforward algorithm for finding successors, the one I described
> on IM.
>
> We know the following things:
>
> - To get a successor, you must either clone an immutable node, or
> copy an immutable node, therefore
>
> - No successor of node revision NR has a TXN_ID < NR's TXN_ID, and
>

Actually this can be false for directories, because of auto-merging.
e.g.:
User A begins TXN1, which contains a modification to /foo. (adds file A)
User B begins TXN2, which contains a modification to /foo. (adds file B)
User B commits TXN2.
User A commits TXN1.

Auto-merging causes this TXNID ordering oddity.

However, this property does hold in the face of a subsequent copy.
(differing CopyIDs)

> - No successor of node revision NR has a COPY_ID < NR's COPY_ID.
>

If CopyIDs are monotonically increasing integers this is true.

> - Also, the `nodes' table is sorted first by NODE_ID, then by
> COPY_ID, and then by TXN_ID (as described in `structure').
>
> What we do not know:
>
> - The relative "age" of any two successors of a NR
>

Not without also looking at the associated Repository Revisions that's
correct.

> As it turns out, for the purposes of svn_fs_deltify, we don't need
> *the oldest successor*. We just need a successor to deltify against,
> and we don't want to look too far (though, technically speaking, we
> don't even *really* need a successor...you could deltify against
> anything you want to). I think the following algorithm will get us a
> successor for NR, which is likely (though not guaranteed) to be
> relatively "near" NR.
>
> Setup a cursor on NR's node revision id in the `nodes' table;
> Advance cursor to next row;
> THIS_NR = Current cursor location;
> If THIS_NR.NodeId != NR.NodeId:
> /* unrelated node, no more node revisions of NR */
> return FAILURE;
> If THIS_NR.CopyId == NR.CopyId:
> && THIS_NR.TxnId is not a pending transaction:
> /* same node_id, same copy_id, must be different (older!) txn_id
*/
I think you mean younger or older txn_id due to auto-merging, but yes,
it will certainly work as the source of a delta. :)

> return SUCCESS, THIS_NR;
> ELSE:
> DO:
> IF THIS_NR.TxnId > NR.TxnId:
This is implied by the data model, and the column ordering of the table.
i.e.:
having /foo with NodeID1 have the following data in the repository is
illegal:
NodeID1.C1.T1
NodeID1.C2.T1

That is: While NodeID.CopyID.TxnID is indeed a unique key for the entire
table. The number of rows that have a given NodeID.TxnID with the same
CopyID is 0.

> && THIS_NR.TxnId is not a pending transaction:
> /* same node_id, older copy_id, older txn_id */
> return SUCCESS, THIS_NR;
> Advance cursor to next row;
> THIS_NR = Current cursor location;
> WHILE (THIS_NR.NodeId == NR.NodeId)
> return FAILURE;
>

To sum up: Yes, your algorithm works, but it might incorrectly fail,
where mine never will. The major simplification of your algorithm is
that it doesn't have to deal with calculating the appropriate successor
CopyID. That doesn't seem simplification enough to bother not using the
ordering.

> However, I realize that adding ordering to those IDs is probably not a
> popular thought. :-)
>

Indeed, this is so. :)

Bill

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Jun 1 14:22:41 2002

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.