[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: What you probably want to know to port the FS changes to BDB (on the reintegrate branch)

From: David Glasser <glasser_at_davidglasser.net>
Date: Thu, 3 Jan 2008 11:21:43 -0500

This is the main thing stopping the reintegrate branch from being
merged to trunk. I could work on this, but I'm not too familiar with
BDB. Mike, you mentioned being interested in helping with this; do
you think you'll get a chance to any time soon?

Thanks,
dave

On Dec 14, 2007 6:46 PM, David Glasser <glasser_at_davidglasser.net> wrote:
> The reintegrate branch changes how mergeinfo queries are done, by
> adding a couple of optional fields to the noderev structure and using
> them instead of the sqlite database for queries. I implemented this
> for FSFS; similar changes need to be made for BDB. Some of these
> changes were made on the sqlite-mergeinfo-without-mergeinfo branch;
> others were made directly on the reintegrate branch (which branched
> from sqlite-mergeinfo-without-mergeinfo).
>
> Most of the "write" functionality was implemented in r28134 and
> r28141.
>
> Note that a side effect of these changes is that the implementations
> of the FS APIs svn_fs_get_mergeinfo and svn_fs_get_mergeinfo_for_tree
> now raise errors if their arguments are nonexistent paths; this causes
> some tests in our suite to fail. This will need to be fixed in the
> higher-level code.
>
> WHAT THE DATA IS
> ================
>
> There are two new fields on noderevs: a boolean "has-mergeinfo" and a
> number (apr_uint64_t) "mergeinfo-count". "has-mergeinfo" means that
> the node itself has the svn:mergeinfo property set on it (possibly to
> the empty string). "mergeinfo-count" is the number of nodes,
> including the node itself, in the subtree rooted at the given node
> which has the "has-mergeinfo" flag set. Thus, we can efficiently find
> nodes with mergeinfo set by doing a tree crawl, skipping any subtree
> where "mergeinfo-count" is not positive.
>
> HOW TO MAINTAIN IT
> ==================
>
> In general, when a node-revision is being copied (whether internally
> to just clone the datastructure, or to make an FS-level copy) the two
> fields should be copied as well.
>
> "has-mergeinfo" is set by just one FS API: svn_fs_change_node_prop.
> When that API is called on "svn:mergeinfo", the flag is set to
> (new_value != NULL).
>
> "mergeinfo-count" is set by several APIs:
>
> * svn_fs_change_node_prop
> * svn_fs_copy
> * svn_fs_revision_link
> * svn_fs_delete
> * svn_fs_merge
>
> In general, whenever mergeinfo-count is incremented on a node, the
> same increment needs to be applied to all of the node's ancestors.
> (See libsvn_fs_fs/tree.c(increment_mergeinfo_up_tree).)
>
> In svn_fs_change_node_prop, mergeinfo-count is incremented by 1 if
> the has-mergeinfo flag is being changed from false to true, and
> decremented by 1 if the flag is being changed from true to false.
>
> svn_fs_copy (and its evil twin svn_fs_revision_link) can change the
> mergeinfo-count for the parent of the destination. Note that copies
> can actually be replaces! So first find the mergeinfo-count for the
> target path (or 0 if it's not a replace), and compare that to the
> copied node's mergeinfo-count; if they are different, you have to
> increment the target's parent's mergeinfo-count by the difference.
>
> svn_fs_delete should subtract the mergeinfo-count of the deleted node
> from its parent's mergeinfo-count.
>
> Finally, there's svn_fs_merge, which implements the
> pre-commit-finalization merge-up phase. The bulk of its logic is in
> the "merge" function. I added a mergeinfo_increment_out
> output-argument, used for its recursive calls. Honestly you should
> probably just grep that function for "mergeinfo", but here's a quick
> overview:
>
> * Initialize mergeinfo_increment to 0.
>
> * In the "entry is the same in target and ancestor, but changed in
> the source" block, there are two possibilities:
> * It was deleted in source:
> mergeinfo_increment -= target_entry.mergeinfo_count
> * It was changed in source:
> mergeinfo_increment += (source_entry.mergeinfo_count
> - target_entry.mergeinfo_count)
> * When making the recursive call to 'merge', pass it the address of
> a local 'sub_mergeinfo_increment', and then do
> mergeinfo_increment += sub_mergeinfo_increment
>
> * For each entry added in source, add its mergeinfo-count to
> mergeinfo_increment
>
> * At the end of the function, increment the target node's
> mergeinfo-count by mergeinfo_increment (but *don't* recurse
> upwards; the fact that we're already recursing through the
> directory structure will take care of that), and return
> mergeinfo_increment in the provided pointer.
>
> The top-level caller of merge can ignore the mergeinfo_increment
> return value (or just pass NULL).
>
> HOW TO QUERY IT
> ===============
>
> Don't use the libsvn_fs_util implementations of
> svn_fs_get_mergeinfo[_for_tree] any more. Write your own that use
> this instead. Basically, just model them on
> fs_get_mergeinfo[_for_tree] in libsvn_fs_fs/tree.c. In fact, some of
> that code could maybe even be extracted out into libsvn_fs_util (but
> not using sqlite!) if it is too similar.
>
> Ask me if you need this explained in more detail.
>
> Both svn_fs_get_mergeinfo_for_tree and the include_descendents=TRUE
> code for svn_fs_get_mergeinfo use a crawler
> "crawl_directory_dag_for_mergeinfo", which crawls a tree, only looking
> for nodes with mergeinfo, and calls an action on each found node.
>
> Note that the include_descendents parameter only exists on the
> reintegrate branch.
>
> ALSO
> ====
>
> In places where the mergeinfo count is changed (like
> svn_fs_fs__dag_increment_mergeinfo_count), I do some minor consistency
> checks, like making sure that mergeinfo counts are never negative, and
> that they are never greater than 1 on a file. We'll need to bump the
> repository version so that commits from older servers don't mess up
> the metadata (and cause these corruption checks to fire). I haven't
> implemented this yet, though.
>
> Let's say you took a repository with some scattered svn:mergeinfo
> properties but not metadata, and upgraded it to the new version.
> Could the consistency checks start failing? Actually, no. The
> consistency checks only make sure that the mergeinfo-count numbers are
> consistent with each other and with the has-mergeinfo flags. Note how
> in fs_change_node_prop the check to see if we're setting svn:mergeinfo
> for the first time is based on the former setting of ther
> has-mergeinfo, not whether or not it actually used to have an
> svn:mergeinfo property. This means that if you bump the FS version to
> the (proposed) new filesystem format, no consistency checks can fail,
> even if there are already svn:mergeinfo properties. (However, the
> queries won't find these properties until they're touched or
> something.)
>
> This happens at the FS level, so dump-and-load just magically works.
>
> Pool usage is not so great in the query code (this is mostly marked
> with TODO(reint)).
>
>
> --
> David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
>

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-01-04 03:44:50 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.