[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

What you probably want to know to port the FS changes to BDB (on the reintegrate branch)

From: David Glasser <glasser_at_davidglasser.net>
Date: 2007-12-15 00:46:28 CET

The reintegrate branch changes how mergeinfo queries are done, by
adding a couple of optional fields to the noderev structure and using
them instead of the sqlite database for queries. I implemented this
for FSFS; similar changes need to be made for BDB. Some of these
changes were made on the sqlite-mergeinfo-without-mergeinfo branch;
others were made directly on the reintegrate branch (which branched
from sqlite-mergeinfo-without-mergeinfo).

Most of the "write" functionality was implemented in r28134 and
r28141.

Note that a side effect of these changes is that the implementations
of the FS APIs svn_fs_get_mergeinfo and svn_fs_get_mergeinfo_for_tree
now raise errors if their arguments are nonexistent paths; this causes
some tests in our suite to fail. This will need to be fixed in the
higher-level code.

WHAT THE DATA IS
================

There are two new fields on noderevs: a boolean "has-mergeinfo" and a
number (apr_uint64_t) "mergeinfo-count". "has-mergeinfo" means that
the node itself has the svn:mergeinfo property set on it (possibly to
the empty string). "mergeinfo-count" is the number of nodes,
including the node itself, in the subtree rooted at the given node
which has the "has-mergeinfo" flag set. Thus, we can efficiently find
nodes with mergeinfo set by doing a tree crawl, skipping any subtree
where "mergeinfo-count" is not positive.

HOW TO MAINTAIN IT
==================

In general, when a node-revision is being copied (whether internally
to just clone the datastructure, or to make an FS-level copy) the two
fields should be copied as well.

"has-mergeinfo" is set by just one FS API: svn_fs_change_node_prop.
When that API is called on "svn:mergeinfo", the flag is set to
(new_value != NULL).

"mergeinfo-count" is set by several APIs:

  * svn_fs_change_node_prop
  * svn_fs_copy
  * svn_fs_revision_link
  * svn_fs_delete
  * svn_fs_merge

In general, whenever mergeinfo-count is incremented on a node, the
same increment needs to be applied to all of the node's ancestors.
(See libsvn_fs_fs/tree.c(increment_mergeinfo_up_tree).)

In svn_fs_change_node_prop, mergeinfo-count is incremented by 1 if
the has-mergeinfo flag is being changed from false to true, and
decremented by 1 if the flag is being changed from true to false.

svn_fs_copy (and its evil twin svn_fs_revision_link) can change the
mergeinfo-count for the parent of the destination. Note that copies
can actually be replaces! So first find the mergeinfo-count for the
target path (or 0 if it's not a replace), and compare that to the
copied node's mergeinfo-count; if they are different, you have to
increment the target's parent's mergeinfo-count by the difference.

svn_fs_delete should subtract the mergeinfo-count of the deleted node
from its parent's mergeinfo-count.

Finally, there's svn_fs_merge, which implements the
pre-commit-finalization merge-up phase. The bulk of its logic is in
the "merge" function. I added a mergeinfo_increment_out
output-argument, used for its recursive calls. Honestly you should
probably just grep that function for "mergeinfo", but here's a quick
overview:

  * Initialize mergeinfo_increment to 0.

  * In the "entry is the same in target and ancestor, but changed in
    the source" block, there are two possibilities:
     * It was deleted in source:
         mergeinfo_increment -= target_entry.mergeinfo_count
     * It was changed in source:
         mergeinfo_increment += (source_entry.mergeinfo_count
                                 - target_entry.mergeinfo_count)
  * When making the recursive call to 'merge', pass it the address of
    a local 'sub_mergeinfo_increment', and then do
         mergeinfo_increment += sub_mergeinfo_increment

  * For each entry added in source, add its mergeinfo-count to
    mergeinfo_increment

  * At the end of the function, increment the target node's
    mergeinfo-count by mergeinfo_increment (but *don't* recurse
    upwards; the fact that we're already recursing through the
    directory structure will take care of that), and return
    mergeinfo_increment in the provided pointer.

The top-level caller of merge can ignore the mergeinfo_increment
return value (or just pass NULL).

HOW TO QUERY IT
===============

Don't use the libsvn_fs_util implementations of
svn_fs_get_mergeinfo[_for_tree] any more. Write your own that use
this instead. Basically, just model them on
fs_get_mergeinfo[_for_tree] in libsvn_fs_fs/tree.c. In fact, some of
that code could maybe even be extracted out into libsvn_fs_util (but
not using sqlite!) if it is too similar.

Ask me if you need this explained in more detail.

Both svn_fs_get_mergeinfo_for_tree and the include_descendents=TRUE
code for svn_fs_get_mergeinfo use a crawler
"crawl_directory_dag_for_mergeinfo", which crawls a tree, only looking
for nodes with mergeinfo, and calls an action on each found node.

Note that the include_descendents parameter only exists on the
reintegrate branch.

ALSO
====

In places where the mergeinfo count is changed (like
svn_fs_fs__dag_increment_mergeinfo_count), I do some minor consistency
checks, like making sure that mergeinfo counts are never negative, and
that they are never greater than 1 on a file. We'll need to bump the
repository version so that commits from older servers don't mess up
the metadata (and cause these corruption checks to fire). I haven't
implemented this yet, though.

Let's say you took a repository with some scattered svn:mergeinfo
properties but not metadata, and upgraded it to the new version.
Could the consistency checks start failing? Actually, no. The
consistency checks only make sure that the mergeinfo-count numbers are
consistent with each other and with the has-mergeinfo flags. Note how
in fs_change_node_prop the check to see if we're setting svn:mergeinfo
for the first time is based on the former setting of ther
has-mergeinfo, not whether or not it actually used to have an
svn:mergeinfo property. This means that if you bump the FS version to
the (proposed) new filesystem format, no consistency checks can fail,
even if there are already svn:mergeinfo properties. (However, the
queries won't find these properties until they're touched or
something.)

This happens at the FS level, so dump-and-load just magically works.

Pool usage is not so great in the query code (this is mostly marked
with TODO(reint)).

-- 
David Glasser | glasser_at_davidglasser.net | http://www.davidglasser.net/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Dec 15 00:46:39 2007

This is an archived mail posted to the Subversion Dev mailing list.