The reintegrate branch changes how mergeinfo queries are done, by
adding a couple of optional fields to the noderev structure and using
them instead of the sqlite database for queries. I implemented this
for FSFS; similar changes need to be made for BDB. Some of these
changes were made on the sqlite-mergeinfo-without-mergeinfo branch;
others were made directly on the reintegrate branch (which branched
from sqlite-mergeinfo-without-mergeinfo).
Most of the "write" functionality was implemented in r28134 and
r28141.
Note that a side effect of these changes is that the implementations
of the FS APIs svn_fs_get_mergeinfo and svn_fs_get_mergeinfo_for_tree
now raise errors if their arguments are nonexistent paths; this causes
some tests in our suite to fail. This will need to be fixed in the
higher-level code.
WHAT THE DATA IS
================
There are two new fields on noderevs: a boolean "has-mergeinfo" and a
number (apr_uint64_t) "mergeinfo-count". "has-mergeinfo" means that
the node itself has the svn:mergeinfo property set on it (possibly to
the empty string). "mergeinfo-count" is the number of nodes,
including the node itself, in the subtree rooted at the given node
which has the "has-mergeinfo" flag set. Thus, we can efficiently find
nodes with mergeinfo set by doing a tree crawl, skipping any subtree
where "mergeinfo-count" is not positive.
HOW TO MAINTAIN IT
==================
In general, when a node-revision is being copied (whether internally
to just clone the datastructure, or to make an FS-level copy) the two
fields should be copied as well.
"has-mergeinfo" is set by just one FS API: svn_fs_change_node_prop.
When that API is called on "svn:mergeinfo", the flag is set to
(new_value != NULL).
"mergeinfo-count" is set by several APIs:
* svn_fs_change_node_prop
* svn_fs_copy
* svn_fs_revision_link
* svn_fs_delete
* svn_fs_merge
In general, whenever mergeinfo-count is incremented on a node, the
same increment needs to be applied to all of the node's ancestors.
(See libsvn_fs_fs/tree.c(increment_mergeinfo_up_tree).)
In svn_fs_change_node_prop, mergeinfo-count is incremented by 1 if
the has-mergeinfo flag is being changed from false to true, and
decremented by 1 if the flag is being changed from true to false.
svn_fs_copy (and its evil twin svn_fs_revision_link) can change the
mergeinfo-count for the parent of the destination. Note that copies
can actually be replaces! So first find the mergeinfo-count for the
target path (or 0 if it's not a replace), and compare that to the
copied node's mergeinfo-count; if they are different, you have to
increment the target's parent's mergeinfo-count by the difference.
svn_fs_delete should subtract the mergeinfo-count of the deleted node
from its parent's mergeinfo-count.
Finally, there's svn_fs_merge, which implements the
pre-commit-finalization merge-up phase. The bulk of its logic is in
the "merge" function. I added a mergeinfo_increment_out
output-argument, used for its recursive calls. Honestly you should
probably just grep that function for "mergeinfo", but here's a quick
overview:
* Initialize mergeinfo_increment to 0.
* In the "entry is the same in target and ancestor, but changed in
the source" block, there are two possibilities:
* It was deleted in source:
mergeinfo_increment -= target_entry.mergeinfo_count
* It was changed in source:
mergeinfo_increment += (source_entry.mergeinfo_count
- target_entry.mergeinfo_count)
* When making the recursive call to 'merge', pass it the address of
a local 'sub_mergeinfo_increment', and then do
mergeinfo_increment += sub_mergeinfo_increment
* For each entry added in source, add its mergeinfo-count to
mergeinfo_increment
* At the end of the function, increment the target node's
mergeinfo-count by mergeinfo_increment (but *don't* recurse
upwards; the fact that we're already recursing through the
directory structure will take care of that), and return
mergeinfo_increment in the provided pointer.
The top-level caller of merge can ignore the mergeinfo_increment
return value (or just pass NULL).
HOW TO QUERY IT
===============
Don't use the libsvn_fs_util implementations of
svn_fs_get_mergeinfo[_for_tree] any more. Write your own that use
this instead. Basically, just model them on
fs_get_mergeinfo[_for_tree] in libsvn_fs_fs/tree.c. In fact, some of
that code could maybe even be extracted out into libsvn_fs_util (but
not using sqlite!) if it is too similar.
Ask me if you need this explained in more detail.
Both svn_fs_get_mergeinfo_for_tree and the include_descendents=TRUE
code for svn_fs_get_mergeinfo use a crawler
"crawl_directory_dag_for_mergeinfo", which crawls a tree, only looking
for nodes with mergeinfo, and calls an action on each found node.
Note that the include_descendents parameter only exists on the
reintegrate branch.
ALSO
====
In places where the mergeinfo count is changed (like
svn_fs_fs__dag_increment_mergeinfo_count), I do some minor consistency
checks, like making sure that mergeinfo counts are never negative, and
that they are never greater than 1 on a file. We'll need to bump the
repository version so that commits from older servers don't mess up
the metadata (and cause these corruption checks to fire). I haven't
implemented this yet, though.
Let's say you took a repository with some scattered svn:mergeinfo
properties but not metadata, and upgraded it to the new version.
Could the consistency checks start failing? Actually, no. The
consistency checks only make sure that the mergeinfo-count numbers are
consistent with each other and with the has-mergeinfo flags. Note how
in fs_change_node_prop the check to see if we're setting svn:mergeinfo
for the first time is based on the former setting of ther
has-mergeinfo, not whether or not it actually used to have an
svn:mergeinfo property. This means that if you bump the FS version to
the (proposed) new filesystem format, no consistency checks can fail,
even if there are already svn:mergeinfo properties. (However, the
queries won't find these properties until they're touched or
something.)
This happens at the FS level, so dump-and-load just magically works.
Pool usage is not so great in the query code (this is mostly marked
with TODO(reint)).
--
David Glasser | glasser_at_davidglasser.net | http://www.davidglasser.net/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Dec 15 00:46:39 2007