
Issue #1911 -- Seeking thinker donations

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: 2005-09-29 17:28:00 CEST

Dev-folk, I took a shot at solving issue #1911 last night. Well,
actually, I took two shots (both use the same solution; the second one
just performs better).

Here's the problem: svndumpfilter optionally lets you drop entirely
any revisions which, after path-based filtering, have no
meaningful changes. So, if you were filtering such that only commits
to /trunk were included, then commits where only branch work was done
could be completely dropped from the stream (avoiding empty
revisions). This generally works well.

But because of Subversion's global revision number thingy, and the
fact that versions of resources are addressable by all revisions in
which those resources were "live" (instead of only at points where
they themselves were actually changed), we can have a scenario where a
copy was made with a source path and destination path that are not
excluded by the filter, but where the copy source revision lies in a
revision that the filter has dropped. It's not hard to construct an
example of this:

   $ svn copy file:///.../trunk file:///.../branches/my-branch
   $ svn copy file:///.../trunk/foo file:///.../trunk/bar

HEAD at the time of that second copy is a revision which an
only-let-trunk-thru filter will drop.

Internally, svndumpfilter uses a hash which maps original revisions to
actual revisions, so that it can fix up copyfrom source revisions to
account for dropped revisions. Unfortunately, the logic that purports
to handle scenarios like the above was looking for sentinels that were
never actually used. The result was dumpfiles which had copyfrom
source revisions of some revision N present as part of the creation of
revision N! Fixing that leaves you with code that doesn't generate
invalid dumpfiles -- instead svndumpfilter errorfully bails out of
these really trivial cases (again, like the example above).
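
To make the failure concrete, here is a minimal sketch of such a
revision map, in Python rather than svndumpfilter's C. The function
name and the list-of-flags input are hypothetical, chosen only for
illustration; this is not the actual svndumpfilter data structure.

```python
def build_revision_map(kept_flags):
    """Map original revision numbers to renumbered output revisions.

    kept_flags[i] is True when original revision i survives the filter.
    Dropped revisions get no entry. (Hypothetical layout, not
    svndumpfilter's own.)
    """
    rev_map = {}
    next_rev = 0
    for orig_rev, kept in enumerate(kept_flags):
        if kept:
            rev_map[orig_rev] = next_rev
            next_rev += 1
    return rev_map


# r0: empty repository, r1: trunk commit,
# r2: branch-only copy (dropped), r3: copy within trunk.
rev_map = build_revision_map([True, True, False, True])
print(rev_map)        # {0: 0, 1: 1, 3: 2}

# The copy in r3 names r2 as its copyfrom revision, and r2 has no
# entry, so a plain lookup has nothing to return.
print(2 in rev_map)   # False
```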

My change to svndumpfilter was to make it say, "If the copyfrom source
revision is one which I've dropped, then use the youngest not-dropped
revision which is older than that dropped revision." (And again, I
implemented this twice: once brute-force, once with some class.)
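
A rough sketch of that fallback rule, again in Python and not the
code from either patch. It assumes a sorted list of the original
revision numbers that were kept, plus a dict from original to
renumbered revisions; both names are hypothetical. The binary search
stands in for the non-brute-force variant.

```python
import bisect


def remap_copyfrom(orig_rev, kept_revs, rev_map):
    """Return the output revision to use for a copyfrom source revision.

    kept_revs: sorted original revision numbers that were not dropped.
    rev_map:   dict mapping each kept original revision to its new number.
    """
    if orig_rev in rev_map:
        # The source revision survived the filter; use its new number.
        return rev_map[orig_rev]
    # Otherwise find the youngest kept revision older than orig_rev.
    idx = bisect.bisect_left(kept_revs, orig_rev)
    if idx == 0:
        raise ValueError("no kept revision older than r%d" % orig_rev)
    return rev_map[kept_revs[idx - 1]]


kept_revs = [0, 1, 3]
rev_map = {0: 0, 1: 1, 3: 2}
print(remap_copyfrom(2, kept_revs, rev_map))  # 1 (falls back to original r1)
print(remap_copyfrom(3, kept_revs, rev_map))  # 2 (r3 was kept, renumbered)
```

Whether the path at the fallback revision is always identical to the
path at the dropped revision is exactly the open question; the sketch
only shows the lookup, not a proof that it is safe.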

Here's where I'm getting hung up, though. I've not fully convinced
myself that this is a safe algorithm to fall back to. And it's at this
point that I'm hoping others on the list can toss the problem around
in their heads long enough to help me figure out if there's a chance
that using this algorithm will cause copyfrom-path@copyfrom-rev to
ever be the wrong thing.

Things to remember:

   * Revisions with path changes that aren't excluded by the filter
     (non-empty revisions) cannot be dropped.
   * Copies whose source paths are excluded by the filter, but whose
     destinations are not, cause errors -- EXCEPT in the case where it
     is a file copy with modifications (because we have the fulltext
     in the stream; we just treat it in the output stream as a


C. Michael Pilato <cmpilato@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
Received on Thu Sep 29 18:50:54 2005

This is an archived mail posted to the Subversion Dev mailing list.