
svndumpfilter - rfc.

From: Alexander Sabourenkov <screwdriver_at_lxnt.info>
Date: 2003-03-13 11:22:55 CET

Hello.

Having had almost no response to my previous posting, and having
somewhat cleaned up the code, I am sending another request for comments.

A patch against r5307 is available at
http://lxnt.info/sdf.patch

It is a massively mutated svnadmin. Thus, among other temporary
shortcuts, the --exclude and --include options have become exclude and
include subcommands, which take path prefixes as the remaining arguments.
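
A minimal sketch of how such path-prefix filtering could work. It is written
in Python for brevity and is not taken from the patch; the function names and
the choice to match on whole path components are assumptions.

```python
def matches_prefix(path, prefixes):
    """Return True if path equals one of the prefixes or lies beneath it."""
    path = path.strip("/")
    for prefix in prefixes:
        prefix = prefix.strip("/")
        # Assumption: match on component boundaries, so the prefix
        # "trunk" does not match the path "trunkish/foo".
        if prefix == "" or path == prefix or path.startswith(prefix + "/"):
            return True
    return False


def keep_node(path, prefixes, mode):
    """Decide whether a node survives filtering.

    mode is "include" or "exclude", mirroring the two mutually
    exclusive subcommands.
    """
    hit = matches_prefix(path, prefixes)
    if mode == "include":
        return hit
    if mode == "exclude":
        return not hit
    raise ValueError("mode must be 'include' or 'exclude'")
```

With prefixes ["trunk"], the include mode keeps trunk/foo.c and drops
branches/b1/foo.c; the exclude mode does the opposite.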

It does work: at least it managed to successfully filter a ~48 MB dumpfile
of a relatively simple repository (i.e. the result loaded OK and checked out OK).

While I was working on it, a bunch of questions came up:

Should I turn subcommands back into options (keeping in mind that
they are mutually exclusive)?

Should the parser recalculate and verify MD5 sums?
(It currently just passes them through intact.)
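
For illustration, verification would amount to something like the following
sketch (Python, not code from the patch; the function name is hypothetical).
It recomputes the MD5 digest of a node's text and compares it with the
hex string carried in the dump.

```python
import hashlib


def verify_text_md5(text, expected_hex):
    """Return True if the MD5 digest of text matches the recorded checksum.

    text is the node's full text as bytes; expected_hex is the
    hexadecimal checksum string read from the dumpfile.
    """
    return hashlib.md5(text).hexdigest() == expected_hex.strip().lower()
```

The cost is one extra pass over every text body, which is the trade-off
against simply passing the sums through.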

What should be done, if anything, with copyfrom data?
(It is currently just passed through intact.)
This is wrong when a dump contains nodes with such information:
the nodes and revisions that the copyfrom data points to can be filtered out,
and I suppose svnadmin will refuse to load such a filtered dump.
However, I'm at a loss as to what to do with them.
If the pointed-to revisions and nodes haven't been filtered out,
I can rewrite the revision number to point to the correct revision.
But when the pointed-to revs/nodes have been filtered out, I can only think of dropping them.
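
A sketch of the two cases described above, in Python and under stated
assumptions: rev_map is a hypothetical dict from original to renumbered
revision numbers (containing only revisions that were written out), and
path_is_kept is a hypothetical predicate telling whether a path survived
filtering. Neither name comes from the patch.

```python
def rewrite_copyfrom(copyfrom_rev, copyfrom_path, rev_map, path_is_kept):
    """Rewrite a node's copyfrom reference after filtering.

    Returns (new_rev, copyfrom_path) when the copy source survived,
    or None when the source revision or path was filtered out, in
    which case the caller must decide what to do (e.g. drop the node).
    """
    new_rev = rev_map.get(copyfrom_rev)
    if new_rev is None or not path_is_kept(copyfrom_path):
        return None
    return new_rev, copyfrom_path
```

What the caller should do on None is exactly the open question; this only
separates the case that can be fixed up from the one that cannot.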

What should be done with revisions that contain only rev-props, like rev 0?
And with revisions that contain rev-props but have had all their
nodes filtered out?

Current behaviour is:

A revision is written out if any of the following holds:
  1. --drop-empty-revs has not been supplied.
  2. The revision has nodes remaining after filtering.
  3. The revision had no nodes before filtering.
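
The three cases can be restated as a small predicate. This is a Python
sketch of the rule as listed, not code from the patch; the function and
parameter names are made up for illustration.

```python
def should_write_revision(drop_empty_revs, nodes_before, nodes_after):
    """Return True if a revision is written to the filtered dump.

    drop_empty_revs: whether --drop-empty-revs was supplied.
    nodes_before:    number of nodes in the revision before filtering.
    nodes_after:     number of nodes remaining after filtering.
    """
    if not drop_empty_revs:   # case 1: nothing is ever dropped
        return True
    if nodes_after > 0:       # case 2: something survived filtering
        return True
    return nodes_before == 0  # case 3: revision was empty to begin with
```

So the only revisions that disappear are those that had nodes, lost all of
them to filtering, and were processed with --drop-empty-revs.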

Does retaining the original revision numbers make any sense when some
revisions get dropped? Or should they be unconditionally renumbered?

Current behavior is to renumber if a revision is skipped.
AFAIK revision numbers are not taken into account when loading a dump, so this
seems harmless (and I implemented renumbering before I had a chance to think
about it :) ). As a side effect, it also makes it easy to implement a
shift of rev-numbers in the resulting dumpstream (though I am unsure whether
that makes any sense).
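
A sketch of such renumbering in Python, not taken from the patch. The
offset parameter stands in for the possible shift of rev-numbers; both it
and the assumption that output numbering starts at offset are illustrative.

```python
def renumber(revisions, offset=0):
    """Map original revision numbers to consecutive output numbers.

    revisions is an iterable of (old_rev, written) pairs in dump order,
    where written says whether the revision survives filtering.
    Skipped revisions get no entry in the returned dict.
    """
    rev_map = {}
    next_rev = offset
    for old_rev, written in revisions:
        if written:
            rev_map[old_rev] = next_rev
            next_rev += 1
    return rev_map
```

For example, if revision 1 of revisions 0..2 is dropped, the result maps
0 to 0 and 2 to 1; with offset=10 it maps 0 to 10 and 2 to 11.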

BTW: currently svnadmin treats a
text-content-length: 0
header as an attempt to set fulltext on a node. Should this be so?

PS:
References

initial posting
http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgId=234357

previous rfc
http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgId=235763

-- 
./lxnt
Received on Thu Mar 13 11:23:53 2003

This is an archived mail posted to the Subversion Dev mailing list.
