
Re: svndumpfilter - rfc.

From: Philip Martin <philip_at_codematters.co.uk>
Date: 2003-03-27 21:51:31 CET

Alexander Sabourenkov <screwdriver@lxnt.info> writes:

> Having had almost no response to my previous posting, and having
> somewhat cleaned up the code, here is another request for comments.
>
> Patch against r5307 is available at
> http://lxnt.info/sdf.patch

This email was prompted by the recent svndumpfilter emails. Is the above
URL the most recent patch? Personally, I find inline patches the
easiest to review, external URLs the hardest, and attachments
somewhere in between.

> It is a massively mutated svnadmin. Thus, among other temporary
> shortcuts, the --exclude and --include options became exclude and
> include subcommands, taking path-prefixes as the remaining arguments.

The man page (Documentation!) still refers to --exclude and --include.
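
For reference while reviewing, here is a minimal Python sketch of how I read the include/exclude path-prefix behaviour. The function names are hypothetical and not taken from the patch, and matching on whole path components (so that "trunk" does not match "trunkfoo") is my assumption, not something the patch is known to do.

```python
from typing import Iterable


def matches_prefix(path: str, prefixes: Iterable[str]) -> bool:
    """Return True if path equals a prefix or lies underneath one.

    Assumption: prefixes match whole path components only.
    """
    path = path.strip("/")
    for prefix in prefixes:
        prefix = prefix.strip("/")
        if path == prefix or path.startswith(prefix + "/"):
            return True
    return False


def keep_node(path: str, mode: str, prefixes: Iterable[str]) -> bool:
    """Decide whether a node survives filtering.

    mode is either "include" or "exclude"; the two are mutually
    exclusive, which is why a single argument is enough here.
    """
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")
    hit = matches_prefix(path, prefixes)
    return hit if mode == "include" else not hit
```

Because the two modes are mutually exclusive, either a subcommand or a pair of options that reject each other would map onto the same single `mode` value.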

> It does work; at least, it managed to successfully (i.e. results loaded ok,
> checked out ok) filter a ~48Mb dumpfile of a relatively simple repository.
>
> While working on it a bunch of questions came up. They are:
>
> Questions:
>
> Should I turn subcommands back into options (keeping in mind that
> they are mutually exclusive)?
>
> Should the parser recalculate & verify MD5 sums?
> (it now just passes them through intact)

I think it should just pass them through.

> What, if anything, should be done with copyfrom data?
> (it is now just passed through intact)
> This is wrong when a dump contains nodes with such information.
> Nodes and revisions the copyfrom data points to can be filtered out,
> and I suppose svnadmin will refuse to load such a filtered dump.
> However, I'm at a loss as to what to do with them.
> If the pointed-to revisions and nodes haven't been filtered out,
> I can rewrite the revision number to point to the correct revision.
> But if the original revs/nodes have been filtered out, I can only think
> of dropping them.

Do you mean
  a) drop the copyfrom history but retain the node, or
  b) drop the node?

I think b) would make more sense.
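
To make option b) concrete, here is a rough Python sketch. All names are hypothetical and nothing here comes from the patch; it assumes the filter keeps a map from original to output revision numbers and has some predicate saying whether a path survives filtering.

```python
from typing import Callable, Dict, Optional, Tuple


def resolve_copyfrom(copyfrom_path: str,
                     copyfrom_rev: int,
                     is_kept: Callable[[str], bool],
                     rev_map: Dict[int, int]) -> Optional[Tuple[str, int]]:
    """Return the rewritten (path, rev) for a copied node.

    Returns None when the copy source was filtered out, in which case
    the caller drops the whole node (option b), rather than keeping the
    node without its history (option a).
    """
    # Source path was filtered out, or its revision was dropped entirely.
    if not is_kept(copyfrom_path) or copyfrom_rev not in rev_map:
        return None
    # Source survives: point at its revision number in the output stream.
    return copyfrom_path, rev_map[copyfrom_rev]
```

With `rev_map = {1: 1, 3: 2}` and only `trunk` kept, a copy from `trunk/a` at r3 would be rewritten to r2, while a copy from `branches/x` would cause the node to be dropped.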

> What to do with revisions that only contain rev-props? Like rev 0?
> What to do with revisions that contain rev-props, but have all
> nodes filtered out?
>
> Current behaviour is:
>
> Revision is written out in the following cases:
> 1. No --drop-empty-revs has been supplied.
> 2. Revision has nodes remaining after filtering.
> 3. Revision had no nodes before filtering.

That sounds reasonable.
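
As I read them, the three cases combine as below. This is a Python sketch with hypothetical names, not code from the patch.

```python
def should_write_revision(drop_empty_revs: bool,
                          nodes_before: int,
                          nodes_after: int) -> bool:
    """Decide whether a revision record is written to the output dump."""
    # Case 1: --drop-empty-revs not supplied, so every revision is kept.
    if not drop_empty_revs:
        return True
    # Case 2: some nodes survived filtering.
    if nodes_after > 0:
        return True
    # Case 3: the revision was already empty before filtering
    # (rev-props only, like rev 0), so filtering did not empty it.
    return nodes_before == 0
```

So the only revisions dropped are those that had nodes, lost all of them to filtering, and were filtered with --drop-empty-revs in effect.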

> Does retaining the original revision numbers make any sense when some
> revisions get dropped? Or should they be unconditionally renumbered?
>
> Current behavior is to renumber if a revision is skipped.
> AFAIK revnumbers are not taken into account when loading a dump, so this
> seems harmless (and I implemented renumbering before I had a chance to think
> about it :) ). This also has the side effect that implementing a
> shift in rev-numbers in the resulting dumpstream is easy (I am unsure
> whether that makes any sense).

There are arguments for and against renumbering; ideally the user
would be able to choose.
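
A user-selectable choice could look something like this Python sketch. The names `renumber` and `shift` are hypothetical; they are not options in the patch.

```python
from typing import Dict, Iterable


def build_rev_map(kept_revs: Iterable[int],
                  renumber: bool = True,
                  shift: int = 0) -> Dict[int, int]:
    """Map original revision numbers to output revision numbers.

    renumber=True closes the gaps left by dropped revisions, counting
    up from the lowest kept revision; renumber=False keeps the original
    numbers. shift is added to every output number in either mode.
    """
    ordered = sorted(set(kept_revs))
    rev_map: Dict[int, int] = {}
    for offset, old in enumerate(ordered):
        new = ordered[0] + offset if renumber else old
        rev_map[old] = new + shift
    return rev_map
```

With revisions 0, 1, 3 and 4 kept, renumbering maps 3 to 2 and 4 to 3, while `renumber=False` leaves the gap at 2 in place.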

> BTW: currently svnadmin treats a
> text-content-length: 0
> header as an attempt to set fulltext on a node. Should this be so?

No idea.

Do you think this is ready to commit? Do you have an up-to-date
patch? Regression tests would be nice, although not strictly
necessary.

-- 
Philip Martin
Received on Thu Mar 27 21:52:15 2003

This is an archived mail posted to the Subversion Dev mailing list.
