Re: svndumpfilter wishlist

From: Marcus Rueckert <darix_at_web.de>
Date: Wed, 9 Jan 2008 08:49:13 +0100

On 2008-01-09 00:49:02 +0100, Ragnar Kjørstad wrote:
> I'm in the process of taking a single svn repository that contains code
> for two separate products and splitting it into two separate repositories.
> Unfortunately, this turns out to be a lot more complicated than it
> should be. Thus, some suggestions for improvements to svndumpfilter that
> would be very helpful. Unfortunately I will probably not find time to
> implement them, but I'm posting them anyway as input for whoever is
> interested.
>
>
> * better pattern matching. (Globbing or regular expressions)
> Just prefixes is not sufficient in many cases. E.g. I wanted to exclude
> the "foo" module but not the "foobar" module. I don't think this is
> possible with the current svndumpfilter (I ended up excluding "foo/" but
> then I end up with an empty "foo" directory). Also I wanted to exclude
> certain modules in all tags/branches. Now I had to preprocess and create
> an explicit list of prefixes. "exclude tags/*/foo" would have been so
> much nicer.
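
A minimal sketch of what such matching could look like, in Python. This is
not how svndumpfilter works today; path_matches is a made-up name. The
pattern is matched one path component at a time, so "*" never crosses a
"/", and a rule covers the matched node and everything below it.

```python
from fnmatch import fnmatchcase


def path_matches(pattern: str, path: str) -> bool:
    """Return True if `path` is the node matched by `pattern` or lies below it.

    Hypothetical helper: each pattern component is a shell-style glob that is
    compared against exactly one path component.
    """
    pat_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(path_parts) < len(pat_parts):
        return False
    # zip() stops at the end of the pattern, which gives prefix semantics.
    return all(fnmatchcase(part, pat) for pat, part in zip(pat_parts, path_parts))
```

With this, "foo" covers "foo" and "foo/src/main.c" but not "foobar", and
"tags/*/foo" covers "tags/1.0/foo/x.c" but not "tags/1.0/foobar".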
>
> * reading rules from file
> The current svndumpfilter expects all exclude rules as command line
> arguments. There are upper limits for command line length, so this does
> not scale well. Support for reading the rules from a file would be nice.
> Of course if it had globbing or regexp support we would have needed far
> fewer rules so it probably wouldn't have been a problem.
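
Reading the rules from a file is only a few lines. A sketch, assuming one
rule per line, with blank lines and "#" comments ignored. That file format
and the name read_rules are my assumptions, nothing svndumpfilter defines.

```python
import io
from typing import Iterable, List


def read_rules(lines: Iterable[str]) -> List[str]:
    """Collect rules from an iterable of lines (for example an open file)."""
    rules = []
    for line in lines:
        line = line.strip()
        # Assumed convention: skip empty lines and "#" comment lines.
        if not line or line.startswith("#"):
            continue
        rules.append(line)
    return rules


# In-memory stand-in for a rules file.
example = io.StringIO("# modules to drop\ntags/1.0/foo\n\ntrunk/foo\n")
rules = read_rules(example)
```

With a real file it would be "with open(name) as f: rules = read_rules(f)".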
>
> * improve performance
> svndumpfilter takes a very long time to match transactions against
> the exclude list. Performance could be improved dramatically by keeping
> the prefixes in an ordered list, a B+ tree, or similar, rather than comparing
> every path to every single prefix. Of course, if it used proper pattern
> matching rather than prefixes this would be less critical.
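
One simple way to avoid comparing every path to every prefix, sketched in
Python: keep the prefixes in a set and look up each ancestor of the path.
I am using a hash set here rather than the ordered list or B+ tree
suggested above. The function names are made up, and unlike a plain string
prefix this matches whole path components only.

```python
from typing import Iterable, Set


def build_prefix_set(prefixes: Iterable[str]) -> Set[str]:
    """Normalize the prefixes once so that lookups are plain set membership."""
    return {p.strip("/") for p in prefixes}


def is_excluded(path: str, prefix_set: Set[str]) -> bool:
    """Check every ancestor of `path` (and the path itself) against the set.

    The cost grows with the depth of the path, not with the number of rules.
    """
    parts = path.strip("/").split("/")
    for depth in range(1, len(parts) + 1):
        if "/".join(parts[:depth]) in prefix_set:
            return True
    return False
```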
>
> * revision numbers in rules.
> It's currently impossible to exclude some commits to a path but not
> others. E.g. I have a tag that was created, deleted and recreated. I
> want to exclude the first directory, but not the second. If I could
> exclude "/tags/foo-XXX@1-500" that would have solved the problem.
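
A sketch of how such a rule could be parsed and applied, in Python. I am
reading the example above as "path@first-last" with an inclusive revision
range. That syntax, and the names Rule, parse_rule and rule_applies, are
assumptions for illustration only.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    path: str
    first_rev: Optional[int]  # None means the rule applies to all revisions
    last_rev: Optional[int]


def parse_rule(text: str) -> Rule:
    """Parse "path", "path@N" or "path@N-M" (assumed syntax)."""
    path, sep, revs = text.rpartition("@")
    if not sep:
        return Rule(text.strip("/"), None, None)
    first, _, last = revs.partition("-")
    return Rule(path.strip("/"), int(first), int(last or first))


def rule_applies(rule: Rule, path: str, revision: int) -> bool:
    """True if `path` is at or below the rule path and the revision is in range."""
    path = path.strip("/")
    if path != rule.path and not path.startswith(rule.path + "/"):
        return False
    if rule.first_rev is None:
        return True
    return rule.first_rev <= revision <= rule.last_rev
```

A path that itself contains "@" would need some escaping, which this
sketch does not handle.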
>
> * handle invalid copies better.
> Currently svndumpfilter fails if the source path of a copy operation is
> excluded but not the destination. Maybe it would be better if the user
> could choose between different ways to handle this problem:
> 1. exclude the copy operation as well (a bit dangerous, but...)
> 2. not exclude the source path anyway
> 3. convert the copy to an add operation. So the file will be included
> correctly, but history will be excluded.
> 4. move the history of the file from the source path to the destination
> of the copy. So, e.g. if a file is copied from "a" to "b" to "c" and
> "b" is excluded, it would be replaced with a copy operation from "a".
> 5. Fail.
> I see that there are some svndumpfilter reimplementations that implement
> option 3, but I haven't seen any that implement option 4.
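
The core of option 4 could be sketched like this in Python: follow the
chain of copies backwards until a source is found that is not excluded.
This is a heavy simplification. It tracks whole paths only and ignores the
copy-from revision that each copy carries in a real dumpfile. The name
resolve_copy_source and the "copies" mapping are hypothetical.

```python
from typing import Callable, Dict, Optional


def resolve_copy_source(
    source: str,
    copies: Dict[str, str],
    excluded: Callable[[str], bool],
) -> Optional[str]:
    """Return the nearest non-excluded origin of `source`.

    `copies` maps a copy destination to the path it was copied from.
    Returns None if the chain ends at an excluded path, in which case the
    caller has to fall back to one of the other options.
    """
    seen = set()
    while excluded(source):
        if source in seen or source not in copies:
            return None
        seen.add(source)
        source = copies[source]
    return source


# "a" was copied to "b", and "b" to "c"; "b" is excluded.
copies = {"b": "a", "c": "b"}
new_source_for_c = resolve_copy_source("b", copies, lambda p: p == "b")
```

Here the copy that created "c" would be rewritten to copy from "a".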
>
> * svndumpfilterlib
> The svndumpfilter tool could be improved in several ways to cover more
> use cases, but even so there are likely a lot of cases that will require
> custom code. Therefore it would be very good if svndumpfilter provided a
> library that could be extended and customized, rather than having to
> start from scratch each time. Primarily the library should have the
> following functionality:
> * dumpfile parsing
> * dumpfile generation
> * filtering
> * invalid copy handling functions
> In terms of a Python API the parsing function should be a generator that
> generates Transaction instances. The dumpfile generator should consume
> an iterator of Transaction instances. Filters and copy handling
> functions should take one Transaction object as input and should return
> an (optionally modified) Transaction object. This would make it really
> easy to extend svndumpfilter with new functionality.
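
To make the shape of that API concrete, here is a rough Python sketch of
the filter side. The parser and the dumpfile writer are left out; a plain
list stands in for the parser's output. Transaction is the name used
above, but its fields, and Node, exclude_prefix and run_filters, are my
own assumptions.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Iterable, Iterator, List


@dataclass
class Node:
    path: str    # assumed to be stored without a leading "/"
    action: str  # for example "add", "change" or "delete"


@dataclass
class Transaction:
    revision: int
    nodes: List[Node] = field(default_factory=list)


# A filter takes one Transaction and returns an (optionally modified) one.
Filter = Callable[[Transaction], Transaction]


def exclude_prefix(prefix: str) -> Filter:
    """Build a filter that drops all nodes at or below `prefix`."""
    prefix = prefix.strip("/")

    def apply(txn: Transaction) -> Transaction:
        kept = [n for n in txn.nodes
                if n.path != prefix and not n.path.startswith(prefix + "/")]
        return replace(txn, nodes=kept)

    return apply


def run_filters(transactions: Iterable[Transaction],
                filters: Iterable[Filter]) -> Iterator[Transaction]:
    """Lazily pass each transaction through every filter, in order."""
    filters = list(filters)
    for txn in transactions:
        for f in filters:
            txn = f(txn)
        yield txn


parsed = [Transaction(1, [Node("foo/a.c", "add"), Node("foobar/b.c", "add")])]
filtered = list(run_filters(parsed, [exclude_prefix("foo")]))
```

Because everything is an iterator, a writer could consume run_filters(...)
directly without holding the whole dump in memory.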

Do you know svndumptool? http://queen.borg.ch/subversion/svndumptool/

darix

-- 
           openSUSE - SUSE Linux is my linux
               openSUSE is good for you
                   www.opensuse.org

Received on 2008-01-09 08:49:30 CET

In reply to: Ragnar Kjørstad: "svndumpfilter wishlist"
