
Re: svndumpfilter and svnsync?

From: Chris <devnullaccount_at_yahoo.se>
Date: Wed, 10 Oct 2018 09:18:52 +0000 (UTC)

Big thanks for the help, it is greatly appreciated!
Some comments and further questions inline below.

>>
>> On Oct 10, 2018, at 02:04, Chris wrote:
>>
>>> I've trawled through bad commits of data files in our repo and added
> such paths to a filter file that I'm using for svndumpfilter to get a
> reasonably-looking dump. In most cases, the files in question existed in
> a single path (branch) and were no problem. But in some cases, the same
> files had been copied to a 2nd branch and then svndumpfilter gave me
> errors about missing source paths, so I added the same path on the 2nd
> branch to the filter expressions and tried again. After a few iterations
> of this process, I have a dump that should do what I want.
>>> So I start "svnadmin load" and based on initial progress, that might
> take a couple of days to complete so I leave it overnight. I get back
> today and the load has crashed with a missing path. The error was:
>>>
>>> svnadmin: E160013: File not found: transaction '16289-ckh', path
>>> 'branches/second/dir/datafile'
>>>
>>> And looking up the history for that file, I see that "datafile" was
> added on branch "first" but the path "branches/first/dir" is already in
> my filter list. So why didn't svndumpfilter throw me an error on this
> like it did for a lot of other cases?
>>> Since the load process is so much slower, the turnaround time for
> each error in that step is beyond painful, so anything that I can do
> to ensure that this gets caught by the filter would make my life
> a lot easier.
>>
>> I don't know the answer to that, but:
>
> Hm, not really a clear answer here either. I don't know why
> svndumpfilter did not detect these.
>
> However, you might also give 'svnadmin dump --exclude' a try, if you can
> use version 1.10 of svnadmin.
> http://subversion.apache.org/docs/release-notes/1.10.html#dump-include-exclude
>
> This feature works similarly to 'svnsync with an authz file that
> denies the excluded files'. That means that, when the source of a copy
> is excluded, the copy is transformed into an add (so if you want to
> completely eliminate a bad file and all its copies, it might be more
> difficult to get hold of these copies ... you won't get any warnings
> or errors, I think -- not sure if it emits a notification for such a
> copy-to-add conversion). OTOH, 'svnadmin dump --exclude' supports
> wildcards if you
> add the --pattern option, so it might be easier to filter out all
> appearances of a specific filename, as in 'svnadmin dump --pattern
> --exclude /*/datafile'.

I'll try that. Will be a monster of a commandline since dump+exclude
doesn't have the "--targets <file>" option from svndumpfilter and I have 150-ish
exclude-statements, but should be doable.
Not sure how much I can use patterns based on how the bad commits looked,
but should compress the commandline somewhat.
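
One way to avoid typing 150 options by hand might be to generate them from the existing svndumpfilter targets file. The following is only a sketch: the function name dump_with_excludes and the DRY_RUN switch are made up for illustration, and prepending a leading slash to each path is an assumption based on the '/*/datafile' example quoted above, so check it against 'svnadmin help dump' for your version.

```shell
#!/bin/sh
# Sketch: turn an svndumpfilter-style targets file into repeated --exclude
# options for 'svnadmin dump' (needs Subversion 1.10 or later).
#
# Usage: dump_with_excludes REPO FILTER_FILE
# With DRY_RUN=1 the command is printed, one argument per line, instead of run.
dump_with_excludes() {
    repo=$1
    filter_file=$2
    # Build the command in the positional parameters so that paths
    # containing spaces survive intact.
    set -- svnadmin dump -q "$repo"
    while IFS= read -r entry || [ -n "$entry" ]; do
        [ -z "$entry" ] && continue        # skip blank lines
        case $entry in
            /*) ;;                         # already has a leading slash
            *)  entry=/$entry ;;           # assumption: --exclude expects one
        esac
        set -- "$@" --exclude "$entry"
    done < "$filter_file"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$@"
    else
        "$@"
    fi
}
```

Running it with DRY_RUN=1 first shows the full argument list; without it, something like 'dump_with_excludes MYREPO filterfile > filterdump' would write the dump. The sketch does not add --pattern, so wildcard entries would need that option added by hand.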

>
>
>>
>>> The syntax I used:
>>> svnadmin dump -q MYREPO | svndumpfilter exclude --targets filterfile > filterdump
>>> svnadmin load -q --no-flush-to-disk --force-uuid -M 2048 --bypass-prop-validation ./NEWREPO < filterdump
>>>
>>> (I had to use the bypass-prop-validation due to some newline issues
> in old log messages, similar to this one
> https://groups.google.com/forum/#!topic/subversion_users/P3ohZ-hKhCA,
> don't know why they have wrong newlines, but the repo works as it is
> now...)
>>
>> Instead of ignoring wrong newlines, you could fix them using
>> svndumptool (using its eolfix-revprop command), originally at:
>>
>> http://svn.borg.ch/svndumptool/
>>
>> Newer fork at:
>>
>> https://github.com/jwiegley/svndumptool
>
> Also, as of version 1.10, svnadmin finally has an option to normalize
> these on-the-fly during 'load':
> http://subversion.apache.org/docs/release-notes/1.10.html#normalize-props
>
> It's a lot better to normalize these (either with the
> --normalize-props option for 'svnadmin load' or by using svndumptool)
> than to "bypass" them. Otherwise you'll run into this again later (if
> you would dump+load again sometime in the future).

I tried --normalize-props and I still got the same error which is why I
switched over to bypass. Maybe I've run into some bug with --normalize-props.
Unfortunately, I don't think I'll be able to create a script for reproducing
the error since it happens far into a monster dump load.
So I'll stick with the bypass for now or try the tool that Ryan suggested.
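
Since errors only show up far into a very long load, a cheap scan of the filtered dump file before loading might catch some of them earlier. The sketch below is a heuristic, not a validator. It relies on the "Node-copyfrom-path:" header of the dump file format and reports copy sources that lie inside an excluded path or that are a parent of one. The function name find_suspect_copies is made up, and whether a parent-directory copy is what actually caused the missing 'branches/second/dir/datafile' is a guess.

```shell
#!/bin/sh
# Sketch: list copy sources in a dump file that overlap with excluded paths.
#
# Usage: find_suspect_copies DUMPFILE FILTER_FILE
# FILTER_FILE holds one excluded path per line (leading slash optional).
# Output: one tab-separated line per hit: REASON, COPY SOURCE, EXCLUDED PATH.
find_suspect_copies() {
    dumpfile=$1
    filter_file=$2
    # LC_ALL=C because dump files also contain binary file contents.
    # Heuristic: a file whose content happens to contain a line starting
    # with this header text would be reported as well.
    LC_ALL=C sed -n 's/^Node-copyfrom-path: //p' "$dumpfile" | LC_ALL=C sort -u |
    while IFS= read -r src; do
        src=${src#/}
        while IFS= read -r excluded || [ -n "$excluded" ]; do
            excluded=${excluded#/}
            [ -z "$excluded" ] && continue
            case $src in
                "$excluded"|"$excluded"/*)
                    printf 'inside-excluded\t%s\t%s\n' "$src" "$excluded" ;;
            esac
            case $excluded in
                "$src"/*)
                    printf 'parent-of-excluded\t%s\t%s\n' "$src" "$excluded" ;;
            esac
        done < "$filter_file"
    done
}
```

A "parent-of-excluded" hit means a whole directory was copied whose source tree had an excluded child, so later changes to that child under the copy may refer to a node that no longer exists after filtering.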

>
> And another tip: put the repo-to-be-loaded-into (NEWREPO) on as fast a
> storage system as possible (SSD, ramdisk if feasible, ...). If you're
> satisfied with the result, run 'svnadmin pack' on that fast storage,
> and only then copy it over to the final location. Depending on the
> final storage that technique might save you a lot of time (especially
> if you have to redo it a couple of times).

True, I should have thought of that myself.
I'll see what I can do here. Corporate IT policies put some constraints
on me, but it's definitely worth a shot. I just need to manage to
install svn 1.10 on the only machine I have root on, which is a too-old
Ubuntu where I can't find any pre-built packages, and overcome the
too-small disk on that machine as well.
But those are my own problems that I need to find ways around:)
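
The load-on-fast-storage idea could be scripted roughly as follows. This is a sketch under assumptions: the function names and the DRY_RUN switch are made up, the repository on the fast storage is assumed to exist already (created with 'svnadmin create'), and only -q is passed to 'svnadmin load', so the other options from earlier in the thread would have to be added as needed.

```shell
#!/bin/sh
# Sketch: load into a repository on fast storage, pack it there, and only
# then copy the result to its final location.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Usage: load_on_fast_storage DUMPFILE FAST_REPO FINAL_REPO
# FAST_REPO must already exist; FINAL_REPO must not exist yet.
load_on_fast_storage() {
    dumpfile=$1
    fast_repo=$2
    final_repo=$3
    run svnadmin load -q "$fast_repo" < "$dumpfile" || return 1
    run svnadmin pack "$fast_repo" || return 1
    # Copy recursively and keep timestamps and permissions.
    run cp -Rp "$fast_repo" "$final_repo"
}
```

Each step stops the sequence on failure, so a broken load is never packed or copied.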

>
>>> An additional question about what Johan wrote below:
>>>> - You can perfectly well use a 1.10 version of svnadmin or svnsync
>>>> (or svnrdump, to create a dumpfile from a remote server) to interact
>>>> with a 1.8 server /
> repository.
>>>
>>> Can I even do this with "svnadmin load"; I thought that would use an
> FSFS version 8 while 1.8 should have 6? I got that impression from my
> "research", but I'm probably off base.
>>
>> If you use a newer version of svnadmin (than the one that will be used
> to serve the repo) to create the new repo and load the dump file, then
> make sure you pass the right --compatible-version argument to svnadmin
> create.
>
> Indeed. It's at 'svnadmin create' time that the FSFS version is
> decided. 'svnadmin load' will just "commit" new revisions in the
> repository that you first created, and it will follow / respect the
> FSFS format that's already set. So it's perfectly doable to create and
> load a NEWREPO with 1.10 svnadmin, which you intend to be served by a
> 1.8 svn server (as long as you use the --compatible-version argument
> at create time).

Good, I will use the compatible-version argument, must have missed that one.
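
For my own notes, a minimal sketch of how that could look, assuming the server stays on 1.8. The function names are made up. Reading the first line of db/format to see the FSFS format number is a shortcut that depends on the on-disk layout, so treat it as a sanity check only.

```shell
#!/bin/sh
# Sketch: create a repository that an older server can read, then confirm
# which FSFS format it ended up with.

# Print the FSFS format number of an existing repository
# (the first line of REPO/db/format).
fsfs_format() {
    head -n 1 "$1/db/format"
}

# Usage: create_compatible_repo REPO [SERVER_VERSION]
# SERVER_VERSION defaults to 1.8, the server version mentioned in this thread.
create_compatible_repo() {
    repo=$1
    server_version=${2:-1.8}
    svnadmin create --compatible-version "$server_version" "$repo" || return 1
    echo "FSFS format: $(fsfs_format "$repo")"
}
```

If my reading is right, a repository created this way for 1.8 should report format 6 instead of 8, and the subsequent 'svnadmin load' keeps whatever format was set at create time.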

> (Small note though: 1.8 is no longer supported, so if
> you can, plan to do an upgrade to 1.9 or preferably 1.10 soon).

Yes, I've tried to get the server upgraded since 1.10 came out, but no
luck so far.

Again, huge thanks for the help!

BR,
  Chris
Received on 2018-10-10 11:19:12 CEST

This is an archived mail posted to the Subversion Users mailing list.
