RE: Almost repetitive repository corruption

From: Igor Varfolomeev <i3v_at_mail.ru>
Date: Tue, 31 Dec 2013 06:07:12 +0400

> -----Original Message-----
> From: Bert Huijben [mailto:bert_at_qqmail.nl]
> Sent: 30 December, 2013 02:58
> To: 'Igor Varfolomeev'; users_at_subversion.apache.org
> Subject: RE: Almost repetitive repository corruption
>
>
> > -----Original Message-----
> > From: Igor Varfolomeev [mailto:i3v_at_mail.ru]
> > Sent: zondag 29 december 2013 23:00
> > To: users_at_subversion.apache.org
> > Subject: Almost repetitive repository corruption
> >
> > Hi all,
> >
> > I’ve just run into a weird bug which damaged my svn repository. I
> > still don’t understand what exactly was wrong, so, I don’t know how to
> > describe it in a clear and simple manner, sorry… I’ll just try to
> > describe all the symptoms I’ve experienced. I’ll use real file names,
> > since I wasn’t able to reproduce this bug on synthetic test repository.
> >
> > *SETUP*
> > Most simple single-user, single-PC setup. Local repository.
> > First svn version: “Subversion command-line client, version 1.8.5.”.
> > Windows 7 x64
> > Antivirus: Kaspersky Endpoint Security 10
> >
> > *THE STORY*
> > The story began when I ran into some sort of error message while
> > trying to commit r3349.
> > After a bit of struggling, I realized that my repository had been
> > broken by the previous commit (r3348). The nasty thing is that the
> > previous commit finished without any error message.
> >
> > *SYMPTOMS*
> > **svnadmin verify**
> > Output ends like this:
> > <….>
> > * Verified revision 3346.
> > * Verified revision 3347.
> > svnadmin: E160004:
> > Corrupt node-revision '4d-610.2-2392.r3348/35659066'
> > svnadmin: E160004: Found malformed header '' in revision file
> >
> > **svn checkout**
> > When I try to checkout a new working copy, I receive similar
> > message:
> > <…>
> > W:\testCO\Binar\Matlab\deploy
> > W:\testCO\Binar\Matlab\deploy\x64
> > W:\testCO\Binar\Matlab\deploy\x64\Binar_x64.prj
> > W:\testCO\Binar\Matlab\deploy\x64\Binar_x64
> > W:\testCO\Binar\Matlab\deploy\x64\Binar_x64\distrib
> > Corrupt node-revision '4d-610.2-2392.r3348/35659066'
> > Found malformed header '' in revision file
> >
> > **svn Repository Browser**
> > When I navigate to
> > file:///V:/R_Matlab/Binar/trunk/Binar/Matlab/deploy/x64/Binar_x64
> > in the TortoiseSVN repository browser, I see the same error message:
> >
> > Corrupt node-revision '4d-610.2-2392.r3348/35659066'
> > Found malformed header '' in revision file
> >
> > Here’s a screenshot: http://sdrv.ms/1fJVuwa
> >
> > *ZEROS IN DATA FILE*
> > Luckily, I have a full backup (r3337). I’ve manually repeated all my
> > commits up to r3347 and verified that the repository is OK in this state.
> >
> > Next, I’ve tried to reproduce the bug:
> >
> > 1. Firstly (“try1”), I’ve repeated same Matlab commit script
> > (Matlab simply calls svn, just like from cmd). And… «success»
> > - same bug again!
> >
> > 2. Secondly (“try3”), I’ve managed to reproduce the bug using
> > only windows cmd commands.
> >
> > 3. Thirdly (“try4” and “try5(0)”), I wrote a bat-script to
> > reproduce the same actions.
> >
> > I’ve compared
> > R_Matlab\db\revs\3\3348
> > file for different “tries”: (initial bug is designated as “try0”) and
> > discovered a single interesting thing:
> > each “3348” file has a long sequence of zero-bytes:
> >
> > • try0: 0x2201B0A to 0x2201FFF
> >
> > • try1: 0x2201000 to 0x2201FFF
> >     o try0_vs_try1_p1: http://sdrv.ms/Ju7nev
> >     o try0_vs_try1_p2: http://sdrv.ms/Ju7tmu
> >     o try0_vs_try1_p3: http://sdrv.ms/Ju7AOI
> >
> > • try3: 0x2201B11 to 0x2201FFF
> >     o try0_vs_try3_p1: http://sdrv.ms/Ju7G9g
> >     o try0_vs_try3_p2: http://sdrv.ms/Ju7HKd
> >
> > • try4: 0x2201000 to 0x2201FFF
> >     o try0_vs_try4_p1: http://sdrv.ms/Ju7OFE
> >     o try0_vs_try4_p2: http://sdrv.ms/Ju86MJ
> >     o try0_vs_try4_p3: http://sdrv.ms/Ju89ID
> >
> > • try5(0): 0x2201000 to 0x2201FFF (just like try4).
> >     o try0_vs_try5(0)_p1: http://sdrv.ms/1daKwjG
> >     o try0_vs_try5(0)_p2: http://sdrv.ms/1daKxUx
> >     o try0_vs_try5(0)_p3: http://sdrv.ms/Ju8iM5
> >
> >
> > Moreover, try4 and try5 differ in only one place: two zero bytes
> > starting at 0x21F9FFE (in “try5(0)”):
> > http://sdrv.ms/19jmBdm
> >
> > *BUG DISAPPEARED*
> > That’s all I have: 5 broken repositories. After that, the bug DISAPPEARED,
> > just like a UFO :) . I’ve launched the SAME script with the SAME
> > input data 10 more times (“try5(1)”, “try5(2)”…) and nothing: svn
> > correctly commits r3348, and the resulting repository is valid:
>
> Hi,
>
> Did you make sure you restored db\rep-cache.db in every step? (This
> may make more of a difference than you expected.)
>
> The fact that you copy a single file two times in one commit makes me suspect
> that this is relevant information.
>
> Are all the drives in your test scenario local hard disks, or are some network
> drives involved?
>
> Bert
==============================================================
[Igor Varfolomeev]

> Are all the drives in your test scenario local hard disks, or are some network
> drives involved?

Only local HDDs.

> Did you make sure you restored db\rep-cache.db in every step? (This
> may make more of a difference than you expected.)

*COPY PROCESS*
Hm... I simply copied all files from the source (r3347) repository to a new folder
to create an "experimental repository" (independently for each "try").
The source repository was not accessed in any way during the copy.
For "try1" through "try4" I did it manually; for "try5(0)" an
"xcopy" command was built into the bat-script:

http://sdrv.ms/JqqVRL

and xcopy finished as it should:

http://sdrv.ms/JqrlYf

*SVNADMIN VERIFY*
Also, for "try4" and "try5(0)", before the real job, the "experimental repositories"
were verified with "svnadmin verify" (to be sure they were copied OK):

http://sdrv.ms/Jqrpav

If "rep-cache.db" were damaged, would "svnadmin verify" detect it?
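
For reference, the pre-check described above could be scripted roughly like
this. It is only a minimal sketch, assuming "svnadmin" is on the PATH; the
helper names (verify_command, verify_repository) and the injectable "runner"
parameter are made up for illustration and are not part of my bat-script.
It reports only whether "svnadmin verify" exited cleanly, so it says nothing
about whether rep-cache.db is covered by that check.

```python
import subprocess


def verify_command(repo_path):
    """Build the argument list for `svnadmin verify` on one repository."""
    return ["svnadmin", "verify", str(repo_path)]


def verify_repository(repo_path, runner=subprocess.run):
    """Return True if `svnadmin verify` exits with status 0.

    `runner` is a hypothetical hook (defaults to subprocess.run) so the
    function can be exercised without a real repository.
    """
    result = runner(verify_command(repo_path), capture_output=True, text=True)
    return result.returncode == 0
```

Calling verify_repository() on each freshly copied "experimental repository"
before the commit step would mirror what was done for "try4" and "try5(0)".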

*REP-CACHE.DB DIFF*
I've just compared "db\rep-cache.db" for "try5(0)" and "try5(1)"
(i.e. last broken vs first valid) and they are identical...
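
A byte-for-byte comparison like the one above can be sketched as follows.
This is only an illustration, not the tool I actually used; the function
name first_difference and the chunk size are assumptions. It returns the
offset of the first differing byte, or None when the two files are identical.

```python
def first_difference(path_a, path_b, chunk_size=1 << 16):
    """Return the offset of the first differing byte, or None if identical.

    If one file is a strict prefix of the other, the returned offset is
    the length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact byte inside the mismatching chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No mismatch in the overlap: one file is simply shorter.
                return offset + min(len(a), len(b))
            if not a:  # both files ended at the same point
                return None
            offset += len(a)
```

The same helper could be pointed at two copies of a revision file to find
where they first diverge.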

*COPY LOG DIFF*

The only interesting thing I noticed when comparing the logs for
"try5(0)" http://sdrv.ms/JqrlYf and
"try5(1)" http://sdrv.ms/1dlXwTv

is that in the first case "db\rev-prop-atomics.mutex" was also copied.
http://sdrv.ms/1dlWV4o

However, there's no such file in either the source or the target dir now...
It's temporary, isn't it?

PS
Still, all files mentioned above are here: http://sdrv.ms/1jMN250

Best regards,
Varfolomeev Igor
Received on 2013-12-31 03:08:16 CET

This is an archived mail posted to the Subversion Users mailing list.