Hi all,
I’ve just run into a weird bug that damaged my svn repository. I still
don’t understand what exactly went wrong, so I don’t know how to
describe it in a clear and simple manner, sorry… I’ll just try to describe
all the symptoms I’ve experienced. I’ll use real file names, since I
wasn’t able to reproduce this bug on a synthetic test repository.
*SETUP*
The simplest possible setup: single user, single PC, local repository.
First svn version: “Subversion command-line client, version 1.8.5.”.
Windows 7 x64
Antivirus: Kaspersky Endpoint Security 10
*THE STORY*
The story began when I ran into an error message while
trying to commit r3349.
After a bit of struggling, I realized that my repository had been broken
by the previous commit (r3348). The nasty thing is that the previous commit
finished without any error message.
*SYMPTOMS*
**svnadmin verify**
Output ends like this:
<….>
* Verified revision 3346.
* Verified revision 3347.
svnadmin: E160004: Corrupt node-revision '4d-610.2-2392.r3348/35659066'
svnadmin: E160004: Found malformed header '' in revision file
**svn checkout**
When I try to check out a new working copy, I receive a similar
message:
<…>
W:\testCO\Binar\Matlab\deploy
W:\testCO\Binar\Matlab\deploy\x64
W:\testCO\Binar\Matlab\deploy\x64\Binar_x64.prj
W:\testCO\Binar\Matlab\deploy\x64\Binar_x64
W:\testCO\Binar\Matlab\deploy\x64\Binar_x64\distrib
Corrupt node-revision '4d-610.2-2392.r3348/35659066'
Found malformed header '' in revision file
**svn Repository Browser**
When I navigate to
file:///V:/R_Matlab/Binar/trunk/Binar/Matlab/deploy/x64/Binar_x64
in the TortoiseSVN repository browser, I see the same error message:
Corrupt node-revision '4d-610.2-2392.r3348/35659066'
Found malformed header '' in revision file
Here’s a screenshot: http://sdrv.ms/1fJVuwa
*ZEROS IN DATA FILE*
Luckily, I have a full backup (r3337). I’ve manually repeated all my
commits up to r3347 and verified that the repository is OK in this state.
Next, I tried to reproduce the bug:
1. First (“try1”), I repeated the same Matlab commit script
(Matlab simply calls svn, just like from cmd). And… «success»
- the same bug again!
2. Second (“try3”), I managed to reproduce the bug using
only Windows cmd commands.
3. Third (“try4” and “try5(0)”), I wrote a bat script to
reproduce the same actions.
I’ve compared the
R_Matlab\db\revs\3\3348
file across the different “tries” (the initial bug is designated “try0”) and
discovered a single interesting thing:
each “3348” file has a long sequence of zero bytes:
• try0: 0x2201B0A to 0x2201FFF
• try1: 0x2201000 to 0x2201FFF
o try0_vs_try1_p1: http://sdrv.ms/Ju7nev
o try0_vs_try1_p2: http://sdrv.ms/Ju7tmu
o try0_vs_try1_p3: http://sdrv.ms/Ju7AOI
• try3: 0x2201B11 to 0x2201FFF
o try0_vs_try3_p1: http://sdrv.ms/Ju7G9g
o try0_vs_try3_p2: http://sdrv.ms/Ju7HKd
• try4: 0x2201000 to 0x2201FFF
o try0_vs_try4_p1: http://sdrv.ms/Ju7OFE
o try0_vs_try4_p2: http://sdrv.ms/Ju86MJ
o try0_vs_try4_p3: http://sdrv.ms/Ju89ID
• try5(0): 0x2201000 to 0x2201FFF (just like try4).
o try0_vs_try5(0)_p1: http://sdrv.ms/1daKwjG
o try0_vs_try5(0)_p2: http://sdrv.ms/1daKxUx
o try0_vs_try5(0)_p3: http://sdrv.ms/Ju8iM5
Moreover, try4 and try5 differ in only one place, two zero
bytes starting at 0x21F9FFE (in the case of “try5(0)”):
http://sdrv.ms/19jmBdm
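For anyone who wants to look for the same pattern in their own revision files, here is a minimal Python sketch. It is not one of the scripts used above. The names zero_runs, describe and scan_file, the 16-byte minimum run length, and the 4096-byte block size are all assumptions chosen for illustration. It lists long runs of zero bytes and reports whether each run starts or ends on a 4096-byte boundary.

```python
import re
from typing import List, Tuple

BLOCK = 4096  # assumed block size, used only for the alignment check


def zero_runs(data: bytes, min_len: int = 16) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) offsets of zero-byte runs of at least min_len bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - 1) for m in pattern.finditer(data)]


def describe(first: int, last: int) -> Tuple[int, bool, bool]:
    """Return (length, starts_on_block_boundary, ends_on_block_boundary) for a run."""
    length = last - first + 1
    return length, first % BLOCK == 0, (last + 1) % BLOCK == 0


def scan_file(path: str) -> None:
    """Print every long zero-byte run found in the file at path (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()  # reads the whole file into memory
    for first, last in zero_runs(data):
        length, start_ok, end_ok = describe(first, last)
        print(f"0x{first:X} to 0x{last:X}: {length} bytes, "
              f"start aligned={start_ok}, end aligned={end_ok}")
```

Applied to the offsets listed above, describe(0x2201000, 0x2201FFF) gives 4096 bytes with both ends on a 4096-byte boundary, while describe(0x2201B0A, 0x2201FFF) gives 1270 bytes with only the end on a boundary. This is plain arithmetic on the reported offsets, not a diagnosis.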
*BUG DISAPPEARED*
That’s all I have: 5 broken repositories. After that, the bug DISAPPEARED.
Just like a UFO :) . I launched the SAME script with the SAME
input data 10 more times (“try5(1)”, “try5(2)”…) and nothing happened: svn
correctly commits r3348, and the resulting repository is valid:
• svnadmin verify is OK
• I’m able to see the contents of
“R_Matlab/Binar/trunk/Binar/Matlab/deploy/x64/Binar_x64”
in the TortoiseSVN repository browser
• svn checkout is OK.
When I compare “revs\3348” for “try4” vs “try5(1)”, the ONLY
difference is the long sequence of zero bytes mentioned before:
• try4_vs_try5(1)_p1: http://sdrv.ms/1edmEdV
• try4_vs_try5(1)_p2: http://sdrv.ms/Ju8YkC
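The comparisons above were done in a visual diff tool (see the screenshots). A rough equivalent can be sketched in Python as follows. The function name diff_ranges is an assumption for illustration, and the loop is a simple byte-by-byte scan that will be slow on files of this size.

```python
from typing import List, Tuple


def diff_ranges(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) offset ranges where a and b differ.

    If the lengths differ, the extra tail of the longer input is
    reported as one final range.
    """
    ranges = []
    start = None
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i  # a differing run begins here
        elif start is not None:
            ranges.append((start, i - 1))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, n - 1))
    if len(a) != len(b):
        ranges.append((n, max(len(a), len(b)) - 1))
    return ranges
```

To compare two copies of a revision file, read both in binary mode (open(path, "rb").read()) and pass the two byte strings to diff_ranges. A single range covering the zero-filled region would match what is described above.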
*REPRODUCTION SCRIPT*
The bat script that resulted in the error is quite straightforward. It simply
copies several files. It might not be a good idea to copy a modified file
without committing it first, but it still should not result in an error… The
bat file (used in try4) is here: http://sdrv.ms/19ld4FN
Another thing to mention is that the total size of the files in the r3348 commit
is about 250 MB.
To my shame, my repository is both large (~30 GB) and contains
confidential data, so I’m unable to share it :( .
All files mentioned above are in this folder: http://sdrv.ms/1jMN250
*LOOKING FOR SIMILAR CASES*
Mainly, I’ve just googled “svn: Corrupt node-revision”. It looks like
this error message is quite common, but no one has tried to understand
its source. There is, though, a “what was that?” question
in [1] (see links below).
Moreover, it looks like no one has experienced “repetitive” behavior…
In some cases, the issue was resolved by restoring revision files from
backup [1], or by using svnadmin dump/load [3,4]. In one report [2],
julian.foad <at> wandisco.com used John Szakmeister's
'fsfsverify.py' to analyze the corruption, though it looks like the
corruption type in his case was quite different. In one post [4], VinnyJames
said: “we've seen this happen during heavy load”.
1. http://www.wandisco.com/svnforum/threads/38519-Commit-errors-Revision-files-corrupted
2. http://thread.gmane.org/gmane.comp.version-control.subversion.devel/123110
3. http://stackoverflow.com/questions/5543285/how-do-i-fix-a-repository-with-one-broken-revision
4. http://dev-notes-to-self.blogspot.com/2009/01/fixing-corrupt-subversion-repository.html?showComment=1280529811361#c6899551059356251422
*QUESTIONS*
So….
1. What was that? Any ideas? Might it happen again?
2. Is there any other interesting diagnostic info I can get from these
repositories?
3. Should I also re-post this to the Subversion mailing list? Or does it,
most probably, depend on TortoiseSVN somehow?
Say, due to some caching?
*PS*
I’ve already posted the text above on the TortoiseSVN mailing list:
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3070808
and received a suggestion to re-post it here:
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3070843
*PPS*
I’m not subscribed and would appreciate being explicitly Cc:ed in any responses.
Received on 2013-12-29 23:01:16 CET