
Re: upgrade_tests.py 29 spurious failure while testing 1.7.17

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Tue, 13 May 2014 09:43:43 +0200

On Mon, May 12, 2014 at 9:51 PM, Ben Reser <ben_at_reser.org> wrote:
> On 5/5/14, 4:24 PM, Johan Corveleyn wrote:
>> As always, I tested with Windows XP (it's end of life, I know ...
>> whatever) on a ramdisk, non-parallel.
>>
>> This time I took a copy of the repository and working copy before
>> rerunning the test :-). See attachment. Can anyone shed some light on
>> this?
>>
>> I experimented a bit further with a copy of the repository and working
>> copy of this failed test:
>> - svnadmin verify says everything is ok.
>> - a new svn checkout over svn:// works fine.
>> - executing the failing "svn up" command (the last command of the
>> failure output) on that particular working copy, talking with that
>> particular repository over svn:// ... no problem.
>>
>> So I'm at a loss here. I don't see any corruption, yet the test reported it.
>>
>> Perhaps some kind of cache corruption is a possibility? A theory would
>> be nice ... anything really.
>
> I strongly suspect there is something wrong with your machine (memory going
> bad?). The repository is nothing more than a dump/load from the greek tree.
> After a dump/load the repository has the UUID set. No other modifications
> happen to the repo and the only access to the repository via the server is the
> update command that's failing. That rules out a problem with caching because
> the cache should be entirely cold for this repository when the update command runs.
>
> The error you're getting is:
> svn: E160004: Corrupt node-revision '0.0.r1/4198'
> svn: E160004: Missing id field in node-rev
>
> The closest id in the repository is this: 0.0.r1/4206
>
> The number after the slash is the offset which is stored in the private portion
> of the svn_fs_id_t. The offset is stored as an apr_off_t (i.e. not a string but
> an integer).
>
> Looking at the offsets in binary yields (leading zeros omitted):
>
> 4206 = 1 0000 0110 1110
> 4198 = 1 0000 0110 0110
>
> Note that they are off by exactly one bit.
>
> A memory issue would probably be very hard to reproduce. So this seems to fit
> with the issues you've been having. Combine that with the fact that you've
> been having unreproducible test failures in other places with this setup. I
> have to conclude you have issues with your memory. I'd suggest running
> memtest86 on the machine.
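
The one-bit difference between the two offsets quoted above can be checked
with a quick sketch. This is only an illustration of the arithmetic, not
anything taken from the Subversion code; the function and variable names
(differing_bits, expected_offset, reported_offset) are made up for this
example.

```python
def differing_bits(a, b):
    """Return the zero-based positions of the bits in which a and b differ."""
    diff = a ^ b  # XOR leaves a 1 only where the two values disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# Hypothetical names; the values are the two offsets from the quoted message.
expected_offset = 4206  # closest node-rev offset actually found in the repository
reported_offset = 4198  # offset named in the E160004 error

positions = differing_bits(expected_offset, reported_offset)

print(format(expected_offset, "013b"))  # 13-bit binary form of 4206
print(format(reported_offset, "013b"))  # 13-bit binary form of 4198
print(positions)  # [3]: only the bit worth 8 differs
```

A single differing position is what a flipped bit would look like, but it does
not by itself say where the flip happened (RAM, disk, or somewhere in software).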

First, thanks a lot for taking a look and giving a plausible
explanation. It's a possibility, but I'm not fully convinced yet :-).

Pro:
- It fits theoretically (the one bit off etc).
- It's the only explanation so far. And IIUC cache corruption is ruled out.
- The machine is getting old (almost 8 years now -- I think the memory
is 5 or 6 years old). Its operating system (WinXP) is EOL.

Con:
- I've had zero stability issues with my machine so far. No crashes,
no bluescreens. Not one, as far as I can remember.
- I've been testing / signing svn releases for a couple of years. No
problem, until the last two release cycles or so.
- Ran memtest86 (version 4.0.0 that I still had on some boot CD) last
night. It ran for 8 hours. No errors.

So either my machine really has a memory problem, or it's a unique
machine that can (rarely) reproduce a bug in Subversion. I'm still not
sure. If it's the latter it would be a waste to throw it in the trash
:-). OTOH, if it's such a rare issue that nobody else is seeing this,
maybe it's not worth further precious time (of me and you and others)
...

I'll continue pounding it a bit more, but I'll probably give up at
some point (not determined yet).

-- 
Johan
Received on 2014-05-13 09:51:11 CEST

This is an archived mail posted to the Subversion Dev mailing list.
