On Jul 16, 2012, at 12:38 PM, Daniel Shahaf wrote:
> Trent Nelson wrote on Mon, Jul 16, 2012 at 08:58:09 -0700:
>> Somewhat related: is this a FreeBSD box?
>
> Yes, it's eris from http://www.apache.org/dev/machines.
>
>> ports/sysutils/mcelog is useful for getting info on any ECC errors
>> that might have occurred.
>
> Thanks for the pointer. The port description says: "The primary purpose
> is to provide a way to decode MCE output from the FreeBSD kernel into
> something more human-readable" --- how to get the "raw" MCE output? I
> don't see "mce" mentioned in `sysctl -a` or /var/log/messages.
Yeah it's definitely on the cryptic side. I'm dubious as to whether or
not the majority of features mentioned in the man page actually work.
From experience, I simply `pkg_add -r mcelog`'d and then ran `mcelog`
on a FreeBSD box of mine that looked like it had some wonky DIMMs. I
noticed an MCE line in the console log, so I ran `mcelog`, and voilà:
heaps of info about the error (the exact DIMM/slot was handy).
>
>> Is the repo living on ZFS?
>
> FreeBSD 8.2, zpool v15, zfs v4
Oh my, v15! Insert standard recommendation of upgrading to v28 (it
landed in 8-stable around July last year) -- huge improvements in the
latter over v15.
That being said, one of my production boxes (which happens to mirror asf,
incidentally) is still on 8.2-stable w/ v15. Not ideal, but meh, it's
production, and due for an upgrade in a few months.
>
>> Don't suppose you've got a non-standard vfs.zfs.txg.timeout (greater
>> than 5 seconds?) set? That could have exacerbated the situation.
>>
>
> We have the default setting, but the default is greater than 5 seconds:
>
> eris,0:/boot% sysctl -a | grep txg
> vfs.zfs.txg.write_limit_override: 0
> vfs.zfs.txg.synctime: 5
> vfs.zfs.txg.timeout: 30
>
Ah, looks like the timeout defaults to 30 seconds on v15. It's been
changed to 5 seconds on v28.
> Is there somewhere documentation of what the txg timeout _is_ (as
> opposed to what are the effects of changing the knob)? Or is the only
> documentation in the source tree?
It's the maximum number of seconds ZFS is allowed to hold asynchronous
writes in memory before committing the transaction group (txg) to stable
storage. (Does Subversion explicitly request synchronous writes, or do
any sync/fsync dances?)
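For reference, a minimal Python sketch of the kind of "fsync dance" meant
here: write to a temporary file, fsync it, rename it into place, then
fsync the directory. The helper name `durable_write` is made up for
illustration, and this is not a claim about what Subversion actually
does. The directory fsync assumes a POSIX system such as FreeBSD or Linux.

```python
import os
import tempfile


def durable_write(path, data):
    """Hypothetical helper: replace `path` with `data`, asking the OS to
    push everything to stable storage before returning."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on
    # one filesystem and is atomic.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()              # drain Python's userspace buffer
            os.fsync(f.fileno())   # ask the kernel to flush the file data
        os.replace(tmp, path)      # atomically move the new file into place
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    # fsync the directory so the rename itself is durable (POSIX only).
    dfd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

If the writer does something like this, the data should not be left
waiting in RAM for the next txg, which would narrow the window in which
a bit flip could hit it before it reaches disk.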
I thought of this because of the timing that would have had to play out
for the bit flip to be plausible: it would have had to happen after the
txn->rev dance, but before the data was written to storage.
All of ZFS's bad-ass checksumming and data healing measures are useless
if the data becomes corrupted in RAM before it's written to disk; ECC is
the only defense against that. ECC's good at catching single-bit errors;
not so good at catching anything more, so it's plausible for a bit flip
to go completely undetected, even with ECC -- but you should have at
least *one* 'corrected' entry in the mcelog output (or on the console --
don't suppose you log that to a file?).
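To illustrate the single-bit vs. multi-bit point, here is a toy Python
sketch of a SECDED (single-error-correct, double-error-detect) code:
Hamming(7,4) plus one overall parity bit. Real ECC DIMMs use a much wider
code, and all the function names below are made up for illustration, so
treat this as a sketch of the general behaviour and not of any particular
memory controller.

```python
DATA_POSITIONS = (3, 5, 6, 7)    # Hamming positions that carry data bits
PARITY_POSITIONS = (1, 2, 4)     # Hamming positions that carry check bits


def syndrome(bits):
    """XOR of the positions (1..7) whose bit is set; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if bits[pos]:
            s ^= pos
    return s


def encode(nibble):
    """Encode a 4-bit value as 8 bits: index 0 is the overall parity bit,
    indices 1..7 hold a Hamming(7,4) codeword."""
    bits = [0] * 8
    for shift, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (nibble >> shift) & 1
    s = syndrome(bits)
    for pos in PARITY_POSITIONS:
        if s & pos:
            bits[pos] = 1            # cancel that bit of the syndrome
    bits[0] = sum(bits[1:]) % 2      # make the total number of 1s even
    return bits


def decode(bits):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    bits = list(bits)
    s = syndrome(bits)
    odd = sum(bits) % 2
    if s == 0 and not odd:
        status = "ok"
    elif odd:
        # Odd number of flips: assume exactly one and undo it.
        # s == 0 here means the overall parity bit (index 0) was hit.
        bits[s] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two flips, cannot repair.
        status = "uncorrectable"
    nibble = sum(bits[pos] << shift for shift, pos in enumerate(DATA_POSITIONS))
    return nibble, status


def flip(bits, *positions):
    """Return a copy of `bits` with the given positions inverted."""
    out = list(bits)
    for pos in positions:
        out[pos] ^= 1
    return out
```

With `word = encode(0b1011)`, any single flip decodes back to 0b1011 as
'corrected' and a double flip such as `flip(word, 3, 5)` is reported as
'uncorrectable', but the triple flip `flip(word, 1, 2, 3)` comes back as
'corrected' with the wrong nibble. That last case is the one that matters
here: the data is silently wrong, yet a 'corrected' event is still logged.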
Might be worth lowering vfs.zfs.txg.timeout to, say, 5 seconds, at least
until an upgrade to v28 is viable.
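A rough Python sketch for flagging the setting from `sysctl` output like
the lines quoted earlier in this mail. The function names and the
5-second threshold are choices made for this example only, and the parser
assumes nothing beyond the "name: value" layout shown in that output.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines (as printed by sysctl) into a dict of strings."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def txg_timeout_too_high(text, limit=5):
    """True if vfs.zfs.txg.timeout is present and greater than `limit` seconds."""
    timeout = parse_sysctl(text).get("vfs.zfs.txg.timeout")
    return timeout is not None and int(timeout) > limit


# The three lines from the quoted `sysctl -a | grep txg` output.
sample = """\
vfs.zfs.txg.write_limit_override: 0
vfs.zfs.txg.synctime: 5
vfs.zfs.txg.timeout: 30
"""

print(txg_timeout_too_high(sample))  # prints True for the values above
```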
Trent.
Received on 2012-07-16 19:40:38 CEST