First theory for explaining both kinds of corruption Re: Fwd: [Daniel Shahaf: Long-standing corruption on svn.apache.org]
From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Sun, 2 Oct 2011 22:28:33 +0200
tldr: The first part of this mail lists a few more instances. The
---

I've run 'svn log -ql 7000' for the following fspaths and repositories:

    asf:/hadoop
    asf:/openejb
    asf:/james
    asf:/jackrabbit
    asf:/karaf
    asf:/archiva
    asf:/hbase
    asf:/cxf
    asf:/tomcat
    asf:/incubator
    asf:/subversion
    infra:/websites
    infra:/websites/production/www

and I found two more instances: r931481 in ^/jackrabbit and r1136942 in ^/cxf.

[[[
r891679 | julianfoad | 2009-12-17 12:48:09 +0000 (Thu, 17 Dec 2009)
r891677 | stylesen | 2009-12-17 12:47:52 +0000 (Thu, 17 Dec 2009)
r891672 | stylesen | 2009-12-17 12:30:43 +0000 (Thu, 17 Dec 2009)

r965497 | rhuijben | 2010-07-19 14:26:54 +0000 (Mon, 19 Jul 2010)
r965496 | cmpilato | 2010-07-19 14:26:50 +0000 (Mon, 19 Jul 2010)
r965495 | artagnon | 2010-07-19 14:26:43 +0000 (Mon, 19 Jul 2010)

r931481 | mduerig | 2010-04-07 09:41:50 +0000 (Wed, 07 Apr 2010)
r931480 | jukka | 2010-04-07 09:41:48 +0000 (Wed, 07 Apr 2010)
r931479 | jukka | 2010-04-07 09:36:17 +0000 (Wed, 07 Apr 2010)

r1136942 | dkulp | 2011-06-17 17:11:26 +0000 (Fri, 17 Jun 2011)
r1136941 | dkulp | 2011-06-17 17:11:25 +0000 (Fri, 17 Jun 2011)
r1136938 | dkulp | 2011-06-17 17:01:54 +0000 (Fri, 17 Jun 2011)
]]]

In all cases, eris jumps from the youngest revision of the triplet to the
oldest, while harmonia gives all three revisions of the triplet, during
a 'svn log -ql file://$REPOS_ROOT/$tlp' run.

---

I also found a ridiculously bogus instance in one of the minfo-cnt blowup
revisions from my previous email: the noderev of ^/@r908653 thinks its
predecessor is 0.0.r908626/17893...

---

Analysis: it seems that in all cases, there is a relatively small time gap
between the two youngest revisions in the triplet.

In two cases, namely r891679 and r908653, the corruption of predecessors
is accompanied by corruption of minfo-cnt. However, r1136942 is not
accompanied by a similar corruption, nor is it preceded by a O(100) files
commit.

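For anyone who wants to repeat the comparison, here is a minimal sketch in Python. It assumes the 'svn log -q' output from each host has been saved as text. The function names are made up for illustration, and the eris sample below is reconstructed from the description above (only the youngest and oldest revision of the first triplet), not copied from a real run. It reports each revision that the reference log lists and the other log skips, along with the time gap to the next younger revision.

```python
import re
from datetime import datetime

# Header line of 'svn log -q', e.g.
# r891679 | julianfoad | 2009-12-17 12:48:09 +0000 (Thu, 17 Dec 2009)
LOG_LINE = re.compile(
    r"^r(\d+) \| [^|]+ \| (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})"
)


def parse_log(text):
    """Map revision number -> commit datetime for every header line found."""
    revs = {}
    for line in text.splitlines():
        m = LOG_LINE.match(line.strip())
        if m:
            revs[int(m.group(1))] = datetime.strptime(
                m.group(2), "%Y-%m-%d %H:%M:%S %z"
            )
    return revs


def skipped_with_gap(reference, suspect):
    """List (skipped_rev, next_younger_rev, gap_seconds) tuples.

    A revision counts as skipped when the reference log has it, the suspect
    log does not, and it lies strictly inside the suspect log's range.
    """
    if not suspect:
        return []
    lo, hi = min(suspect), max(suspect)
    ref_sorted = sorted(reference)
    out = []
    for rev in ref_sorted:
        if rev in suspect or not (lo < rev < hi):
            continue
        younger = next((r for r in ref_sorted if r > rev), None)
        if younger is None:
            continue
        gap = (reference[younger] - reference[rev]).total_seconds()
        out.append((rev, younger, gap))
    return out


# Sample data: harmonia shows the whole triplet, eris jumps over the middle.
HARMONIA = """\
r891679 | julianfoad | 2009-12-17 12:48:09 +0000 (Thu, 17 Dec 2009)
r891677 | stylesen | 2009-12-17 12:47:52 +0000 (Thu, 17 Dec 2009)
r891672 | stylesen | 2009-12-17 12:30:43 +0000 (Thu, 17 Dec 2009)
"""
ERIS = """\
r891679 | julianfoad | 2009-12-17 12:48:09 +0000 (Thu, 17 Dec 2009)
r891672 | stylesen | 2009-12-17 12:30:43 +0000 (Thu, 17 Dec 2009)
"""

if __name__ == "__main__":
    for rev, younger, gap in skipped_with_gap(parse_log(HARMONIA), parse_log(ERIS)):
        print(f"r{rev} skipped; r{younger} followed {gap:.0f}s later")
```

For the sample triplet this flags r891677, with r891679 following 17 seconds later, which matches the "small time gap" observation. It only compares log listings; it does not look at the noderevs themselves.
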
Theory: there are two independent bugs: one corrupts the predecessors when
two commits are made in quick succession, and another causes the minfo-cnt
values to become corrupt when a commit quickly follows a O(100)-file commit,
i.e., a commit that would have triggered issue #3506 (also known as
INFRA-2261), a failure to update rep-cache.db.

The first issue doesn't trigger on harmonia because it doesn't have cache
disks in its zpool, and the second doesn't because the svnsync process that
commits to harmonia's repositories takes an out-of-band lock to ensure that
at most one svnsync process runs at any given moment.

Received on 2011-10-02 22:29:27 CEST
This is an archived mail posted to the Subversion Dev mailing list.