[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: FSFS rep-cache validation

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Thu, 23 Jan 2014 09:58:28 +0000

Julian Foad <julianfoad_at_btopenworld.com> writes:

> Ugh -- the doc string says if REJECT_DUP is TRUE, "return an error if
> there is an existing match for REP->CHECKSUM" but the implementation
> does something else: return an error if an existing match points to a
> different rep (that is, a different rev/index/size), but return
> successfully when an existing match points to the same rep.
>
> The caller says:
>
>       /* FALSE because we don't care if another parallel commit happened to
>        * collide with us.  (Non-parallel collisions will not be detected.) */
>       SVN_ERR(svn_fs_fs__set_rep_reference(fs, rep, FALSE, scratch_pool));
>
> I'm not sure that comment makes sense either -- maybe I am just not
> reading it the right way.

I'm not really sure what that code is supposed to do, but since it is
not used I'm inclined to delete it.

>> Perhaps commit should always check for the highest revision in the
>> rep-cache and fail if it is greater than HEAD?  This might be considered
>> overkill to help people who have introduced corruption by editing the
>> repository.  However when such an erroneous row is present the failure
>> mode is nasty: a commit succeeds but is corrupt and content is lost.
>> The difficult bit here is that we do not have an SQLite index on
>> rep_cache.revision.
>
> A minimal way to make the rep cache robust against roll-back might be
> to store its own "head" revision in it (in a separate table I
> suppose), and check this against the repository head revision before
> looking up any reference in it and before adding reps for a new
> revision. I don't know if that would be more or less efficient than
> using an index on rep_cache.revision.

I don't think that works. The problem is an erroneous row that refers
to a revision beyond HEAD. We already have code that stops such a row
being used for commits, but not using the row is not enough. The reason
that recover removes rows beyond HEAD is that it is not safe to make any
commits when such rows are present. Allowing commits eventually makes
the row refer to revisons that are present and then using the rows leads
to data loss.

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*
Received on 2014-01-23 10:59:06 CET

This is an archived mail posted to the Subversion Dev mailing list.