[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: FSFS rep-cache validation

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Thu, 23 Jan 2014 09:30:35 +0000 (GMT)

Philip Martin wrote:

> On IRC today we were discussing the validation code in
> svn_fs_fs__set_rep_reference.  When inserting a row into the rep-cache
> the insert can fail because the row already exists.  The usual way this
> would happen is that two parallel commits add content with the
> same checksum at different locations, e.g.:
>
> svn import some_file file://`pwd`/repo/f1
> svn import some_file file://`pwd`/repo/f2
>
> That's harmless and nothing goes wrong, one of the commits populates the
> rep-cache.
>
> Another problem is that the only caller of svn_fs_fs__set_rep_reference
> always passes reject_dup=FALSE so there is code that is never
> exercised.  Perhaps we should remove it?

Ugh -- the doc string says if REJECT_DUP is TRUE, "return an error if there is an existing match for REP->CHECKSUM" but the implementation does something else: return an error if an existing match points to a different rep (that is, a different rev/index/size), but return successfully when an existing match points to the same rep.

The caller says:

      /* FALSE because we don't care if another parallel commit happened to
       * collide with us.  (Non-parallel collisions will not be detected.) */
      SVN_ERR(svn_fs_fs__set_rep_reference(fs, rep, FALSE, scratch_pool));

I'm not sure that comment makes sense either -- maybe I am just not reading it the right way.

> A more serious problem is if somebody edits a repository and ends up
> with a rep-cache that has references to revisions that do not exist.
> This could happen if somebody removes revision files and edits 'current'
> but doesn't run 'recover'.  Or if somebody uses an old backup with a
> newer rep-cache.  We have attempted to catch this by having
> svn_fs_fs__get_rep_reference return an error if the rep-cache refers to
> future revisions. However the code isn't working:
>
> $ svnadmin create repo --compatible-version 1.8
> $ svnmucc -mm -U file://`pwd`/repo put repo/format f put repo/README.txt g
> $ chmod +w -R repo
> $ rm -f repo/db/revs/0/1* repo/db/revprops/0/1*
> $ echo 0 > repo/db/current
> $ svnmucc -mm -U file://`pwd`/repo put repo/README.txt g
> $ svnadmin verify repo
> * Verifying metadata at revision 0 ...
> * Verifying repository metadata ...
> svnadmin: E165011: No representation found at offset 276 for item 4 in revision
> 1
> svnadmin: E165011: Repository 'repo' failed to verify
>
> Despite the verify error the new r1 is correct represented, it doesn't
> use the erroneous row from the rep-cache, and "svn cat" will work.
> However the commit also means that the erroneous row no longer refers to
> a revision in the future and so a subsequent commit may use the row and
> be corrupt.
>
> The problem is that svn_fs_fs__get_rep_reference calls
> svn_fs_fs__ensure_revision_exists which returns the error
> SVN_ERR_FS_NO_SUCH_REVISION while the caller,
> transaction.c:get_shared_rep, expects the error SVN_ERR_FS_CORRUPT.
>
> We could change the error returned by svn_fs_fs__get_rep_reference or
> the error expected by get_shared_rep.  Either of those would cause the
> second commit above to fail.  However that doesn't really solve the
> problem.  When the erroneous row is present we really need to stop any
> commit that would make the erroneous row refer to a revision even if
> that commit itself doesn't use the row:
>
> $ svnadmin create repo --compatible-version 1.8
> $ svnmucc -mm -U file://`pwd`/repo put repo/format f put repo/README.txt g
> $ chmod +w -R repo
> $ rm -f repo/db/revs/0/1* repo/db/revprops/0/1*
> $ echo 0 > repo/db/current
> $ svn -mm mkdir file://`pwd`/repo/X
> $ svnmucc -mm -U file://`pwd`/repo put repo/README.txt g
>
> Perhaps commit should always check for the highest revision in the
> rep-cache and fail if it is greater than HEAD?  This might be considered
> overkill to help people who have introduced corruption by editing the
> repository.  However when such an erroneous row is present the failure
> mode is nasty: a commit succeeds but is corrupt and content is lost.
> The difficult bit here is that we do not have an SQLite index on
> rep_cache.revision.

My take on this would be...

In an ideal world, we might like to say that the entire repository is a black box and should only be touched by the correct tools, and then it works as intended.

In the real world, we don't provide sufficient tools for all the situations that occur and people do touch the insides. Therefore we should try to design the insides as nicely behaved components -- separable, editable, robust.

(We don't even provide a "roll back to revision X" tool. You hint above that editing 'current' and then running 'svnadmin recover' achieves this. Perhaps we should add an explicit command.)

A minimal way to make the rep cache robust against roll-back might be to store its own "head" revision in it (in a separate table I suppose), and check this against the repository head revision before looking up any reference in it and before adding reps for a new revision. I don't know if that would be more or less efficient than using an index on rep_cache.revision.

- Julian
Received on 2014-01-23 10:34:13 CET

This is an archived mail posted to the Subversion Dev mailing list.