[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Sat, 23 May 2015 11:07:21 +0000

Evgeny Kotkov wrote on Thu, May 21, 2015 at 18:23:22 +0300:
> SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));
> ...would return two different errors depending on the binaries being used,
> assuming that one of the revisions in the [r1:r5] range is corrupted:
> (With 1.8.13) E160004: Checksum mismatch in item at offset 0 of
> length 59 bytes in file path/asf/db/revs/0/2
> (With 1.9.0-rc1) E165011: Repository 'path/asf' failed to verify
> Thoughts?

I think the incumbent code doesn't use SVN_ERR_REPOS_CORRUPT (E165011)

The keep-going implementation simply uses SVN_ERR_REPOS_CORRUPT to wrap
whatever error the FS layer reported. That's wrong, for two reasons:

- The SVN_ERR_REPOS_CORRUPT code should be used to indicate corruption
  of repos-layer on-disk data (such as $REPOS_DIR/db not existing or
  $REPOS_DIR/format being a directory). The code does not look for such
  conditions. In fact, since the repos layer got around to calling into
  the FS layer, the repos layer itself is almost certainly integral.

- The repos layer has no business second-guessing the FS layer's choice
  of error code. For example, a permission error on a rev file could
  happen and does not indicate corruption, or a user might have pressed
  ^C during svn_fs_verify()'s execution. The incumbent code assumes
  that any non-SVN_ERR_CANCELLED code indicates a corruption; that
  assumption is wrong.

So, the keep-going code should stop using SVN_ERR_REPOS_CORRUPT to
indiscriminately wrap svn_fs_verify()'s error code. That should make
the E160004 (SVN_ERR_FS_CORRUPT) error visible, addressing Evgeny's
issue. (I am almost sure that's exactly what Evgeny's patch does.)

Furthermore, I think the "if (found_corruption)" block at the end of
svn_repos_verify_fs3() should also avoid using SVN_ERR_REPOS_CORRUPT,
since that would be the wrong code to use in some circumstances (for
example, if all revision files had permission errors). The code should
either say what it _does_ know for a fact — "N out of M revisions failed
to verify" — or start inspecting the FS errors to determine whether they
indicate corruption or transient errors (which isn't always possible,
since the repos layer does not know the context the error was raised in).

In fact, since the aforementioned two uses of SVN_ERR_REPOS_CORRUPT are
the only two uses of that code, I conclude that I am of the opinion that
that error code should be deleted entirely from the 1.9 codebase. We
can always revive it in 1.10 if needed.


Received on 2015-05-23 13:07:31 CEST

This is an archived mail posted to the Subversion Dev mailing list.