[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Possible incompatibility of svn_repos_verify_fs2() in 1.9.0-rc1

From: Branko Čibej <brane_at_wandisco.com>
Date: Wed, 03 Jun 2015 16:19:48 +0200

On 03.06.2015 15:31, Ivan Zhakov wrote:
> On 3 June 2015 at 15:33, Branko Čibej <brane_at_wandisco.com> wrote:
>> On 21.05.2015 17:23, Evgeny Kotkov wrote:
>>> Subversion 1.9.0-rc1 introduced a new svnadmin verify --keep-going mode [1].
>>> In order to achieve this, we added a svn_repos_verify_fs3() API function and
>>> deprecated its predecessor, svn_repos_verify_fs2(), that now calls the newer
>>> function with keep_going argument set to FALSE.
>>>
>>> However, svn_repos_verify_fs2() behaves differently in 1.9.0-rc1 and 1.8.13
>>> when it comes to error reporting. As an example, the following code ...
>>>
>>> SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool));
>>>
>>> ...would return two different errors depending on the binaries being used,
>>> assuming that one of the revisions in the [r1:r5] range is corrupted:
>>>
>>> (With 1.8.13) E160004: Checksum mismatch in item at offset 0 of
>>> length 59 bytes in file path/asf/db/revs/0/2
>>>
>>> (With 1.9.0-rc1) E165011: Repository 'path/asf' failed to verify
>>>
>>> Please note that the second error is generic, and that the actual information
>>> about the error is lost. Existing API users of svn_repos_verify_fs2() are
>>> going to lose an important bit of information about the root cause of the
>>> corruption unless they update the code. Furthermore, even if an API caller
>>> subscribes to notifications with a notify_func / notify_baton pair, it would
>>> still be necessary to update the code to handle svn_repos_notify_failure that
>>> did not exist before 1.9.
>>>
>>> I did not find any discussion on the matter or the corresponding entry in
>>> /notes/api-errata [2] that would describe this behavior change, so I am
>>> inclined to think that this is accidental and, probably, undesirable.
>>>
>>> There is an option of restoring the 1.8 behavior when svn_repos_verify_fs3()
>>> is called with keep_going set to FALSE. We could immediately yield the error
>>> in this case, and only use the notifications / generic error when keep_going
>>> is set to TRUE. Doing so would change the output of svnadmin verify without
>>> --keep-going, because now it wouldn't include the generic error message in
>>> the end. I attached a proof-of-concept patch, that doesn't change the test
>>> expectations and the documentation, i.e., it only includes the core change,
>>> mostly as an illustration to these points.
>>>
>>> Thoughts?
>>>
>>> [1] https://svn.apache.org/r1492651
>>> [2] https://svn.apache.org/repos/asf/subversion/trunk/notes/api-errata/1.9
>> I completed your patch and committed the fix in r1683311. Please review!
>>
> [returning discussion back to this thread]
>
> On 3 June 2015 at 16:26, Branko Čibej <brane_at_wandisco.com> wrote:
>> On 03.06.2015 14:57, Ivan Zhakov wrote:
>>> On 3 June 2015 at 15:35, Branko Čibej <brane_at_wandisco.com> wrote:
>>>> On 03.06.2015 14:24, Ivan Zhakov wrote:
>>>>> On 3 June 2015 at 14:37, Branko Čibej <brane_at_wandisco.com> wrote:
>>>>>> On 02.06.2015 20:05, Branko Čibej wrote:
>>>>>>> On 02.06.2015 12:45, Daniel Shahaf wrote:
>>>>>>>> Ben Reser wrote on Sun, May 31, 2015 at 14:28:39 -0700:
>>>>>>>>> The 1.9.0-rc2 release artifacts are now available for testing/signing.
>>>>>>>>> Please get the tarballs from
>>>>>>>>> https://dist.apache.org/repos/dist/dev/subversion
>>>>>>>>> and add your signatures there.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>> Note that Evgeny reported a regression in svn_repos_verify_fs2() in
>>>>>>>> <http://svn.haxx.se/dev/archive-2015-05/0141.shtml>. No objections to
>>>>>>>> moving forward with rc2, but as that issue is a regression, we'll need
>>>>>>>> an rc3 that fixes it.
>>>>>>> Yes ... and the patch has been reviewed but not committed. I believe it
>>>>>>> only needs a couple tweaks (fixing an "if" condition and removing the
>>>>>>> unused error code).
>>>>>> I have a more complete fix based on Evgeny's patch, running tests now.
>>>>>> It turns out that we still need a new error code for the summary
>>>>>> results, but with a different meaning and therefore different name.
>>>>>> Renaming it had a positive side effect as it turned out that we were
>>>>>> emitting the SVN_ERR_REPOS_CORRUPTED error from FSFS and FSX instead of
>>>>>> the correct SVN_ERR_FS_CORRUPT.
>>>>>>
>>>>> Another option would be to require notify_func for
>>>>> svn_repos_verify_fs3() and always report errors through notify_func.
>>>>> This would make error reporting consistent whether keep_going TRUE or
>>>>> FALSE. For svn_repos_verify_fs2() we could create compat notify_func
>>>>> handler that converts notifications to errors.
>>>> See r1683311; I believe error reporting is consistent now.
>>>> Adding complex logic to the deprecated function isn't such a good idea, IMO.
>>> Maybe, but it's better than adding complex logic to current
>>> (not-deprecated) function. IMO it's not svn_repos_verify_fs3()
>>> responsibility to generate verification summary -- it should be done
>>> at the UI level. I.e. in svnadmin or other application using
>>> svn_repos_verify_fs3() API.
>> But that would imply either returning a bunch of extra info from
>> svn_repos_verify_fs3, or counting notifications in the callers (which
>> means making the callers depend on implementation details of the API).
>>
> I don't see problem provide wrapped notify func in
> svn_repos_verify_fs2() in call to svn_repos_verify_fs3(). Also we
> don't need to count notifications since svn_repos_verify_fs2() doesn't
> support keep_going flag.
>
>> Note that the summary will only be generated if an error occurred during
>> verification in keep-going mode, when we have to return some kind of
>> error anyway; we may as well put as much info as we have available into
>> the error message.
> That's what I call inconsistent API: the returned errors are different
> whether keep_going TRUE or FALSE.

Of course the errors are different, the mode of operation is different,
too. Keep-going mode drops underlying errors on purpose (after
notifying); it's been this way from the beginning, it makes perfect
sense and I didn't change that. What Evgeny and I fixed was the bug that
the keep_going=FALSE mode also dropped the underlying errors, which is
both wrong and inconsistent with 1.8.x. (And that keep_going=TRUE mode
ignored cancellations.)

> I'm also not sure that we have to
> return error if we already reported it via notify_func in
> svn_repos_verify_fs3().

Out notification mechanism cannot replace API consistency. When an API
call fails, it must return an svn_error_t; surely you're not proposing
that we make an exception for svn_repos_verify_fs3?

Since we do need an error, we have four choices:

  * Combine all the verification errors into one huge stack; this has
    two problems: one is the obvious unbounded memory usage, the other
    is that such a stack is both unmanageable and duplicates the
    notifications.
  * Return the last or first encountered error; whilst this would work,
    it's not very informative and possibly confusing because it singles
    out one problem out of possibly hundreds.
  * Return a generic "something went wrong" error.
  * Return a generic error but provide a summary of the problems in the
    error message.

IMO, only the last two options make any kind of sense; and of these, the
latter is better because it gives the user some extra info essentially
for free.

We could, as you propose, add the summarizing into 'svnadmin verify';
but that would create more code churn for what is essentially a bug fix.

-- Brane
Received on 2015-06-03 16:20:53 CEST

This is an archived mail posted to the Subversion Dev mailing list.