[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Implement svnadmin verify --keep-going

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Fri, 4 Jan 2013 15:03:11 +0000 (GMT)

Thanks for this version, Prabhu.  It looks much better.  Still a few more points...

Prabhu Gnana Sundar <prabhugs_at_collab.net> wrote:
> On 12/20/2012 11:25 PM, Julian Foad wrote:
>> The output for a failed revision depends on whether --keep-going was
>> passed.  With --keep-going you print a "* Error verifying revision 2."
>> line; without it you don't.
>>
>> Consistency of the output is important, but even more important is
>> simplicity of the code.  I would expect the code to look something like
>> (pseudo-code):
>>
>>     for R in range:
>>       err = verify_rev(R)
>>       if (err)
>>         show the error and a 'Failed' notification
>>         had_error = TRUE
>>         if not keep_going:
>>           break
>>
>>     if had_error:
>>       return a 'Failed to verify the repo' error
>>
>> but you seem to have several different code paths. Why do you have two
>> different places that generate the "Repository '%s' failed to verify"
>> message; can you do it in just one place?
>
> Yeah Julian, now the code generates the "Repository '%s' failed to
> verify" message only in one place, thanks for the idea.

I can still see this error message being returned in two different places in your patch.  (Maybe there were three before; I haven't checked.)  But it's much better now because they are both in the same function and  at the same level, and that's fine.  I suggest you leave it like that.

In the first of these places:

>    /* Verify global/auxiliary data and backend-specific data first. */
> -  SVN_ERR(svn_fs_verify(svn_fs_path(fs, pool), cancel_func, cancel_baton,
> -                        start_rev, end_rev, pool));
> +  if (err && !keep_going)
> +    {
> +      rev = SVN_INVALID_REVNUM;
> +      found_corruption = TRUE;

(Redundant assignment as you're about to return -- remove it.)

> +      notify_verification_error(rev, err, notify_func, notify_baton,
> +                                  iterpool);
> +      svn_error_clear(err);
> +      return svn_error_createf(SVN_ERR_REPOS_CORRUPTED, NULL,
> +                               _("Repository '%s' failed to verify"),
> +                               svn_dirent_local_style(..., pool));
> +    }
> +  else
> +    {
> +      if (err)
> +        found_corruption = TRUE;
> +      svn_error_clear(err);
> +    }

Why only notify the error if we're stopping but not if we're keeping going?  That seems wrong.

Further on, in the main loop:

> +      if (err && keep_going)
> +        {
> +          found_corruption = TRUE;
> +          notify_verification_error(rev, err, notify_func, notify_baton,
> +                                    iterpool);
> +          svn_error_clear(err);
> +          continue;
> +        }
> +      else
> +        if (err)
> +          {
> +            found_corruption = TRUE;
> +            notify_verification_error(rev, err, notify_func, notify_baton,
> +                                      iterpool);
> +            svn_error_clear(err);
> +            break;
> +          }

Is there any difference between these the "if" branch and the "else" branch apart from the last line?  Simplify to:

   if (err)
     ...
     if (keep_going)
       continue;
     else
       break;

>> Another justification for always printing the "* Error verifying
>> revision 2." line is that with the old output:
>>
>> [[[
>> $ svnadmin verify repo
>> ...
>> * Verified revision 15921.
>> * Verified revision 15922.
>> * Verified revision 15923.
>> svnadmin: E160004: Invalid change kind in rev file
>> ]]]
>>
>> it can be a little confusing at first glance to realize that the error
>> relates to a revision number that was not mentioned -- revision 15924 in
>> this example.  I have often in the past wished that we would print a
>> notification line like "* Error verifying revision 15924." here.
>>
>> Also, in email [1] I suggested that the final summary error should be
>> printed even when keep-going is false.  Did you consider that suggestion?
>> What do you think?
>
> Yeah I have taken that into consideration and implemented it as well.

Great -- thanks.

>> A more serious problem: it doesn't always report the error at the end.
>>
>> [[[
>> $ bin/svnadmin verify --keep-going repo2
>> * Verified revision 0.
>> * Verified revision 1.
>> * Error verifying revision 2.
>> [...]
>> svnadmin: E160004: Invalid change kind in rev file
>> * Verified revision 3.
>> ]]]
>>
>> The exit code is 0 here.  Clearly a bug.  The repository I used for this
>> test is attached as 'jaf-corrupt-repo-2.tgz'.
>
> Ahh, I missed out a code when moving my code from main.c to svnadmin.c (sorry
> about that). This works fine now.

Good.

>> I am now sure that you need more tests.  You added a test, but can you tell
>> us what it tests in functional terms?  It looks like it tests verifying a
>> repo where there is an error in r6, and the youngest rev is r6.  That doesn't
>> cover the main purpose of the 'keep going' feature, which is to keep
>> going, does it?
>>
>> Can you think of other things that should be tested?  Please write a short
>> list of tests that we should ideally have -- even if we might not write them
>> all.  It might start with something like:
>>
>>     Scenario 1:
>>       Repo: youngest=r3, an error in r3.
>>       Command: svnadmin verify --keep-going repo
>>     Expected results:
>>       stdout: Nothing
>>       stderr: Success messages for r0, r1, r2.
>>               A 'revision failed' message for r3.
>>               At least one error message relating to r3.
>>       exit code: non-zero.
>
> Scenario 2:
>   Repo: youngest=r3, an error in r2
>   Command: svnadmin verify repo
> Expected results:
>   stdout: nothing
>   stderr: Success for r1
>               Revision failed message for r2
>               Error message relating to r2
>               Repository %s failed to verify message
>   exit code: non-zero
>
>
> Scenario 3:
>   Repo: youngest=r3, an error in r2
>   Command: svnadmin verify repo --keep-going
> Expected results:
>   stdout: nothing
>   stderr: Success for r1
>               Revision failed message for r2
>               Error message relating to r2
>               Success for r3
>               Repository %s failed to verify message
>   exit code: non-zero
>
> Scenario 4:
>   Repo: youngest=r3, an error in r3
>   Command: svnadmin verify repo -r1:2
> Expected results:
>   stdout: nothing
>   stderr: Success for r1, r2
>   exit code: zero
>
>
> Scenario 5:
>   Repo: youngest=r3, an error in r3
>   Command: svnadmin verify repo -r1:2 --keep-going
> Expected results:
>   stdout: nothing
>   stderr: Success for r1, r2
>   exit code: zero

Great.

So, have you tested any or all of these scenarios by hand?

Do you plan to update the test you wrote to cover any of these scenarios?  I'm not saying you have to automate these, but it would be nice, and the one test that you have included still doesn't actually test that the verify keeps going, so that one really does need to be improved.

>> In svn_repos.h: svn_repos_verify_fs3():
>>
>>   /**
>>    * Verify the contents of the file system in @a repos.
>>    *
>>    * If @a feedback_stream is not @c NULL, write feedback to it (lines of
>>    * the form "* Verified revision %ld\n").
>>
>> This mentions what feedback the function gives.  You are adding new
>> feedback, so please update the doc string.  It must be out of date
>> already  because the function doesn't have a 'feedback_stream' parameter,
>> so  please fix that at the same time.
>
> Updated the doc string...

> + * If @a notify_func is not @c NULL, write feedback to it (lines of  the
> + * form "* Verified revision %ld\n". If an error/corruption is  found,
> + * the feedback is of the form "* Error verifying revision %ld\n").

That's not accurate, as we don't write lines of text to the callback function, we pass it an
object that contains an 'action' enumerator and a revision number and other values.

>>    * If @a start_rev is #SVN_INVALID_REVNUM, then start verifying at
>>    * revision 0.  If @a end_rev is #SVN_INVALID_REVNUM, then verify
>>    * through the @c HEAD revision.
>>    *
>>    * For every verified revision call @a notify_func with @a rev set to
>>    * the verified revision and @a warning_text @c NULL. For warnings call
>>    * @a notify_func with @a warning_text set.
>>
>> Is that paragraph still correct or does it need updating?
>
> Updated the paragraph...

You haven't changed that paragraph.  What are these warnings that it refers to?  Thatdoesn't seem right.  You have added another paragraph:

> + * For every revision verification failure call notify_verification_error
> + * with @a rev set to the corrupt revision.

A doc string should be written in terms of the interface to the function.  The interface to this function doesn't know about the existence of the function notify_verification_error().

>> + * If @a keep_going is @c TRUE, the verify process notifies the error
>> + * message and continues. If @a notify_func is @c NULL, the verification
>> + * failure is not notified.
>>
>> Please document what this function returns (as I asked in [2]).
>
> Yeah sure, documented the changes

Thanks.  You added:

> + * [...]  Upon successful completion of verification, return
> + * SVN_NO_ERROR else return the corresponding failure.

Let's change this to: "Finally, return an error if there were any failures during verification, or SVN_NO_ERROR if there were no failures."

>> + * @since New in 1.8.
>> + */
>> +svn_error_t *
>> +svn_repos_verify_fs3(svn_repos_t *repos, ...);
>>
>>
>> In subversion/libsvn_repos/dump.c: svn_repos_verify_fs3():
>>
>> You are doing error detection and handling in two places in this function
>> out of more than two possible places where an error could be thrown.  As I
>> wrote in [3], "Instead of checking for errors in several places within the
>> main loop,  put the verification code in a wrapper function and check for
>> errors exactly once where you call that function.  That way you won't miss
>> any.  In your last patch, there is at least one call that could return a
>> verification error that isn't checked -- the call to
>> svn_fs_revision_root()."
>
> Yeah I get that, sounds a good idea :)
> Now used a new wrapper function verification_checks() to perform all the error
> checks, thus catching any possible error in one shot.

Great.  It needs the 'static' keyword and it needs a doc string.  In a case like this, the doc string could just say something very simple like "Verify revision REV in file system FS." as the rest of the parameters are pretty standard.  "verify_one_revision" might be a better name for the function.  The pool parameter should be renamed to 'scratch_pool'.

Thanks.

- Julian
Received on 2013-01-04 16:03:53 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.