
Re: [PATCH] Issue #1074: Need a svnadmin command to verify that the repository is not corrupted (Take 2)

From: John Szakmeister <john_at_szakmeister.net>
Date: 2003-08-05 01:35:45 CEST

On Monday 04 August 2003 10:40, cmpilato@collab.net wrote:
> John Szakmeister <john@szakmeister.net> writes:
> > Any recommendations? I can't say that I'm very familiar with Berkeley's
> > methodology, so I'm at a loss for any sort of suggestion, other than to
> > say that 'svnadmin dump' seems to work okay. :-) What does that code
> > path do differently than this technique (it calls svn_repos_dir_delta()
> > to iterate over the contents)? I've never seen an instance where it
> > fails.
>
> Actually, these days 'svnadmin dump' uses svn_repos_replay() (the new
> and improved fork of svn_repos_dir_delta()). The reason we don't have lock
> issues with this, as with most other filesystem functionalities, is
> because we use the Berkeley locks (which, for the purposes of this
> discussion, can be mapped to Subversion's trails) for a short time
> each -- we get into the database, find some smallish piece of data (like
> a directory entries list, the change records for a specific revision),
> and get outta there. So what happens is that we are calling
> begin_trail() and commit_trail() a crazy number of times, but each trail
> is used for a relatively small task.
>
> Your patch, however, creates a trail at the start of processing, and
> we keep it open while we read every representation and string in the
> database -- an arbitrarily large dataset. (And I'm not very clear on
> this, but I think Berkeley locks things per-page).
>
> The downside is that the solution is nasty. It requires that you:

Given the above comments about the locking, I'd have to agree... it is a nasty
solution.

> 1. Lose the representations walker. Instead you are now doing all
> your work up in the main svn_fs_verify() command.
> 2. Use a trail to get the 'next-key' value from the repo.

How do I get the 'next-key' value from the repo then? I mean, right now I'm
building a cursor into the repo, and closing it when I'm done iterating. Or,
are you saying I should create a function to get called by retry_txn() that
will take a current key and return the next-key? Perhaps a NULL value for
the current key will mean to grab the first key and return it?

> 3. Call svn_fs__prev_key()* on that value (this will be "the current
> key").
> 4. Try to fetch the representation at the current key. If you
> fail, no sweat.
> 5. But if you succeed, read the rep data (which checks the checksums).
> 6. If "the current key" != "0", goto step 3 and repeat, otherwise,
> you're done.
>
> So, at most, the only time you spend in a given trail is the time
> taken to fetch a single representation, or the time taken to read a
> chunk of data from it.
>
> Questions? Comments? Are you ready to throw up your hands and quit
> yet? (please say no -- you've been a terrific sport through this)
>
> :-)

Nah, I like SVN... I'm here to stay. :-)
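
To make sure I'm reading steps 2 through 6 correctly, here is a minimal, self-contained sketch of that loop. Everything in it is a stand-in and an assumption on my part: prev_key() is a hypothetical helper (not the real svn_fs__prev_key() from the branch) that assumes keys are base-36 strings with no leading zeros, fetch_rep() searches an in-memory table instead of opening a trail, and toy_checksum() is a byte sum rather than the real checksum.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a row of the representations table. */
struct rep { const char *key; const char *data; unsigned checksum; };

/* The row at key "10" is deliberately corrupt: "abd" sums to 295. */
static const struct rep reps[] = {
    { "1", "ab", 195 }, { "3", "abc", 294 },
    { "z", "b", 98 },   { "10", "abd", 294 },
};

/* Byte sum, standing in for the real checksum (assumption). */
static unsigned toy_checksum(const char *s)
{
    unsigned sum = 0;
    while (*s)
        sum += (unsigned char)*s++;
    return sum;
}

/* Hypothetical helper: decrement a base-36 key ("0"-"9", "a"-"z")
   in place.  Returns 0 without changing the key if it is "0". */
static int prev_key(char *key)
{
    size_t len = strlen(key);
    size_t i = len;

    if (strcmp(key, "0") == 0)
        return 0;
    while (i-- > 0) {
        if (key[i] == '0') {
            key[i] = 'z';                  /* borrow from the next digit */
        } else {
            key[i] = (key[i] == 'a') ? '9' : (char)(key[i] - 1);
            break;
        }
    }
    if (len > 1 && key[0] == '0')          /* "0z" becomes "z" */
        memmove(key, key + 1, len);        /* copies the terminator too */
    return 1;
}

/* In the real filesystem each call would be its own short trail. */
static const struct rep *fetch_rep(const char *key)
{
    size_t i;
    for (i = 0; i < sizeof reps / sizeof reps[0]; i++)
        if (strcmp(reps[i].key, key) == 0)
            return &reps[i];
    return NULL;
}

/* Walk from next_key down to "0".  Returns the number of reps whose
   checksum does not match, or -1 if next_key does not fit the buffer. */
static int verify_reps(const char *next_key, int *checked)
{
    char key[64];
    int bad = 0;

    *checked = 0;
    if (strlen(next_key) >= sizeof key)
        return -1;
    snprintf(key, sizeof key, "%s", next_key);  /* step 2: 'next-key' */

    while (prev_key(key)) {                     /* step 3, and step 6 */
        const struct rep *r = fetch_rep(key);   /* step 4 */
        if (r == NULL)
            continue;                           /* no such rep: no sweat */
        ++*checked;
        if (toy_checksum(r->data) != r->checksum)
            ++bad;                              /* step 5 */
    }
    return bad;
}
```

Calling verify_reps("12", &checked) on the toy table walks the keys from "11" down to "0", skips the keys with no representation, and reports the one mismatched row. The point of the shape is that no lock is held across iterations; only fetch_rep() and the data read would touch the database.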

Sounds easy enough. :-) Although, while we're talking about some of the pros
and cons of one method versus another, I have a slight issue with
something. For me, I'd like to know that when I type 'svnadmin verify' that
everything checks out okay. If we only walk the reps table and check
checksums, all we are doing is verifying that the data for a file is correct.
Technically, there could be a problem in the node representation and you
won't have a clue until you do an 'svnadmin dump' or try to access that
particular node revision.

> *svn_fs__prev_key() doesn't really exist, at least not in the trunk
> code. I wrote it (thinking I needed it) on the fs-schema-changes
> branch. You can snarf it from:
>
>
> http://svn.collab.net/repos/svn/branches/fs-schema-changes/subversion/libsvn_fs/key-gen.c
> http://svn.collab.net/repos/svn/branches/fs-schema-changes/subversion/libsvn_fs/key-gen.h

I'll take a look at this soon.

-John
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org

Received on Tue Aug 5 01:35:06 2003

This is an archived mail posted to the Subversion Dev mailing list.