RE: FSFS rep-cache validation

From: Bert Huijben <bert_at_qqmail.nl>
Date: Thu, 23 Jan 2014 12:17:05 +0100

> -----Original Message-----
> From: Philip Martin [mailto:philip.martin_at_wandisco.com]
> Sent: Thursday, 23 January 2014 11:55
> To: Julian Foad
> Cc: Philip Martin; dev_at_subversion.apache.org
> Subject: Re: FSFS rep-cache validation
>
> Julian Foad <julianfoad_at_btopenworld.com> writes:
>
> > I get the problem. By "store its own 'head' revision" I meant store
> > the maximum value of any referenced revision number -- in other words,
> > simply a substitute for having an index and querying the maximum value
> > in the index. If we update this value correctly then it would serve
> > the same purpose as an index (but maybe faster or maybe not). Like
> > this:
> >
> > Before adding the rep cache entries for a (recently) committed revision rX:
> >
> >   if (max_referenced_rev >= X):
> >     raise error
> >     # caller should escalate the error or clean up the rep-cache
> >   max_referenced_rev = X
> >
> > Before/during looking up a rep cache entry, when repository head is rY:
> >
> >   if (max_referenced_rev >= Y):
> >     raise error
> >     # caller should escalate the error or clean up the rep-cache
> >
> > In a roll-back to revision Z:
> >
> >   delete where rep_cache.revision > Z
> >   max_referenced_rev = Z
> >
> > I suppose the risk involved in users failing to do the roll-back
> > correctly (in this case, failing to update max_referenced_rev) would
> > still be present which perhaps makes the index approach superior.
>
> That might work but how do we stop old code updating the cache and
> failing to update max_referenced_rev? We don't have any version number
> in the rep-cache schema so that is going to be ugly. Bump the FS
> format? Rename the rep_cache table?
>
> With the index we could have the new code create any missing index when
> opening the rep-cache, there would be a one-time delay the first time
> new code was used on a big cache. Old code would update the index but
> not do the validation check. The validation check done by the new code
> would include any rows added by old code.

-0.5 on adding an SQLite index for something that shouldn't be supported in the first place.
Adding an index just puts the performance penalty on all the users who never use it and should never use it. I can't get worked up about a very slow commit (due to a table scan) for those users who hand-edit their repository. Keeping the index up to date on every insert shouldn't be really expensive, as revision numbers only ever grow, but it still takes more disk space.
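
To make the trade-off concrete, here is a minimal sketch using Python's sqlite3 module. The table is a simplified stand-in for rep_cache (the two-column layout and the index name i_revision are assumptions for illustration, not the real schema). It only shows how the plan for the "maximum referenced revision" query changes once an index exists.

```python
import sqlite3

def max_rev_plan(conn):
    """Return (max revision, query-plan text) for the validation query."""
    query = "SELECT MAX(revision) FROM rep_cache"
    max_rev = conn.execute(query).fetchone()[0]
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plan = " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
    return max_rev, plan

# Simplified stand-in for the rep-cache table (assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rep_cache (hash TEXT NOT NULL PRIMARY KEY, revision INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO rep_cache VALUES (?, ?)",
    [("h%d" % r, r) for r in range(1, 1001)],
)

# Without an index the query has to walk the whole table.
rev_before, plan_before = max_rev_plan(conn)

# With an index (hypothetical name) the maximum can be read from the index.
conn.execute("CREATE INDEX i_revision ON rep_cache (revision)")
rev_after, plan_after = max_rev_plan(conn)

print(plan_before)
print(plan_after)
```

The answer is the same either way; only the cost moves, from the occasional validation query to every insert plus the extra pages on disk.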

I would say these users should just delete the SQLite database (or update the SQLite file manually) whenever they change the repository structure in unsupported ways.
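
As a rough sketch of the manual update: assuming the cache has a revision column, as in the roll-back pseudocode quoted above, something like the following would drop every entry that points past the new head. The helper name prune_rep_cache and the in-memory table are hypothetical stand-ins. On a real repository this would be run against the rep-cache SQLite file while nothing else is writing to the repository.

```python
import sqlite3

def prune_rep_cache(conn, new_head):
    """Delete cache rows that reference revisions newer than new_head.

    Hypothetical helper; returns the number of rows removed.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM rep_cache WHERE revision > ?", (new_head,)
        )
    return cur.rowcount

# Stand-in table with assumed, simplified columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rep_cache (hash TEXT NOT NULL PRIMARY KEY, revision INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO rep_cache VALUES (?, ?)",
    [("h%d" % r, r) for r in range(1, 11)],
)

# Roll back to r7: rows for r8..r10 must go.
removed = prune_rep_cache(conn, 7)
remaining_max = conn.execute("SELECT MAX(revision) FROM rep_cache").fetchone()[0]
print(removed, remaining_max)
```

Deleting the whole file is the blunter option and needs no knowledge of the table layout at all.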

So +1 on the suggestion of just removing the dead code.

Bert
>
> --
> Philip Martin | Subversion Committer
> WANdisco // *Non-Stop Data*
Received on 2014-01-23 12:17:52 CET
