
Re: [RFC] FSFS filesystem options (long, sorry)

From: Malcolm Rowe <malcolm-svn-dev_at_farside.org.uk>
Date: 2007-03-05 10:24:57 CET

On Mon, Mar 05, 2007 at 12:09:01AM -0800, Karl Fogel wrote:
> Malcolm Rowe <malcolm-svn-dev@farside.org.uk> writes:
> > I've been thinking a bit recently about FSFS's scalability and
> > performance, and there are two non-backward-compatible things that I'd
> > like to be able to implement.
> >
> > The first is the ability to split your revs/ and revprops/ directories
> > into separate 'buckets' or 'shards', so that we don't require a
> > repository with a million revs to contain a million files in one
> > directory.
>
> Why would this be non-backward-compatible? I can see why it's
> not forward-compatible, but backward-compatibility should be possible
> here.
>

Perhaps I meant forward-compatible - I always get those muddled. What I
meant is that an older client can't access a sharded repository. New
clients can access older repositories, of course.

> > local disk until it's time to do the final commit.
>
> Nice workaround!
>

Thanks! I think it's got the potential to really speed up commits
against NAS servers.

> > - Accept FSFS filesystem options at 'svnadmin create' time.
> > (perhaps in the cases above we'd name them --fsfs-max-files-per-dir=N
> > and --fsfs-local-txn-dir=/foo).
>
> Are you sure max-files-per-dir has to be configurable? If we're going
> to have the code in there anyway, would it be possible to just pick
> some reasonable values and then make this the new default for FSFS?
> (With the old format still supported, of course.) Are you sure it
> would be so much less efficient than the current storage mechanism
> that we need to retain the option of not sharding?
>

Well, I guess my woolly line of thinking goes like this:

- Most people probably don't need this feature. You'd need quite a lot
  of revisions to make it worthwhile.
- I do need some way to create repositories that older clients can access.
  If sharding is on by default, I could either use a
  --pre-1.5-compatible switch (or --fsfs-not-sharded), or, if it's off
  by default, something like --fsfs-sharded.
- I don't have the information to choose the most efficient size of
  directory. I could choose a reasonable value (4096? 10000?) and
  that'd be okay...
- ... but if I need per-filesystem options for the local-txn-storage
  idea anyway...
- ... then I could punt on making a decision and make the user specify
  the number of files if they want sharded storage...
- ... and if they don't know they want it, they don't get it, so our
  repositories are compatible with 1.4 by default.
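
To make the sharding idea concrete, here is a rough sketch of how a revision number might map to a file path with and without a per-directory limit. The function name, the "revs/<shard>/<rev>" layout, and the integer-division bucketing are all assumptions for illustration, not a settled on-disk format; only the idea of a max-files-per-dir value comes from the proposal above.

```python
import posixpath


def rev_path(repos_root, rev, max_files_per_dir=None):
    """Return a hypothetical path for the file holding revision `rev`.

    With max_files_per_dir=None the layout is flat (one directory holds
    every revision file). Otherwise revisions are grouped into numbered
    shard directories of at most max_files_per_dir files each.
    """
    if rev < 0:
        raise ValueError("revision numbers are non-negative")
    if max_files_per_dir is None:
        # Flat layout: revs/<rev>
        return posixpath.join(repos_root, "revs", str(rev))
    if max_files_per_dir <= 0:
        raise ValueError("max_files_per_dir must be positive")
    # Assumed sharded layout: revs/<shard>/<rev>
    shard = rev // max_files_per_dir
    return posixpath.join(repos_root, "revs", str(shard), str(rev))
```

With a limit of 10000, for example, revision 999999 would land in shard 99, so a million-revision repository would have about a hundred shard directories rather than a million files in one place.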

I guess the things holding me back from making it the default are:
- it's inelegant if you have a decent filesystem.
- we still need to support both methods, it's just a matter of where
  the switch is located.
- I can't really set a policy for the right number of files myself.
- I quite like the idea of making it opt-in rather than opt-out, and
  of reusing the options mechanism to do so.

Regards,
Malcolm

Received on Mon Mar 5 10:25:26 2007

This is an archived mail posted to the Subversion Dev mailing list.