Re: [RFC] FSFS filesystem options (long, sorry)

From: Malcolm Rowe <malcolm-svn-dev_at_farside.org.uk>
Date: 2007-03-05 20:15:11 CET

On Mon, Mar 05, 2007 at 01:57:03PM -0500, Greg Hudson wrote:
> Ben asked me to chime in.
>

Thanks!

> On Mon, 2007-03-05 at 07:08 +0000, Malcolm Rowe wrote:
> > The first is the ability to split your revs/ and revprops/ directories
> > into separate 'buckets' or 'shards', so that we don't require a
> > repository with a million revs to contain a million files in one
> > directory.
>
> I think this is a relatively easy and quick hack, and I've commented in
> the past that I'd like to see it done at least to measure its
> performance impact. So I'd say go for it. I used to have an opinion on
> the exact format of the splitting, but I've long since forgotten what it
> is.

Yes, I'll try to measure the impact, though I doubt it'll show up for
small (< a hundred thousand revisions?) repositories, since they can
probably hold all the names in the equivalent of the dentry cache.
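For illustration only, the bucketing could look something like the
sketch below. The shard size of 1000 and the revs/<shard>/<rev> naming
are assumptions made for the example; the thread has not settled on
either, and this is not the actual FSFS code.

```python
import os

# Hypothetical shard size; the thread does not fix a value.
SHARD_SIZE = 1000

def sharded_rev_path(db_dir, rev, shard_size=SHARD_SIZE):
    """Return <db_dir>/revs/<rev // shard_size>/<rev> for a revision.

    Every directory then holds at most shard_size revision files,
    instead of one directory holding all of them.
    """
    if rev < 0:
        raise ValueError("revision numbers are non-negative")
    shard = rev // shard_size
    return os.path.join(db_dir, "revs", str(shard), str(rev))
```

With these assumed values, a repository with a million revisions ends
up with a thousand directories of a thousand files each.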

> Some of the subsequent discussion centered around whether we should make
> it the default (thus making 1.5 repositories unreadable by 1.4 backend
> code, and possibly breaking some hackish third-party tools which make
> assumptions about the current FSFS layout).

(Well, I'm assuming we'd bump the fs format number, so those tools
should really be checking :-))
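A minimal sketch of the kind of check such a tool could make, assuming
the format number is stored as an integer on the first line of a file.
The file path and the highest supported number are both supplied by the
caller here; neither value comes from this thread.

```python
def check_fs_format(format_file, max_supported):
    """Read the format number and refuse layouts newer than we know.

    format_file and max_supported are hypothetical parameters; a real
    tool would have to know where its repository keeps this number.
    """
    with open(format_file) as f:
        fmt = int(f.readline().strip())
    if fmt > max_supported:
        raise RuntimeError(
            "filesystem format %d is newer than supported format %d"
            % (fmt, max_supported))
    return fmt
```

A tool that fails loudly here, rather than guessing at the layout, keeps
working correctly when a later release changes the on-disk format.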

> I'd suggest not making it
> the default in 1.5 unless it has a marked performance benefit on a
> repository of, say, 10,000 revs on a common filesystem. When 1.6 rolls
> around, making it the default would be less of an issue.

Why do you think it'd be easier to make it the default for 1.6?
(I'm really not sure myself - I can see advantages to both
on-by-default and off-by-default.) Because the hacky tools would have
grown to understand it by then?

> Note that changing from a sharded layout to a non-sharded layout does
> not necessarily require a dump and reload. We could create a tools/
> script to reorganize an existing repository as an offline (but quick)
> operation.

Sure. Offline is the key here, though, even if it is a quick operation.
(We probably _should_ write a script though).
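A rough sketch of what such an offline script might do, for the
flat-to-sharded direction only. The function name, the shard size of
1000 and the ".sharding" staging directory are all made up for the
example. It assumes nothing else is touching the repository while it
runs, and that the directory contains only files named by revision
number.

```python
import os

def shard_directory(flat_dir, shard_size=1000):
    """Move flat_dir/<rev> to flat_dir/<rev // shard_size>/<rev>.

    Files are staged in a sibling directory first, because a shard
    directory such as "0" would collide with the file for revision 0.
    Returns the number of files moved.
    """
    flat_dir = flat_dir.rstrip(os.sep)
    names = os.listdir(flat_dir)
    # Validate everything before moving anything.
    for name in names:
        is_rev = name.isascii() and name.isdigit()
        if not is_rev or not os.path.isfile(os.path.join(flat_dir, name)):
            raise ValueError("unexpected entry: %r" % name)
    staging = flat_dir + ".sharding"  # hypothetical staging name
    os.mkdir(staging)
    for name in names:
        shard = os.path.join(staging, str(int(name) // shard_size))
        os.makedirs(shard, exist_ok=True)
        os.rename(os.path.join(flat_dir, name), os.path.join(shard, name))
    os.rmdir(flat_dir)            # now empty
    os.rename(staging, flat_dir)  # swap the sharded tree into place
    return len(names)
```

Since every step is a rename within one filesystem, this should be quick
even for large repositories, but it is not safe against concurrent
access, which is why it has to be offline.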

> > For those people who are using a NAS to store the repository, FSFS
> > really really really sucks.
>
> I've had decent experience using it in AFS, which I'm guessing has a
> lower penalty for opening and closing files than NFS does (or at least,
> the implementation of NFS you're testing with).

Yes, it seems to be caused by the close-to-open cache consistency that
NFS requires. It's possibly just the Linux implementation, but, well,
that's the common case I'm looking at.

> Your proposed change seems okay, but I'm not sure how successful it
> might be in a multi-vendor environment. The repository tells me to
> marshal transactions in /tmp, which works great on my Unix systems, but
> my Windows systems don't have a C:\tmp directory so it fails.

Agh, good point. People use file:// access over SMB too, and while that
wasn't the use case I was thinking of, there's no reason to make it not
work.

> Perhaps
> the option should just be a boolean "marshal transactions in temporary
> storage", so that Subversion can figure out the most appropriate
> temporary directory, worry about not stomping on other users on the same
> machine, etc. Just an idea.

Yes, that sounds sensible. I suggested to lundblad the possibility of
putting it into <tmp>/svn.<uuid>/<txn>. I guess we'd have to make sure
we were careful about permissions, but that should work.
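As a sketch of the <tmp>/svn.<uuid>/<txn> idea, assuming the platform's
own notion of a temporary directory is acceptable and that owner-only
permissions are what we want (the second assumption is exactly the part
that needs care when several users share a repository). The function
name is hypothetical.

```python
import os
import tempfile

def txn_scratch_dir(repos_uuid, txn_id, tmp_root=None):
    """Create and return <tmp>/svn.<uuid>/<txn>.

    tmp_root defaults to the platform temporary directory, so no
    path such as /tmp is hard-coded. Each level is created separately
    because the mode given to os.makedirs only applies to the leaf
    directory on current Python versions.
    """
    root = tmp_root or tempfile.gettempdir()
    repo_dir = os.path.join(root, "svn." + repos_uuid)
    txn_dir = os.path.join(repo_dir, txn_id)
    # 0o700 requests owner-only access; it is further limited by the
    # umask and largely ignored on Windows.
    os.makedirs(repo_dir, mode=0o700, exist_ok=True)
    os.makedirs(txn_dir, mode=0o700, exist_ok=True)
    return txn_dir
```

Keying the directory on the repository UUID keeps two repositories on
the same machine from stomping on each other's transactions.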

Regards,
Malcolm

Received on Mon Mar 5 20:15:28 2007

This is an archived mail posted to the Subversion Dev mailing list.