
Re: [PATCH] Implement sharding for FSFS

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2007-04-06 11:49:33 CEST

On Fri, 2007-04-06 at 10:49 +0200, Norbert Unterberg wrote:
> Problem:
> FSFS can become slow because it creates too many entries in a single directory.

Actually, I haven't seen any results that measure a speedup in FSFS
performance from any kind of sharding, so this isn't necessarily a good
problem statement.

However, listing the directory (with anything from Windows Explorer to
"ls") or backing it up can become slow in some cases, and some
particularly unfortunate filesystems will simply refuse to store more
than some number of files in a directory.

> Constraint to fix this:
> Do not create more than 1000 entries per directory.

I don't think anyone stated that as a constraint. The change is to
create one top-level entry per 1000 revs (1/1000 as many as before),
and no more than 1000 entries in any subdir.
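
As a rough illustration of the layout described above, here is a minimal
sketch of how a revision number could map to a sharded path. The function
name shard_path, the db/revs/<shard>/<rev> naming, and the use of integer
division to pick the shard are assumptions made for this example; they are
not taken from the patch itself.

```python
SHARD_SIZE = 1000  # assumed number of revisions per subdirectory


def shard_path(rev, root="db/revs", shard_size=SHARD_SIZE):
    """Return a hypothetical sharded path for revision `rev`.

    Revisions 0..999 land in shard 0, 1000..1999 in shard 1, and so on,
    so no subdirectory ever holds more than `shard_size` files.
    """
    if rev < 0:
        raise ValueError("revision numbers are non-negative")
    shard = rev // shard_size  # one top-level entry per shard_size revs
    return "/".join([root, str(shard), str(rev)])


if __name__ == "__main__":
    for rev in (0, 999, 1000, 1234567):
        print(rev, "->", shard_path(rev))
```

With this scheme the top level grows 1000 times more slowly than the
revision count, while each subdir is capped at 1000 files.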

> If I understand your solution, you violate the constraint as soon as
> the repository reaches the revision 1,000,000 because it would create
> the 1001st entry in db/revs/.

The idea is that if you have more than a million revs, you're hopefully
running on a good filesystem and using good tools; even if you aren't,
whatever problem you're seeing is only 1/1000 as bad as it used to be.
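
The arithmetic behind both points can be checked with a short sketch. It
assumes single-level sharding with 1000 revisions per subdir and revisions
numbered from 0; the helper name entry_counts is made up for this example.

```python
def entry_counts(num_revs, shard_size=1000):
    """Return (top_level_entries, largest_subdir_entries).

    Assumes revisions 0..num_revs-1 exist and that revision r is stored
    in subdirectory r // shard_size (an assumption for this sketch).
    """
    if num_revs <= 0:
        return (0, 0)
    top_level = -(-num_revs // shard_size)  # ceiling division
    largest_subdir = min(num_revs, shard_size)
    return (top_level, largest_subdir)


if __name__ == "__main__":
    # Revisions 0..999,999: exactly 1000 top-level entries.
    print(entry_counts(1_000_000))
    # Adding revision 1,000,000 creates the 1001st top-level entry,
    # compared with 1,000,001 entries in an unsharded directory.
    print(entry_counts(1_000_001))
```

So the top level does pass 1000 entries at revision 1,000,000, but it holds
roughly 1/1000 as many entries as a flat layout would at the same point.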

Given how long it took for sharding to become enough of an itch to
scratch in the first place, I don't think it's worth the added
complexity to implement multi-level sharding.

Received on Fri Apr 6 11:49:05 2007

This is an archived mail posted to the Subversion Dev mailing list.
