Re: Sharded FSFS repositories - summary

From: John Szakmeister <john_at_szakmeister.net>
Date: 2007-03-15 11:34:53 CET

John Peacock wrote:
> Malcolm Rowe wrote:
>> - We'll create shards of 4000 entries each. That's large enough that
>> someone would have to hit 16M revisions before a larger value would be
>> an improvement, but small enough that it has reasonable performance
>> (and support) on all filesystems that I'm aware of. It's also a
>> power-of-ten, so easier for humans to understand.
>
> I have to say that I find "revs/N/12345 where N = 12345/constant" to be most
> human unfriendly, where N isn't an actual power of 10. I can't divide large
> numbers by 4000 in my head, but I could if it were 1000. I'm also concerned
> about the performance characteristics of NTFS (in particular) which seems to
> degrade much more quickly (to the point where I find it hard to even get a
> directory of the parent folder when a child folder has thousands of entries).

I actually measured this for NTFS (under VMware) using a Python script. I
posted my results earlier in this thread:
   http://svn.haxx.se/dev/archive-2007-03/0134.shtml
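
For readers without the linked message to hand, a measurement of this kind could look roughly like the sketch below. It is not the script from that post: the function name `time_directory`, the entry counts, and the choice to time creation and a list-plus-stat pass are all assumptions made for illustration.

```python
import os
import tempfile
import time


def time_directory(entries):
    """Create `entries` empty files in a fresh temporary directory,
    then time listing the directory and stat-ing every entry.

    Hypothetical helper; returns (entry_count, create_secs, scan_secs).
    """
    with tempfile.TemporaryDirectory() as root:
        start = time.perf_counter()
        for rev in range(entries):
            # Empty files named like revision numbers: 0, 1, 2, ...
            with open(os.path.join(root, str(rev)), "w"):
                pass
        create_secs = time.perf_counter() - start

        start = time.perf_counter()
        names = os.listdir(root)
        for name in names:
            os.stat(os.path.join(root, name))
        scan_secs = time.perf_counter() - start
    return len(names), create_secs, scan_secs


if __name__ == "__main__":
    # Entry counts are arbitrary; raise them to see where a file system degrades.
    for entries in (1000, 4000, 16000):
        count, create_secs, scan_secs = time_directory(entries)
        print(f"{count:6d} entries: create {create_secs:.3f}s, "
              f"list+stat {scan_secs:.3f}s")
```

Timings from a run like this depend heavily on caching and on the virtualisation layer, so they are only useful for comparing file systems under the same conditions.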

FWIW, NTFS scales very badly. Reiser and HFS+ required several orders
of magnitude more entries to reach a similar wait time. So, while I
hate to say it, NTFS and FAT should probably be the file systems that
guide most of this decision.
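
To make the "revs/N/12345 where N = 12345/constant" arithmetic quoted above concrete, here is a minimal sketch of the mapping. The function names `shard_path` and `flat_limit` are made up for illustration, and integer (floor) division is assumed; this is not code from Subversion itself.

```python
import posixpath


def shard_path(revision, shard_size):
    """Return the relative path 'revs/N/<revision>' with N = revision // shard_size.

    Hypothetical helper illustrating the layout discussed in this thread.
    """
    if revision < 0 or shard_size <= 0:
        raise ValueError("revision must be >= 0 and shard_size must be > 0")
    shard = revision // shard_size
    return posixpath.join("revs", str(shard), str(revision))


def flat_limit(shard_size):
    """Revisions that fit before the top-level 'revs' directory itself
    holds more than shard_size shard directories."""
    return shard_size * shard_size


print(shard_path(12345, 4000))  # revs/3/12345
print(shard_path(12345, 1000))  # revs/12/12345
print(flat_limit(4000))         # 16000000
```

With a shard size of 1000 the shard number is simply the revision with its last three digits dropped, which is the readability point raised in the quoted text. The 16,000,000 figure corresponds to the "16M revisions" mentioned for 4000-entry shards.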

-John

PS The ThreadFind link wasn't able to find any of the message IDs I put
in... is it actually updating?

Received on Thu Mar 15 11:35:13 2007

This is an archived mail posted to the Subversion Dev mailing list.
