Re: Sharded FSFS repositories - summary

From: Mark Phippard <markphip_at_gmail.com>
Date: 2007-03-15 21:35:21 CET

On 3/15/07, Justin Erenkrantz <justin@erenkrantz.com> wrote:
>
> On 3/15/07, John Peacock <jpeacock@rowman.com> wrote:
> > Of course, knowing this, as a rule I never open large directories in
> > Explorer, but use a Command Prompt instead. It's still painfully slow
> > to get a directory here, because DIR insists on sorting the files
> > (rather than returning them in filesystem order).
>
> And, doing an ls in a directory with 500k+ files in it even on Unix is no
> fun either. I think we're sort of straying from the point here - for
> those high-volume repositories (like Apache, etc.), sharding is a way
> to reduce inode exhaustion in directories - not eliminate the issues.
> 5k (just to keep it power of 10) seems like a good cut-off. 1k is far
> too small as apache.org is going to zoom by 1 million revs soon
> enough.
>
> So, in other words, I couldn't care less about what the folders look
> like on Win32 - to focus on that exclusively is to be beside the point
> - *most* serious large-scale repositories probably aren't going to be
> on Win32. They can, but then those admins aren't likely to be foolish
> enough to browse the directories with Explorer on a regular basis - I
> claim that we should assert that whoever is admining that large of a
> repository probably has a modicum of clue to understand what's going
> on here.

Would there be any real downside to my suggestion of using 2 levels? Have a
top-level folder for every 10,000 revisions, and inside each of those folders
break it up into subfolders of 1,000 revisions. This makes it easy to find
revisions, and breaks things up enough to handle large repositories well
while still being browsable.
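
To make the two-level idea concrete, here is a minimal sketch in Python of how
a revision number could map to a path. The function name, the "revs" root, and
the choice to name each folder after the first revision it would hold are all
assumptions for illustration; this is not how Subversion lays out FSFS today.

```python
import posixpath

TOP_SIZE = 10_000  # revisions per top-level folder (from the proposal)
SUB_SIZE = 1_000   # revisions per second-level folder (from the proposal)


def sharded_rev_path(rev, root="revs"):
    """Return a hypothetical two-level path for revision `rev`.

    Folder names are the first revision number the folder would contain;
    that naming is an assumption, not part of the proposal.
    """
    if rev < 0:
        raise ValueError("revision numbers are non-negative")
    top = (rev // TOP_SIZE) * TOP_SIZE  # e.g. 12345 -> 10000
    sub = (rev // SUB_SIZE) * SUB_SIZE  # e.g. 12345 -> 12000
    return posixpath.join(root, str(top), str(sub), str(rev))


if __name__ == "__main__":
    for r in (0, 999, 12345, 1_000_000):
        print(r, "->", sharded_rev_path(r))
```

Under these assumptions, no folder holds more than 1,000 revision files, each
top-level folder holds 10 subfolders, and a repository at 1 million revisions
would have about 100 top-level folders.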

-- 
Thanks
Mark Phippard
http://markphip.blogspot.com/
Received on Thu Mar 15 21:35:33 2007

This is an archived mail posted to the Subversion Dev mailing list.
