
Re: Sharded FSFS repositories - summary

From: Michael Sinz <Michael.Sinz_at_sinz.org>
Date: 2007-03-15 21:49:50 CET

Mark Phippard wrote:
> On 3/15/07, Justin Erenkrantz <justin@erenkrantz.com> wrote:
>
> On 3/15/07, John Peacock <jpeacock@rowman.com> wrote:
> > Of course, knowing this, as a rule I never open large directories in
> > Explorer, but use a Command Prompt instead. It's still painfully slow
> > to get a directory here, because DIR insists on sorting the files
> > (rather than returning them in filesystem order).
>
> And, doing an ls in a directory 500k+ files in it even on Unix is no
> fun either. I think we're sort of straying from the point here - for
> those high-volume repositories (like Apache, etc.), sharding is a way
> to reduce inode exhaustion in directories - not eliminate the issues.
> 5k (just to keep it power of 10) seems like a good cut-off. 1k is far
> too small as apache.org is going to zoom by 1 million revs soon
> enough.
>
> So, in other words, I couldn't care less about what the folders look
> like on Win32 - to focus on that exclusively is to be beside the point
> - *most* serious large-scale repositories probably aren't going to be
> on Win32. They can, but then those admins aren't likely to be foolish
> enough to browse the directories with Explorer on a regular basis - I
> claim that we should assert that whomever is admining that large of a
> repository probably has a modicum of clue to understand what's going
> on here.
>
>
> Would there be any real downside to my suggestion of using 2 levels?
> Have a top level folder every 10,000 revisions and inside those folders
> break it up on every 1,000. This makes it easy to find revisions, and
> breaks things up enough to handle large repositories well and also be
> browsable.

Why only every 10,000 (that is, only 10 sub-folders per top-level folder)?
I would say do two levels with a 1,000/1,000,000 split (or a 100/10,000
split, which gets you 1 million revs with no directory over 100 actual
entries).
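
To make the arithmetic behind these splits concrete, here is a rough Python
sketch. It is only an illustration, not how FSFS actually names or lays out
its directories: the function names, the parameters leaf_size and top_size,
and the "top/leaf/rev" path shape are all assumptions made up for this
example.

```python
import math


def shard_path(rev, leaf_size=100, top_size=10_000):
    """Map a revision number to a hypothetical two-level path.

    leaf_size: revisions per inner folder (assumed parameter name).
    top_size:  revisions per top-level folder (assumed parameter name).
    """
    if rev < 0:
        raise ValueError("revision must be non-negative")
    if top_size % leaf_size != 0:
        raise ValueError("top_size must be a multiple of leaf_size")
    top = rev // top_size                  # index of the top-level folder
    leaf = (rev % top_size) // leaf_size   # index of the folder inside it
    return f"{top}/{leaf}/{rev}"


def layout_sizes(total_revs, leaf_size, top_size):
    """Return (top-level folders, inner folders per top-level folder,
    revision files per inner folder) once total_revs revisions exist."""
    top_dirs = math.ceil(total_revs / top_size)
    return top_dirs, top_size // leaf_size, leaf_size


# The 100/10,000 split at 1 million revisions.
print(layout_sizes(1_000_000, 100, 10_000))
# The 1,000/10,000 split at 1 million revisions.
print(layout_sizes(1_000_000, 1_000, 10_000))
print(shard_path(123456))
```

With the 100/10,000 split, every level holds 100 entries at 1 million
revs. With the 1,000/10,000 split, there are 100 top-level folders of 10
sub-folders each, and each sub-folder holds 1,000 revision files.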

-- 
Michael Sinz                     Technology and Engineering Director/Consultant
"Starting Startups"                                mailto:michael.sinz@sinz.org
My place on the web                            http://www.sinz.org/Michael.Sinz
Received on Thu Mar 15 21:50:32 2007

This is an archived mail posted to the Subversion Dev mailing list.
