On 3/15/07, Justin Erenkrantz <justin@erenkrantz.com> wrote:
>
> On 3/15/07, John Peacock <jpeacock@rowman.com> wrote:
> > Of course, knowing this, as a rule I never open large directories in
> > Explorer, but use a Command Prompt instead. It's still painfully slow
> > to get a directory here, because DIR insists on sorting the files
> > (rather than returning them in filesystem order).
>
> And, doing an ls in a directory with 500k+ files in it even on Unix is no
> fun either. I think we're sort of straying from the point here - for
> those high-volume repositories (like Apache, etc.), sharding is a way
> to reduce inode exhaustion in directories - not eliminate the issues.
> 5k (just to keep it power of 10) seems like a good cut-off. 1k is far
> too small as apache.org is going to zoom by 1 million revs soon
> enough.
>
> So, in other words, I couldn't care less about what the folders look
> like on Win32 - to focus on that exclusively is to be beside the point
> - *most* serious large-scale repositories probably aren't going to be
> on Win32. They can, but then those admins aren't likely to be foolish
> enough to browse the directories with Explorer on a regular basis - I
> claim that we should assert that whoever is administering a repository
> that large probably has a modicum of clue to understand what's going
> on here.
Would there be any real downside to my suggestion of using 2 levels? Have a
top-level folder for every 10,000 revisions, and inside each of those folders
a subfolder for every 1,000. This makes it easy to find revisions, and breaks
things up enough to handle large repositories well while staying browsable.
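
To make the two-level idea concrete, here is a rough sketch in Python of how a revision number could map to a path. The function name and the folder naming (each folder named after the first revision it covers) are only my assumptions for illustration; nothing here reflects how the repository code actually lays things out.

```python
def sharded_path(rev, top_size=10_000, sub_size=1_000):
    """Return a hypothetical 'top/sub/rev' path for a two-level layout.

    Assumption: each folder is named after the first revision it covers.
    """
    if rev < 0:
        raise ValueError("revision numbers are non-negative")
    top = rev // top_size * top_size  # start of the 10,000-revision block
    sub = rev // sub_size * sub_size  # start of the 1,000-revision block
    return f"{top}/{sub}/{rev}"


print(sharded_path(123456))  # 120000/123000/123456
```

With these sizes each top-level folder holds at most 10 subfolders and each subfolder at most 1,000 revisions, so a repository with 1 million revisions would have 100 top-level folders.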
--
Thanks
Mark Phippard
http://markphip.blogspot.com/
Received on Thu Mar 15 21:35:33 2007