Re: What are the potential performance & limit considerations for high number of files in repository?

From: Mark Phippard <markphip_at_gmail.com>
Date: Thu, 30 Apr 2009 09:43:02 -0400

On Wed, Apr 29, 2009 at 7:33 PM, <webpost_at_tigris.org> wrote:

> I would appreciate any advice regarding the following problem.
>
>
> Context:
> --------
>
> I'm considering using subversion as a data repository.
>
> The usage patterns on the data are very similar to source code development. The data is textual in nature. Currently, there are about one million very small files (typically < 1KB each), and the number is growing at a relatively slow rate (say, maybe about a million per year).
>
> The usage flow will include branching and merging of sub-directories as well as the entire repository. There are about 20 users in total and they are all within the same LAN.
>
> The nature of the data is such that each user will typically sync up or check out a folder with 500 files, work on it for 1-3 days and check it back in.
>
> It is all on a Windows based environment (Win2K, Apache, svn 1.4.x)
>
>
> Questions:
> ----------
>
> 1) What are the limitations on number of files in the repository (assuming I have sufficient hard-disk space of course & within NTFS limits)?
>
> 2) Are there any known potential performance bottlenecks/issues in such data repository organization (i.e. where are the potential slowdowns or performance concerns)?
>
> 3) My understanding from previous threads is that in terms of total size I'm well within the limits of the system (1-2 GB of data), so this is not a concern. Please correct me if I'm wrong.
>
> 4) Generally, is this a valid usage for subversion (in terms of number of files & size, assuming a development-like usage pattern), and has anyone had experience with such repositories? In other words - is it a totally trivial & simple repository layout for subversion that's done everywhere...?
>
>
> I'd greatly appreciate the discussion or any advice.
>
> Eyal

I do not think you have to worry about anything based on this pattern.
Just think about how many files are likely in the entire Apache repository.

The one issue that has been observed is that you can start running
into problems when you have a lot of files in a single folder. The
more you "shard" your files into sub-directories, the less likely you
are to have problems. Once you start putting 10K+ files in one folder,
negative consequences start to show up.
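
To illustrate what "sharding" could look like, here is a minimal
Python sketch. It is only one possible scheme, not something
Subversion requires or provides: the function name sharded_path and
the levels/width parameters are made up for this example, and it
assumes the files have no natural grouping of their own. If they do
(for example, the 500-file working folders mentioned above), using
that grouping as the directory layout is probably the better choice.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a file name to a nested sub-directory derived from its hash.

    Hypothetical helper: with the defaults, files are spread over
    16**(levels * width) = 65,536 leaf folders, so a few million files
    stay far below the ~10K-files-per-folder range.
    """
    # The hash is used only to spread names evenly, not for security.
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Take consecutive slices of the hex digest as directory names.
    shards = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*shards, filename)


if __name__ == "__main__":
    # The same name always maps to the same folder.
    print(sharded_path("record-000123.txt"))
```

The path is computed from the file name alone, so any client can
locate a file without a lookup table. The trade-off is that related
files no longer sit next to each other in the tree.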

-- 
Thanks
Mark Phippard
http://markphip.blogspot.com/
------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1995271
Received on 2009-04-30 15:44:18 CEST

This is an archived mail posted to the Subversion Users mailing list.
