What are the potential performance & limit considerations for high number of files in repository?
Date: Wed, 29 Apr 2009 16:33:28 -0700 (PDT)
I would appreciate any advice regarding the following problem.
I'm considering using subversion as a data repository.
The usage patterns on the data are very similar to source code development. The data is textual in nature. Currently, there are about one million very small files (typically < 1KB each), and the count is growing at a relatively slow rate (roughly a million per year).
The usage flow will include branching and merging of sub-directories as well as of the entire repository. There are about 20 users in total, all on the same LAN.
The nature of the data is such that each user will typically sync up or check out a folder of about 500 files, work on it for 1-3 days, and check it back in.
It is all in a Windows-based environment (Win2K, Apache, svn 1.4.x).
1) What are the limitations on number of files in the repository (assuming I have sufficient hard-disk space of course & within NTFS limits)?
2) Are there any known potential performance bottlenecks/issues in such data repository organization (i.e. where are the potential slowdowns or performance concerns)?
3) My understanding from previous threads is that in terms of total size I'm well within the limits of the system (1-2 GB of data), so this is not a concern. Please correct me if I'm wrong.
4) Generally, is this a valid use of Subversion (in terms of number of files and size, assuming a development-like usage pattern), and has anyone had experience with such repositories? In other words, is this a totally trivial and simple repository layout for Subversion that's done everywhere...?
I'd appreciate the discussion or any advice greatly.
(N.B. I will have to follow the thread on the web, since I am not subscribed to the mailing list.)
This is an archived mail posted to the Subversion Users mailing list.