
Re: Is my implementation too large for SVN?

From: Dave_Thomas mailing lists <davelist_at_peoplemerge.com>
Date: 2006-10-09 22:04:47 CEST

Hi, I'm a developer working with John on this issue.

It turns out that the root of our problem is the 23,000 files in 6,300
directories in the repository. That's why updates take 20-60 minutes.

The smallest subset we can update right now is 4,300 files, because that
subtree contains an XML database that uses schema validation. Updated files
can point to new files not yet in the working copy, which is why we need to
pull the whole subtree.
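
For reference, the subtree update looks roughly like this (the path names
are made up, but the shape is accurate):

    # Update only the XML database subtree, not the whole working copy
    cd ~/work/trunk       # hypothetical working copy root
    svn update xml-db     # hypothetical subtree holding the XML database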

The bad news for us is that those 4,300 files will grow to at least 16,000
in the next 3-5 months.

Currently an update of this subset takes about 1 1/2 minutes (as long as no
files have changed).
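
Timed with something like the following, on an otherwise idle machine
(same hypothetical subtree as above):

    time svn update xml-db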

It appears the biggest bottleneck is disk seek time and the number of hard
drives. Does anyone have an alternative to buying six extra hard disks for
every user, or an expensive NAS?
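
What points us at seek time: watching the disks during an update with
something like the following (this is Linux's iostat from sysstat; adjust
for your OS):

    # Run in a second terminal while 'svn update' is in progress;
    # sustained high %util with low read throughput suggests the
    # disk is seek-bound rather than bandwidth- or CPU-bound.
    iostat -x 2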

Or could moving to Subversion 1.4 speed things up?
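
Also, regarding the nested revs layout Ruslan sketches in the quoted
message below: as a purely hypothetical illustration, capping each
directory at 1,000 rev files would map revision N to directory N/1000:

    # Hypothetical sharded layout: revs/0/0..999, revs/1/1000..1999, ...
    rev=123456
    shard=$((rev / 1000))
    echo "revs/$shard/$rev"    # prints: revs/123/123456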

Thanks,
Dave

On 10/7/06, Ryan Schmidt <subversion-2006d@ryandesign.com> wrote:
>
> On Oct 7, 2006, at 21:52, Troy Curtis Jr wrote:
>
> > On 10/6/06, Ryan Schmidt wrote:
> >
> >> On Oct 6, 2006, at 14:34, Ruslan Sivak wrote:
> >>
> >> > Also, Windows gets very slow when you have a lot of entries
> >> > in a directory, and I think you are hitting that problem. Perhaps in
> >> > the future SVN could be designed to keep a nested directory
> >> > structure, keeping no more than X revisions per folder. Basically
> >> > something like this:
> >> >
> >> > revs/
> >> >   1/
> >> >     1
> >> >     2
> >> >     3
> >> >     ...
> >> >     99
> >> >     100
> >> >   2/
> >> >     101
> >> >     102
> >> > etc. I'm sure there's a better way to implement this, but I
> >> > think something like it is definitely needed for large numbers of
> >> > revisions on Windows.
> >>
> >> I thought I remembered someone months ago explaining that there is no
> >> performance issue, even on Windows. Something about how the only time
> >> you have this problem is when you need to get a directory listing,
> >> which Subversion does not need to do, because it already knows what
> >> the files are named, along with something about how the file system
> >> hashes based on the first 8 characters of the filename, so there's no
> >> issue until revision 100000000, which could take a while to reach.
> >> Perhaps someone else will remember the discussion and can dig up the
> >> old message. I don't remember the specifics very well; I didn't pay
> >> too much attention because I do not use Windows, and I believe the
> >> issue was said to be nonexistent on other platforms.
> >
> > I disagree, based on personal experience on a NON-Windows platform.
> > All those rev files just seemed to slow things down. Well, I say
> > that, but most of the gain with BDB was that it kept a full copy of
> > the latest revision; FSFS had to run through some number of diffs to
> > construct my checkout, and that was probably what was slowing me
> > down. The difference is certainly real when doing a hotcopy of the
> > repository (FSFS hotcopy: ~33 minutes; BDB hotcopy: ~6 minutes),
> > though of course that is largely an OS file-copy operation.
>
> That may be; I just don't think that dividing the FSFS revisions
> amongst multiple directories as the OP suggests will alleviate any of
> the performance issues you mention. I don't think it's a problem of
> having all the revision files in a single directory.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: users-help@subversion.tigris.org
>
>
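
P.S. For anyone who wants to reproduce Troy's hotcopy comparison above,
the command is (repository paths hypothetical):

    # Copies the repository, including hooks and config, to a new path
    svnadmin hotcopy /path/to/repos /path/to/backup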
Received on Mon Oct 9 22:05:32 2006

This is an archived mail posted to the Subversion Users mailing list.
