
Re: Checkout really slow in Windows with lots of files in one directory

From: Nick <nospam_at_codesniffer.com>
Date: Thu, 03 Feb 2011 23:34:39 -0500

On Wed, 2011-02-02 at 07:52 -0500, Mark Phippard wrote:
> On Wed, Feb 2, 2011 at 7:41 AM, Geoff Rowell <geoff.rowell_at_gmail.com> wrote:
> > On Wed, Feb 2, 2011 at 4:09 AM, Nick <nospam_at_codesniffer.com> wrote:
> >> On Tue, 2011-02-01 at 13:00 -0500, Mark Phippard wrote:
> >>
> >> On Wed, Jan 26, 2011 at 9:28 AM, Neil Bird <neil_at_jibbyjobby.co.uk> wrote:
> >>
> >>> We have a graphics-oriented code-base that's auto-generated and has >5000
> >>> source files in one directory. While I can check this out OK on Linux,
> >>> we're seeing an unusable slow-down on Windows XP (NTFS), both using
> >>> Tortoise directly, and as a test on Linux with the Windows drive mapped
> >>> over CIFS.
> >>
> >> I created a folder with 5001 files in it ... maybe that is not enough?
> >> I just used small simple text files as I was only checking for the
> >> general problem in managing the temp files and the WC metadata.
> >>
> >> Upon checkout (using 1.6.15 command line client) I did not notice any
> >> slowdown. Windows checked out via HTTP across internet in about 49
> >> seconds as opposed to 33 from my Mac (which is a faster system). The
> >> main thing is checkout did not seem to slow down.
> >>
> >> I did a similar test, using 5100 files in a single directory. Each file
> >> contained only the content "file XXXX" where XXXX was the number of the file
> >> (so tiny files). My linux system took 17 seconds, while Windows took a bit
> >> less than 2 min (but Windows is virtualized while linux is on the
> >> hardware). I also did not notice a slow-down as the checkout proceeded.
> >> Both systems used 1.6.15 and accessed the repo via https.
> >>
> >> I did, however, notice that the time to *add* the files (done via svn add
> >> *.txt) seemed to progressively slow down. But this was only observed by
> >> watching the files in the console as they were being added (it was
> >> relatively easy to see the rate because each file name had a linear
> >> number at the end). I don't have any timings to back this up, though I'll
> >> collect some if anyone's interested.
> >>
> > I don't know why, but I believe the key thing here is working with
> > *binary* files.
> >
> > I noticed the same problem with a massive (10K+) amount of audio
> > snippets in a single directory.
>
> I was thinking that this was a case where the reading/parsing/writing
> of our large entries file was causing a slowdown and moving to SQLite
> was going to bring performance gains. Clearly that is not the case as
> trunk is much slower.
>
> If I get another batch of free time I will try it with a lot of small PNG's.

I repeated my test of checking out a repo with 5100 files, but this time
using binary files (192-byte PNGs). It took 1 min 7 sec on Linux and 6 min
on Windows (again, virtualized). On Windows, it went fairly quickly through
all the files, and then sat for several minutes between listing the last
file and completing the command.

Time taken listing all files on Windows during checkout: ~3 min
csrss - started at 70%-80% CPU, declined to < 20% CPU by the end of the
file listing
svn.exe - the inverse of csrss (it took up the remaining CPU, to 100%)

After file listing, and before the command completes: ~3 min.
During this time, Windows (virtualized) took ~10% CPU of the host OS,
and svn.exe was occasionally taking 2%-10%. The guest OS was
predominantly idle.
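
I took the two phase timings above by watching the console. A rough sketch
of how the split could be measured automatically is below. It assumes that
per-file lines in the checkout output start with "A " (as they do in the
command line client's output), and the function name and the returned keys
are made up for illustration. Note that the client may buffer its output
when writing to a pipe, which would skew the split.

```python
import subprocess
import time


def time_phases(cmd, is_file_line=lambda line: line.startswith("A ")):
    """Run `cmd` and split its wall time at the last per-file output line."""
    start = time.monotonic()
    last_file_line = start
    file_lines = 0
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    for line in proc.stdout:
        if is_file_line(line):
            # Remember when the most recent file was listed.
            last_file_line = time.monotonic()
            file_lines += 1
    returncode = proc.wait()
    end = time.monotonic()
    return {
        "file_lines": file_lines,
        "listing_s": last_file_line - start,  # time spent listing files
        "tail_s": end - last_file_line,       # time after the last file
        "returncode": returncode,
    }

# Hypothetical usage (REPO_URL is a placeholder, not a real repository):
#   print(time_phases(["svn", "checkout", REPO_URL, "wc"]))
```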

Nick
Received on 2011-02-04 05:35:19 CET

This is an archived mail posted to the Subversion Users mailing list.