Re: faster client pre-1.0: decrease number of files&folders in .svn

From: solo turn <soloturn99_at_yahoo.com>
Date: 2003-02-05 19:58:32 CET

--- Philip Martin <philip@codematters.co.uk> wrote:
> > from a typical use case, properties are not big, they are empty.
> > therefore your concerns are theoretically ok, but in practice not
> > (yet?) relevant.
>
> Your argument appears to be "I don't do it, so nobody else should
> either". There only needs to be one large property value on one file
> in a directory for there to be a real problem.
do you have a real-world example? btw, i did not exclude the
possibility of applying the same approach databases use: store a blob
separately and hold just a reference, and maybe a checksum, in the
main table (the entries file). or use bdb instead of .svn text files.
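
a minimal sketch of that blob-reference idea, not of anything svn actually does: store_prop, the index file layout, the blob directory and the 64-character limit are all hypothetical, and property values are assumed to be single-line.

```sh
# store_prop INDEX BLOBDIR NAME VALUE   (hypothetical helper, not an svn API)
# Small values are written inline into INDEX. Large values go to their own
# file in BLOBDIR, and INDEX keeps only a reference (the checksum).
store_prop() {
    index=$1
    blobdir=$2
    name=$3
    value=$4
    limit=${PROP_INLINE_LIMIT:-64}   # assumed threshold, in characters
    if [ "${#value}" -le "$limit" ]; then
        printf '%s inline %s\n' "$name" "$value" >> "$index"
    else
        # cksum prints "<checksum> <size>"; keep only the checksum
        sum=$(printf '%s' "$value" | cksum | cut -d' ' -f1)
        mkdir -p "$blobdir"
        printf '%s' "$value" > "$blobdir/$sum"
        printf '%s ref %s\n' "$name" "$sum" >> "$index"
    fi
}
```

reading the index then costs the same no matter how large a single property value is; only a lookup of that one property has to open the blob file.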

> > and we noticed slow down just by the number of files. we noticed slow
> > down in disk usage, not memory usage, not processor usage.
>
> "Slow down"? To what are you comparing it?
to how it scales with the number of files/folders managed. this is not
linear: it is at least 2 times slower than linear scaling would predict
(i.e. 10 times more files managed, 20 times more time for "svn st").
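
to put a number on that, assuming (my assumption, not something measured) that the time grows like n^k: 10 times the files for 20 times the time gives k = log(20)/log(10). scaling_exponent is a hypothetical helper name for this sketch.

```sh
# scaling_exponent FILE_RATIO TIME_RATIO
# Prints k such that TIME_RATIO = FILE_RATIO^k; k = 1.00 would be linear.
scaling_exponent() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", log(t) / log(n) }'
}

scaling_exponent 10 20   # the figures quoted above; prints 1.30
```

anything above 1.00 means the cost per file goes up as the working copy grows.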

> > doing an svn up on 20.000 files/folders just blocks the disk for 2
> > minutes.
>
> "svn up" has to read every entries file, and you want to make the
> files larger. How does that help?

> > just deleting a svn working copy blocks the disk cause of the huge
> > number of files.
> >
> > and the number is increasing by factor 10 (--> 20.000 files/folders,
> > 200.000 files/folders with the .svn folders).
>
> Do you have any evidence that it will be faster to use fewer, larger
> files? I can see that deleting a working copy might be faster, but is
> that really a more important use case than other svn operations?

it's just a guess that comes out of the slow-down /
non-linearity we experienced in 3 cases:
1. scalability of svn up and svn st:
     in our experience the time grows by a factor of more
     than 2 beyond linear, while we believe it should be
     linear (a factor of 1). you may be partially right,
     because the factor is not 10, as the number of files
     would suggest. but i don't know how many .svn entries
     are really touched for one file in the repository.
     --> shows it is non-linear
2. working copy on a network drive:
     unusable. same behaviour as copying a folder
     with many small files to a network drive, compared
     to copying one of the same size with fewer files.
     it seems that opening/closing/creating/removing
     files is the determining factor.
     --> shows that it is a local thing, not the
         server or the network connection between
         svn server and svn client
3. same scalability of ls including the .svn and svn up:
     date && ls -1Ra | wc && date
     date && svn st && date
     --> shows similar graph just by listing the contents
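
the two commands in item 3 can be wrapped so they print elapsed seconds directly. this is only a sketch: elapsed_seconds and wc_dir are hypothetical names, and it assumes a `date` that supports `+%s` (GNU and BSD do), so the resolution is whole seconds.

```sh
# elapsed_seconds CMD [ARGS...]
# Runs CMD with its output discarded and prints the wall-clock seconds it took.
elapsed_seconds() {
    start=$(date +%s)
    "$@" > /dev/null 2>&1 || true   # exit status is ignored; only timing matters
    end=$(date +%s)
    echo $((end - start))
}

# wc_dir: path of the working copy to measure (defaults to the current directory)
wc_dir=${wc_dir:-.}
echo "ls:     $(elapsed_seconds ls -1Ra "$wc_dir") s"
echo "svn st: $(elapsed_seconds svn st "$wc_dir") s"
```

running it on working copies of different sizes gives the points for both curves; repeat each run, since the first one also pays for a cold disk cache.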

the only non-linear thing in svn i know of is the number of
administrative entries in the .svn area of the working copy. this is
known for huge directories (where you also know the problem), but our
experience shows it is general (we don't have huge directories, we
just have many directories with a few files in each).

so chances are high that this is the root cause of the
scalability problem.

Received on Wed Feb 5 19:59:18 2003

This is an archived mail posted to the Subversion Dev mailing list.