Re: svn up known issue?

From: jason marshall <jdmarshall_at_gmail.com>
Date: 2005-07-26 23:18:08 CEST

Ben,

  Thanks for the reply. Before I respond to your comments, I wanted
to reiterate that my primary concern is that svn up clean up after
itself if a network error happens at certain 'clean' stages of the
update process, such as before the server is first contacted. The
rest is largely incidental.

On 7/26/05, Ben Collins-Sussman <sussman@collab.net> wrote:
>
> On Jul 26, 2005, at 1:06 PM, jason marshall wrote:
> >
> > We are retrieving files via svn+ssh, from an FSFS repository on our
> > LAN. 'svn up' takes a good ten minutes (and 60M of memory) to run
> > before it finally asks for my password. What it's doing during this
> > time period, I cannot say,
>
> It's crawling over every directory in your working-copy tree, loading
> the .svn/entries file into memory, then describing the revision of
> each file to the server. It's also dropping a lockfile into
> each .svn/ area, to prevent other svn clients from changing the
> working copy.
>
> Once that's all done, the server has a 'snapshot' of your working-
> copy which it then compares to the latest revision. The server then
> sends back a tree-delta which the client applies.

Okay, and this is the point at which it figures out that it can't
contact the server, yes?

> > If I should unsurprisingly get bored or step away
> > from my computer, I may miss this prompt. At this point the clock is
> > ticking before the connection is lost, and by the time I type in my
> > password, it may be too late.
>
> That's ssh prompting you, not svn. There must be a parameter
> somewhere to tell sshd not to time out. (Subversion certainly has
> that for its own prompting.)

Right, I appreciate that. This was by way of introduction to the
steps leading up to the actual problem...

> > In this case, I get a "Connection
> > closed unexpectedly" error, and running 'svn up' again tells me that:
> >
>
> The ssh daemon cut you off.
>

The actual problem is that SVN leaves its toys on the proverbial
stairs. I suspect that the same would happen if the server were
unreachable.

> > svn: Working copy '.' locked
> > svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for
> > details)

> Hm, if the connection is forcibly closed in the middle of an update
> (or any network error happens), I think the subversion client could
> do better. It should be able to notice and do 'cleanup' itself.

I would think that the reason that teardown takes twice as long as
setup in this case is that 'cleanup' does substantially more than
simply removing locks from the directory tree, right? If so, it would
certainly be helpful if svn could take some shortcuts at this point to
tidy up efficiently. If not, then I would much appreciate it if the
cleanup code could be made a bit more efficient.
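
To make the request concrete, here is a minimal Python sketch of the
"clean up after yourself" behaviour, not Subversion's actual code. It
assumes each .svn/ area holds a lock file named "lock", and that the
failure happens before anything in the working copy has been modified
(for example, while the server is first being contacted), so removing
only the locks this run created is safe. The names admin_dirs and
locked_working_copy are hypothetical.

```python
import os
from contextlib import contextmanager

LOCK_NAME = "lock"  # assumption: name of the lock file inside each .svn/ area


def admin_dirs(root):
    """Yield every .svn/ administrative directory under root."""
    for dirpath, dirnames, _ in os.walk(root):
        if ".svn" in dirnames:
            yield os.path.join(dirpath, ".svn")
            dirnames.remove(".svn")  # do not descend into the admin area


@contextmanager
def locked_working_copy(root):
    """Lock every directory, and always release the locks this run took."""
    taken = []
    try:
        for admin in admin_dirs(root):
            path = os.path.join(admin, LOCK_NAME)
            # O_EXCL makes creation fail if another client holds the lock.
            os.close(os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
            taken.append(path)
        yield taken
    finally:
        # Remove only the locks recorded above; no second tree walk needed.
        for path in taken:
            try:
                os.remove(path)
            except FileNotFoundError:
                pass
```

Because the list of locks is remembered while they are created, a
failed run can undo them directly instead of crawling the whole tree a
second time. A failure in the middle of applying changes would still
need the fuller 'svn cleanup' treatment.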

> > In which case svn cleanup can take another 20 minutes to run
>
> And that's because it's going through each .svn/ area again, removing
> all the lockfiles it left behind.
>
> All that said, you've got many cards stacked against you here.
> You're in the worst of all possible worlds:

Indeed.

> * the svn client does a *lot* of processing of small files in
> the .svn/ area. NTFS is particularly slow at this sort of thing.
> Disk-intensive svn client operations typically take twice as long
> on Windows as on Unix.

> * you have a ridiculously large working copy. Nearly every common
> svn client command -- 'update', 'commit', 'status', etc. -- involves
> recursively walking the working copy to discover things. (Update
> needs to discover the version of each file; commit and status need
> to discover which files have changed.)

We do have a ridiculously large working copy, but that's by pure
coincidence. It so happens that we hope to use Subversion to more
simply extract a good number of dead subtrees. I'm confident that, as
you say, our performance will be greatly improved by doing so.

However, as code trees go, this is not by any means a ridiculously
large one. What is in our tree at this moment may be ridiculous, but
I would consider our code tree to be only slightly larger and more
complex than reasonable for a serious, long-term software project.
There will be many companies out there with code trees within a factor
of two of this one. Older companies are likely to have even bigger
trees.

Even Batik, which we used at one point in our testing, is about a
fifth of the size of our tree. How big is the Linux Kernel? JBoss?
Gnome? The Enterprise management system at any multinational company?

> The best answer here is: "Don't Do That."

"Doctor, it hurts when I bend over!"

I'm sure you're trying to be helpful, and I'm sure these suggestions
will make me more productive, but I hope you realise that this is, at
best, a band-aid. I shouldn't have to bend over backward to keep my
tools from exploding in my face. If the manual labor of using SVN is
truly so high, then my recommendation will be to pay the licensing fee
for Perforce, as the total cost relative to benefit is lower, and the
in-house expertise is slightly better (a couple of developers, versus
none).

> Work with smaller pieces of the working copy. Either check out
> smaller subsets of the project, or be *selective* about how you run
> these commands. Don't just run 'svn up' from the top of Mount
> Everest; cd down into some sub-area that you're working on, and run
> update from there. (Or run 'svn up some\sub\area\'.) This will also
> stop the client from using so much memory when it caches 5.89824e37
> .svn/entries files.

I have always considered one of the biggest benefits of SCM software
to be as an aid to repeatability. That two developers, across space
and time, can be more or less assured that they should be seeing the
same bug- and feature-set in their code is what allows us to keep
progressing despite increasing team sizes. Cherry-picking subtrees
from a large body of code does little to nothing for repeatability.

I encourage the SVN team to reassess their definition of 'reasonable
working set', and consider if the tool as currently presented can meet
reasonable performance metrics given that definition.

Thank you,
-Jason

Received on Wed Jul 27 00:54:06 2005

This is an archived mail posted to the Subversion Users mailing list.